How to do OTU analysis for just SOME groups?

I have sequences from 17 different treatments, and I’m only interested in doing OTU-based analysis on 16 of them (the 17th is “this treatment is screwed up, ignore it”).

I have a groups file to define which sequences go in which groups.

How do I:

generate a fasta file with just the sequences in the groups I want?
get a distance matrix or alignment with just the groups I want?
get a .list file with just the groups I want?

Sorry if this is obvious.

I’m not sure this question was clear, let me try again.

I have 16 groups that matter (W1 … W16) and one that does not (IGNORE). I want to find sequences from OTUs that all 16 good groups share, ignoring sequences from the “bad” group.

OTUs are represented in mothur files by their “OTU number” (is this an index into the .list file?). There are 11410 OTUs (and hence OTU numbers) using all groups in my dataset (good AND bad).

Without the “IGNORE” group, there are 7990 OTUs (I see this in the .shared file). The “.shared” file was generated with a groups= parameter, so it only has the 7990 OTUs from the non-IGNORE groups.

How do I determine which entries in the “.otu” file these 7990 correspond to, and how do I get OTU representatives for them?

Have you tried using get.oturep to get a representative sequence for the 7990 OTUs?

If you can generate a fasta file with just the sequences belonging to the groups you want, your other questions (in your first post) should subsequently be solved.

Are you able to easily get the names of the sequences you don’t want? If so, use the remove.seqs function, like this: remove.seqs(accnos=your.sequence.names.you.don’t.want.accnos, fasta=file.containing.all.sequences.fasta)

This will produce a .pick file containing the remaining sequences.
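If getting the unwanted names by hand is a pain, here is a rough Python sketch for building the accnos file from your groups file. It assumes the groups file has two whitespace-separated columns (sequence name, then group name), that an accnos file is just one sequence name per line, and that the bad group is labelled IGNORE. The function names and the file names in the usage note are made up for illustration, not anything mothur provides.

```python
def names_in_group(group_lines, unwanted):
    """Return the sequence names whose group label equals `unwanted`.

    `group_lines` is any iterable of lines in the assumed two-column
    layout: sequence name, whitespace, group name.
    """
    names = []
    for line in group_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, group = fields[0], fields[1]
        if group == unwanted:
            names.append(name)
    return names


def write_accnos(group_path, accnos_path, unwanted="IGNORE"):
    """Write one unwanted sequence name per line; return how many were written."""
    with open(group_path) as src:
        names = names_in_group(src, unwanted)
    with open(accnos_path, "w") as out:
        for name in names:
            out.write(name + "\n")
    return len(names)
```

Something like write_accnos("my.groups", "ignore.accnos") (hypothetical file names) would then give you a file to pass as accnos= in the remove.seqs call above.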

nice idea. Thanks!