Is there a way to get the original names of all the reads for a given OTU from a given sample?
I’ve followed the Schloss SOP to identify OTUs in a 454 data set on 16S bacterial diversity.
I’ve used the final.an.SAMPLENAME.list file to get the names of all the unique reads for a given OTU from this sample, but these are only a fraction of the total number of reads, because reads were collapsed into uniques at several steps in the analysis. What I’m trying to look at, however, is how all the reads are distributed among these unique reads.
Can someone please help me with a way to do this?
Thanks in advance,
The list file should contain the names of all the seqs, not just the uniques, unless you clustered using a count table file. When you run the cluster command, be sure to include the names file, because the average neighbor clustering method takes into account the number of sequences in each OTU. This forum post contains a simple example: http://www.wiki.mothur.org/forum/viewtopic.php?f=3&t=1984&p=5571. Including the names file also ensures the redundant names get added back into your list file.
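If your list file already holds only the uniques, you could also map them back to the original reads yourself using the names file. Below is a rough Python sketch of that idea, not anything built into mothur. It assumes the usual layouts: each names file row is a representative read, a tab, then a comma-separated list of every read it stands for; each list file row is a label, the number of OTUs, then one tab-separated column per OTU with comma-separated read names. The function names and the read names in the demo are made up for illustration. Some mothur versions write a header row starting with "label" in the list file, which you would need to skip before parsing.

```python
def parse_names(lines):
    """Map each representative (unique) read to all the reads it stands for."""
    names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Column 1: representative read; column 2: comma-separated members.
        rep, members = line.split(None, 1)
        names[rep] = members.strip().split(",")
    return names


def parse_list_line(line):
    """Split one list file row into (label, [OTU, ...]); each OTU is a list of read names."""
    fields = line.rstrip("\n").split("\t")
    label, num_otus = fields[0], int(fields[1])
    otus = [field.split(",") for field in fields[2:2 + num_otus]]
    return label, otus


def expand_otu(otu_reads, names):
    """Return {unique read: [all original reads]} for a single OTU.

    Reads missing from the names file are kept as singletons.
    """
    return {read: names.get(read, [read]) for read in otu_reads}


# Tiny in-memory demo with invented read names (assumption: real input
# would be read from your own .names and .list files).
names = parse_names(["readA\treadA,readB,readC", "readD\treadD"])
label, otus = parse_list_line("0.03\t2\treadA,readD\treadE")
expanded = expand_otu(otus[0], names)

# Number of original reads behind each unique read in the first OTU.
counts = {rep: len(members) for rep, members in expanded.items()}
```

The counts dictionary gives the distribution of all reads among the unique reads of one OTU. Keep in mind that the names file has to be the one that matches the fasta file you clustered; if reads were made unique again at a later step, an earlier names file will not account for everything.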
Ah, it was because I used a downstream fasta file that did not have the missing sequence names…