OTUs & unique.seqs

joomanji · September 20, 2013, 6:47am

Hi all,

I am following Schloss’s MiSeq SOP to process my 16S V6 region Illumina data, with some modifications. However, I would like to understand how unique.seqs works. Since the V6 region generated by Illumina HiSeq is pretty short (90 bp), the chance of getting identical sequences is very high. If I apply unique.seqs to remove all the “replicate” reads, will this affect the OTU calculation and the generation of rarefaction curve plots?

Thanks!

westcott · September 20, 2013, 11:41am

The unique.seqs command creates a names file that records the duplicates. When you run the cluster command, be sure to include the names file: cluster(column=yourDistanceMatrix, name=yourNameFile). Including the names file ensures that the duplicates are included in the OTUs and in the rest of the downstream analysis.
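
To illustrate why the duplicates are not lost, here is a minimal Python sketch of the bookkeeping. It is not mothur's actual implementation; the function names `dereplicate` and `format_names` and the toy reads are made up for illustration. It only assumes the two-column names file layout: a representative read name, a tab, then a comma-separated list of every read sharing that sequence.

```python
def dereplicate(reads):
    """Group identical sequences (hypothetical helper, not a mothur API).

    reads: list of (read_name, sequence) tuples.
    Returns (unique, names_table): one (representative, sequence) pair per
    distinct sequence, and a dict mapping each representative to all read
    names that share its sequence.
    """
    groups = {}
    for name, seq in reads:
        groups.setdefault(seq, []).append(name)
    # The first read seen for each sequence acts as the representative.
    unique = [(names[0], seq) for seq, names in groups.items()]
    names_table = {names[0]: names for names in groups.values()}
    return unique, names_table


def format_names(names_table):
    """Render the table as representative<TAB>comma-separated members."""
    return "\n".join(
        f"{rep}\t{','.join(members)}" for rep, members in names_table.items()
    )


# Toy data: four reads, two distinct sequences.
reads = [
    ("read1", "ACGT"),
    ("read2", "ACGT"),
    ("read3", "TTGA"),
    ("read4", "ACGT"),
]
unique, names_table = dereplicate(reads)

# Only the unique sequences need aligning and distance calculation...
n_unique = len(unique)
# ...but the names table still accounts for every original read.
n_total = sum(len(members) for members in names_table.values())
print(format_names(names_table))
```

In this toy case only two sequences go forward for distance calculation, while the names table still accounts for all four reads, so abundance-based steps such as rarefaction see the full read counts when the names file is supplied.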

joomanji · September 25, 2013, 10:38am

Thanks, westcott, for your input. Now I have a better understanding of how the names file works. However, I do have another question regarding the OTU calculation. I subjected the unique sequences to clustering using the pre.cluster and cluster commands.
In my dataset, I am getting around 1,600+ unique sequences out of the original 25,000 sequences after the clustering process, but when I go on to generate the rarefaction curve, it is plotted with 9,000+ unique sequences/OTUs based on the unique cutoff. To my understanding, sequences that can be clustered should be considered one OTU, so I am wondering why I am still getting a very steep rarefaction curve?
