Too many OTUs

sebasdiazz · October 7, 2015, 8:51pm

Hi, Mothur community!

I am working with MiSeq data from the gut communities of 6 insect species. My original dataset had 4,105,742 sequences. After removing short sequences, sequences with ambiguous nucleotides, and unaligned, non-bacterial and chimeric sequences, I have 2,062,377 sequences in total, and after unique.seqs and pre.cluster with a 2 bp threshold, 640,944 unique sequences. For every species I had something like 300,000 sequences and 90,000 unique sequences. After clustering at 97%, I found more than 200 OTUs per sample, some even with 1,000!!! (For insect guts I would expect something like 20-30 OTUs.) Most of the OTUs are singletons or underrepresented OTUs (fewer than 10 seqs), so I guess I have a lot of spurious reads.

I want to go back to my pre-processing steps to try to identify the reads with sequencing errors. I changed the pre.cluster threshold from 2 to 4 (which corresponds to an error of about 1% over the sequence length of 427 bp), and now I have 332,577 unique sequences. Also, using split.abund with cutoff=1, I found 314,161 singletons (7.6% of my original dataset), which I still think is a high number.
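For readers following the numbers, here is a minimal Python sketch of the arithmetic in the paragraph above. It only reuses the figures quoted in the post; the function and variable names are made up for illustration and are not part of mothur.

```python
# Figures quoted in the post; nothing here is measured independently.
raw_reads = 4_105_742     # sequences in the original dataset
seq_length_bp = 427       # sequence length mentioned in the post
singletons = 314_161      # singletons reported by split.abund with cutoff=1


def diffs_as_percent(diffs, length_bp):
    """Express a pre.cluster mismatch threshold as a percentage of the sequence length."""
    return 100.0 * diffs / length_bp


def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    for diffs in (2, 4):
        pct = diffs_as_percent(diffs, seq_length_bp)
        print(f"{diffs} differences over {seq_length_bp} bp = {pct:.2f}%")
    share = percent_of(singletons, raw_reads)
    print(f"singletons = {share:.2f}% of the original reads")
```

With these inputs, a threshold of 4 differences works out to a little under 1% of 427 bp, and the singletons to between 7.6% and 7.7% of the original reads.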

What do you recommend? Should I try a higher pre.cluster threshold, eliminate the singletons, or is there another alternative?

Thank you,
Sebastián

pschloss · October 12, 2015, 3:09pm

I suspect you aren’t sequencing the V4 region with paired 250 nt reads, right? See this…

http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/
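As a rough illustration of why the region and read length matter, the sketch below estimates how many bases of a fragment are covered by both reads of a pair. The inputs are assumptions, not facts from this thread: 250 nt reads, a V4 fragment of roughly 253 bp, and the 427 bp length from the question treated as the fragment length. The helper name `overlap_bp` is hypothetical.

```python
def overlap_bp(read_len, fragment_len):
    """Bases covered by both reads of a pair; 0 if the reads do not meet.

    Assumes both reads have the same length and start at opposite ends
    of the fragment.
    """
    return max(0, min(2 * read_len - fragment_len, read_len, fragment_len))


if __name__ == "__main__":
    read_len = 250  # assumed read length (paired 250 nt reads)
    # Fragment lengths: ~253 bp for V4 is an approximation; 427 bp is from the question.
    for label, fragment_len in (("V4 (approx.)", 253), ("427 bp fragment", 427)):
        shared = overlap_bp(read_len, fragment_len)
        share = 100.0 * shared / fragment_len
        print(f"{label}: {shared} bp covered by both reads ({share:.0f}% of the fragment)")
```

Under these assumptions, almost all of a V4-sized fragment is read twice, while only a small part of a 427 bp fragment is. Bases read only once cannot be cross-checked against the other read.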
