Lots of unique seqs and too few chimeras?

Lethe · March 2, 2018, 9:43am

Dear mothur creators,

I am not sure where to post this, so I will try here.
I would like your opinion on a problem (if it can be called that) that I’ve encountered.

I have ~200 samples of the V4 region, unfortunately sequenced with 150 bp paired-end reads. I have always worked with 250 bp paired-end reads. Although make.contigs with complete overlap and subsequent QC still left lots of unique sequences for environmental and marine invertebrate microbiota, I usually ended up with reasonably few unique sequences for human gut microbiota.

This time, even after pre.cluster, I ended up with about 400,000 unique sequences. But what puzzles me more is that very few of them (2-3% at most, even after fiddling with parameters to increase sensitivity) turn out to be chimeras, regardless of the method used. I’m not saying it’s wrong, but I do find it strange, as I’ve always had a considerably higher proportion of unique sequences flagged as potentially chimeric.

What do you think about this? Is it normal to have so few chimeras?
Thanks in advance for your opinion,
kind regards,
L

EDIT: So I suppose the problem is simply that, with the short overlap, I am left with too many errors. I never remove singletons from my analysis, but do you think it is reasonable to do so this time (before chimera removal)?
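
To put a rough number on the short overlap, here is a minimal Python sketch. It assumes a V4 amplicon of about 253 bp after primer removal (a typical figure, not something stated in this thread), and it assumes each read starts at one end of the amplicon with no trimming. The function name `overlap_length` is made up for illustration.

```python
def overlap_length(read_len: int, amplicon_len: int) -> int:
    """Bases covered by both reads of a pair, assuming each read starts at
    one end of the amplicon and no bases are trimmed."""
    overlap = 2 * read_len - amplicon_len
    # Clamp: no overlap if the reads do not meet, and never more than the amplicon.
    return max(0, min(overlap, amplicon_len))


V4_LEN = 253  # assumed typical V4 amplicon length, not taken from the thread

for read_len in (150, 250):
    ov = overlap_length(read_len, V4_LEN)
    print(f"{read_len} bp reads: {ov} bp overlap ({ov / V4_LEN:.0%} of the amplicon)")
```

Under these assumptions, 150 bp reads leave only about 47 bp covered by both reads, so most positions rest on a single read and cannot be corrected by the mate, while 250 bp reads cover nearly the whole amplicon twice.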

pschloss · March 5, 2018, 9:06pm

Chimera detection can be impacted by sequencing errors, so that is a possibility. Other than that, I’m not really sure what to suggest. What percentage of all of your sequences (counting duplicates) are chimeras?

Pat
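
For anyone wanting to compute the figure asked about here, the following is a rough Python sketch rather than a mothur feature. It assumes a tab-separated count table whose first two columns are the sequence name and its total count, plus an accnos-style list of flagged names, one per line. The function name `chimera_fractions` and the toy data are hypothetical, and compressed count table formats are not handled.

```python
import csv
import io


def chimera_fractions(count_table_text: str, accnos_text: str) -> tuple[float, float]:
    """Return (fraction of unique seqs flagged, fraction of all seqs flagged)."""
    chimeric = {line.strip() for line in accnos_text.splitlines() if line.strip()}
    n_unique = n_total = chim_unique = chim_total = 0
    for row in csv.reader(io.StringIO(count_table_text), delimiter="\t"):
        # Skip blank lines, comment lines, and the header row.
        if not row or row[0].startswith("#") or row[0] == "Representative_Sequence":
            continue
        name, total = row[0], int(row[1])
        n_unique += 1
        n_total += total
        if name in chimeric:
            chim_unique += 1
            chim_total += total
    if n_unique == 0 or n_total == 0:
        raise ValueError("count table contains no sequences")
    return chim_unique / n_unique, chim_total / n_total


# Toy data, invented for illustration only.
counts = (
    "Representative_Sequence\ttotal\n"
    "seqA\t900\nseqB\t80\nseqC\t15\nseqD\t5\n"
)
flagged = "seqC\nseqD\n"

by_unique, by_total = chimera_fractions(counts, flagged)
print(f"unique seqs flagged: {by_unique:.1%}, all seqs flagged: {by_total:.1%}")
```

The two fractions can differ widely, which is why it matters whether a chimera rate is reported per unique sequence or per read.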

Lethe · April 9, 2018, 1:23pm

I am really sorry for the very late reply; I’ve been working on something completely different during the last month.
For vsearch, if I run it without groups, it is about 10%, and that is with cutoff=0.1.
I know there is no really good solution to this. It is also not that I want my sequences to be chimeras if they are not :D, but based on my previous experience, the proportion seems low.
Do you think that being strict in this case (I mean with the cutoff) can alleviate the problem at least a bit? At least in the sense that it will keep the false negative rate as low as possible and reduce the dataset a bit? Or won’t it matter anyway, because I don’t have a good overlap and my seqs are simply erroneous?
Thanks for your time, and also for the new mothur version.
