OTUs and read length

ajone · April 19, 2010, 7:53pm

In several studies using 454 pyrosequencing, one can read that reads below e.g. 200 bp were filtered out, leaving only reads above 200 bp for further processing. This means that one has reads of different lengths. How can one assign OTUs to sequences that differ in length? E.g. the only difference between two reads may be their length, not a real difference in sequence. Is there a way to overcome this problem without reducing the length of all sequences to 200 bp?
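To make the question concrete, here is a minimal Python sketch of how a length difference alone can look like a sequence difference. It is only an illustration, not how mothur computes distances: the function `mismatch_fraction` and its `count_end_gaps` flag are hypothetical names, and the sketch assumes both reads start at the same position (e.g. the same primer) and need no internal gaps.

```python
def mismatch_fraction(a, b, count_end_gaps):
    """Fraction of differing positions between two reads.

    Assumption: both reads start at the same position, so position i
    in one read corresponds to position i in the other.
    """
    # Compare only the positions that both reads cover.
    mismatches = sum(x != y for x, y in zip(a, b))
    if count_end_gaps:
        # Treat the overhang of the longer read as differences.
        overhang = abs(len(a) - len(b))
        return (mismatches + overhang) / max(len(a), len(b))
    # Ignore the overhang and score the shared region only.
    return mismatches / min(len(a), len(b))


short_read = "ACGTACGTAC"
long_read = short_read + "GGTTA"  # identical over the first 10 bases

print(mismatch_fraction(short_read, long_read, count_end_gaps=True))
print(mismatch_fraction(short_read, long_read, count_end_gaps=False))
```

With the overhang counted, the two reads appear to differ even though they are identical wherever they overlap.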

pschloss · April 19, 2010, 9:05pm

Yup. All of the sequences need to be trimmed so they overlap over the same region. This is because the 16S gene does not evolve evenly over its length. So having some sequences be longer than others could involve adding more or less variable sites and skewing the output. I generally try to go for a length where I am able to keep 95% of the sequences. By the way, this advice holds for phylotyping and OTUing.
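As a rough illustration of picking a length that keeps about 95% of the sequences, here is a small Python sketch. It is not mothur code: the helper names `length_keeping_fraction` and `trim_reads` are made up for this example, and it assumes unaligned reads that all start at the same position. With aligned sequences the trimming would be done on alignment coordinates instead.

```python
import math


def length_keeping_fraction(lengths, keep=0.95):
    """Largest length L such that at least `keep` of the reads are >= L."""
    if not lengths:
        raise ValueError("no read lengths given")
    ordered = sorted(lengths, reverse=True)
    # Number of reads that must survive the cutoff.
    n_keep = math.ceil(keep * len(ordered))
    return ordered[n_keep - 1]


def trim_reads(reads, keep=0.95):
    """Drop reads shorter than the cutoff and trim the rest to it.

    Assumption: all reads start at the same position, so cutting every
    read to the same length leaves them covering the same region.
    """
    cutoff = length_keeping_fraction([len(r) for r in reads], keep)
    trimmed = [r[:cutoff] for r in reads if len(r) >= cutoff]
    return cutoff, trimmed


# Toy data: four reads of different lengths, keeping at least half of them.
reads = ["A" * 100, "A" * 200, "A" * 300, "A" * 400]
cutoff, trimmed = trim_reads(reads, keep=0.5)
print(cutoff, len(trimmed))
```

After this step every remaining read has the same length, so any difference between two reads reflects sequence rather than length.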

ajone · April 20, 2010, 6:58am

Thanks for the response. So as I see it, many studies overestimate the number of OTUs because they compare reads of different lengths, right? Regarding your analysis of the Sogin and Costello data, you do not take this problem into account for the Sogin data, while you have dealt with the problem in the Costello analysis using the screen.seqs command, correct?

pschloss · April 20, 2010, 11:37am

I suspect that at the 5’ end and in the V6 region you could inflate the number of OTUs, and at the 3’ end you could deflate them - but I haven’t done the experiment yet. You are right about how the example pages were done; I/someone should probably go back, trim the Sogin dataset, and redo the analysis.

Thanks for your questions!
Pat
