Pre.cluster set diff for 95% nucleotide identity

Downunder · December 12, 2020, 5:22am

Hi there,

I am new to microbiome analysis and mothur, and I am trying to analyse sequencing data of dnaK, a conserved core gene, rather than 16S.
We also sequenced 16S, and I followed the MiSeq SOP tutorial for it. I then tried to apply the same commands to dnaK, but with altered parameters because dnaK is less conserved than 16S.

In theory, dnaK should assign species more accurately than 16S at a threshold of > 95% nucleotide identity, given that bacterial strains are usually considered the same species if they share an average nucleotide identity > 95%.

So, to my question: I would like to pre.cluster my reads based on this 95% identity and then run classify.seqs on them (I have built my own database of dnaK sequences). This way I can get an idea of how many reads belong to my genus of interest, how many belong to known and unknown species, and maybe how many belong to undescribed species.

After some reading in the forum, I came up with diffs=7, because my amplicon is 295 bp and 5% of 295 is about 14.

Is that correct?
How should I set diffs= so that sequences with 95% identity are clustered as the same species?
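
For reference, the arithmetic behind the 95% figure can be sketched as follows. This is only a rough illustration: it assumes identity is counted as matching positions over the full 295 bp amplicon with substitutions only, and the function names are hypothetical helpers, not part of mothur.

```python
def max_mismatches(amplicon_length, identity_percent):
    """Largest whole number of mismatches that keeps identity >= identity_percent.

    Assumption: identity = (length - mismatches) / length, gaps ignored.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    return amplicon_length * (100 - identity_percent) // 100


def identity_after(amplicon_length, mismatches):
    """Fractional identity left after a given number of mismatches."""
    return (amplicon_length - mismatches) / amplicon_length


if __name__ == "__main__":
    length = 295  # amplicon length from the post
    print(max_mismatches(length, 95))           # mismatches allowed at 95% identity
    print(round(identity_after(length, 7), 4))  # identity implied by 7 differences
```

Under these assumptions, 95% identity over 295 bp allows up to 14 mismatches, while 7 differences corresponds to roughly 97.6% identity.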

Or should I do this later, following the tutorial, with classify.otu and label=0.05?

Thank you for your help!

pschloss · December 14, 2020, 6:12pm

Hi there,

I would suggest using diffs=3 with pre.cluster and then using cluster with a cutoff of 0.05.
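
The reply does not spell out where these two numbers come from. One possible reading, offered as an assumption rather than the stated rationale, is that diffs follows the MiSeq SOP rule of thumb of about one difference per 100 bp for pre.cluster, while the 95% species threshold is applied later as the distance cutoff for cluster. The helper names below are hypothetical and the sketch only reproduces the arithmetic.

```python
def precluster_diffs(amplicon_length, bp_per_diff=100):
    """Rule-of-thumb diffs value: about one allowed difference per 100 bp.

    Assumption: this mirrors the guidance in the MiSeq SOP; it is not
    something mothur computes for you.
    """
    return round(amplicon_length / bp_per_diff)


def distance_cutoff(identity_percent):
    """Distance cutoff matching an identity threshold (distance = 1 - identity)."""
    return (100 - identity_percent) / 100


if __name__ == "__main__":
    print(precluster_diffs(295))  # diffs value for a 295 bp amplicon
    print(distance_cutoff(95))    # cutoff for 95% identity
```

For a 295 bp amplicon and a 95% identity threshold, this gives diffs=3 and a cutoff of 0.05, matching the values suggested above. In this reading, pre.cluster is used only to absorb likely sequencing errors, and the species-level grouping happens at the clustering step.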

Pat

