Understanding the rep.fasta file generated through the get.oturep command

olga · May 27, 2021, 9:34am

Hello,

I am trying to get a representative sequence for each OTU in my dataset by running the get.oturep command. The distance file I am using is the one generated by the dist.shared command (thetayc method).
The rep.fasta output confuses me. The first representative sequence is from the most abundant OTU, Otu001:

MISEQ_60_000000000-CVFVM_1_1101_10014_17238 Otu001|24534
Am I right in understanding that 24534 is the number of sequences belonging to that OTU? This is where I have a problem: the total number of sequences assigned to that OTU in the cons.taxonomy file is 162625 (much larger). Is there something I am doing wrong? Or does the number 162625 include non-unique sequences and the 24534 number corresponds to unique sequences?

Thanks very much,
-Olga-

sje062 · May 28, 2021, 5:37am

Hi Olga,

Yes, your interpretation may be right. It depends on whether all sequences or only the unique sequences were used to pick the OTU representatives. See:

With the name file included, the representative sequences should be picked from all sequences (as far as I remember). Run summary.seqs to count the sequences. This is just a quick reply; feel free to ask again if it does not help.
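
To put the two numbers side by side for every OTU, something like the following Python sketch might help. It is not part of mothur. It assumes the rep.fasta header looks like the line quoted above (sequence name, whitespace, then OTU label and count separated by "|") and that cons.taxonomy is tab-separated with OTU, Size and Taxonomy columns. The function names and the taxonomy string are made up for illustration.

```python
# Sketch: compare the count in a rep.fasta header with the Size column of
# cons.taxonomy. Both file layouts are assumptions based on this thread.

def parse_rep_header(header):
    """Return (sequence name, OTU label, count) from a rep.fasta header line."""
    fields = header.lstrip(">").split()
    seq_name, otu_field = fields[0], fields[1]
    parts = otu_field.split("|")
    return seq_name, parts[0], int(parts[1])


def read_cons_taxonomy_sizes(lines):
    """Map OTU label -> Size, assuming tab-separated OTU, Size, Taxonomy columns."""
    sizes = {}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2 or cols[0] == "OTU":  # skip blank lines and the header row
            continue
        sizes[cols[0]] = int(cols[1])
    return sizes


# Values taken from the question; the taxonomy string is a placeholder.
header = "MISEQ_60_000000000-CVFVM_1_1101_10014_17238 Otu001|24534"
cons_lines = ["OTU\tSize\tTaxonomy\n", "Otu001\t162625\tplaceholder_taxonomy;\n"]

name, otu, rep_count = parse_rep_header(header)
sizes = read_cons_taxonomy_sizes(cons_lines)
print(otu, rep_count, sizes[otu], rep_count == sizes[otu])
# prints: Otu001 24534 162625 False
```

If the counts differ for every OTU, that would be consistent with one file counting unique sequences and the other counting all of them, but it does not prove it.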

Sigmund
