I am trying to get representative sequences for each OTU in my dataset by running the get.oturep command. The distance file I am using is the one generated by the dist.shared command (thetayc method).
The rep.fasta output that I get is confusing me. The first representative sequence is from the most abundant OTU1:
Am I right in understanding that 24534 is the number of sequences belonging to that OTU? This is where I have a problem: the total number of sequences assigned to that OTU in the cons.taxonomy file is 162625, which is much larger. Is there something I am doing wrong? Or does 162625 include non-unique sequences, while 24534 corresponds to unique sequences only?
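To make the unique-versus-total distinction I am asking about concrete, here is a minimal Python sketch. The sequence names and counts are made up for illustration (they are not from my files), and it assumes per-sequence read abundances are available, for example from a count table. It only shows how the two numbers could differ, not what get.oturep actually reports.

```python
# Hypothetical toy data, not taken from real mothur output:
# unique sequence names assigned to one OTU ...
otu_members = ["seqA", "seqB", "seqC"]
# ... and the number of reads each unique sequence represents.
read_counts = {"seqA": 5, "seqB": 2, "seqC": 1, "seqD": 7}


def otu_sizes(members, counts):
    """Return (number of unique sequences, total reads) for one OTU."""
    unique = len(members)
    total = sum(counts[name] for name in members)
    return unique, total


unique, total = otu_sizes(otu_members, read_counts)
print(unique, total)  # prints "3 8"
```

If the two numbers in my output relate in this way, the smaller one would be the count of unique sequences and the larger one the count of all reads.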
Thanks very much,