get.oturep and renaming of accessions

The get.oturep function is great for producing sequence subsets suitable for building consensus trees or other analyses (e.g. a dataset properly dereplicated at 97% sequence identity). Unfortunately, the fasta output (filename.fn.label.rep.fasta) renames each sequence, which prevents the data from being tracked in mothur in subsequent analyses. For example, I am trying to build a tree from the representative sequences and feed it back into weighted UniFrac within mothur, which requires the sequence names to match my .names and .groups files.

My request is an option to output two fasta files: one with the original sequence accessions, so they can be tracked in subsequent analyses (via a .names file), and another with the longer accessions currently output, which detail OTU counts and group membership.

Thank you.

Would it be ok if on the “>” line you had the following…

sequenceName[tab]extra information

???

Yes, I think so. Anything that can easily be parsed out automatically, either within mothur or in a batch routine, would be helpful, and I think that a tab would work.

Thanks,
Craig

This change will be part of 1.12.1 releasing later today.
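
For anyone who wants to recover the original names in a batch routine, here is a minimal sketch in Python. It assumes each “>” line has the form sequenceName, then a delimiter, then the extra information, and that the delimiter is a tab; if your output uses something else, pass a different delimiter. The function name strip_header_extras and the sample lines are hypothetical and only for illustration; this is not part of mothur.

```python
def strip_header_extras(lines, delimiter="\t"):
    """Yield FASTA lines, keeping only the text before the delimiter on '>' lines.

    Assumption: header lines look like '>sequenceName<delimiter>extra information'.
    Sequence lines and headers without the delimiter pass through unchanged.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Split once, at the first delimiter, and keep the original name.
            line = line.split(delimiter, 1)[0]
        yield line


# Hypothetical sample input, not real mothur output.
sample = [
    ">seq1\textra information\n",
    "ACGTACGT\n",
    ">seq2\n",
    "TTGGCCAA\n",
]
cleaned = list(strip_header_extras(sample))
print("\n".join(cleaned))
```

To run it over a real file, open the rep.fasta file, pass the file object to strip_header_extras, and write the yielded lines to a new file; the cleaned names should then match the entries in the .names and .groups files.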