get.oturep and renaming of accessions

nelsoncraige · July 20, 2010, 11:16pm

The get.oturep function is great for producing sequence subsets suitable for building consensus trees or other analyses (e.g. a dataset properly dereplicated at 97% sequence identity). Unfortunately, the fasta output (filename.fn.label.rep.fasta) renames each sequence, which prevents the data from being tracked in mothur in subsequent analyses. For example, I am trying to build a tree from the representative sequences and feed it back into weighted UniFrac within mothur, which requires the sequence names to match my .names and .groups files.

My request is that an option be given to output two fasta files: one with the original sequence accessions, so they can be tracked in subsequent analyses (via a .names file), and another containing the longer accessions currently output, which detail OTU counts and group membership.

Thank you.

pschloss · July 25, 2010, 1:53pm

Would it be ok if on the “>” line you had the following…

sequenceName<tab>extra information

???

nelsoncraige · July 25, 2010, 11:05pm

Yes, I think so. Anything that can easily be parsed out automatically, either within mothur or in a batch routine, would be helpful, and I think that a <tab> would work.

Thanks,
Craig
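
For anyone who wants the "batch routine" side of this, here is a minimal Python sketch of stripping the extra information from the ">" lines so that only the original sequence name remains. It assumes the name is separated from the extra information by whitespace (such as a tab); the function name and the example header contents are hypothetical, not mothur's actual output.

```python
def strip_fasta_header_info(lines):
    """Yield FASTA lines with each header reduced to the sequence name.

    Assumption: on a ">" line, the sequence name comes first and any
    extra information follows after whitespace (for example a tab).
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the text before the first whitespace character.
            fields = line[1:].split(None, 1)
            name = fields[0] if fields else ""
            yield ">" + name
        else:
            # Sequence lines pass through unchanged.
            yield line


# Hypothetical input: the text after the tab is made up for illustration.
example = [
    ">seqA\textra information\n",
    "ACGTACGT\n",
    ">seqB\textra information\n",
    "TTGACCAA\n",
]
for out in strip_fasta_header_info(example):
    print(out)
```

The same function can be pointed at an open file handle instead of a list, and the cleaned names should then match the ones in the .names and .groups files.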

westcott · July 29, 2010, 11:12am

This change will be part of 1.12.1 releasing later today.
