truncated seq IDs after filter?


I was trying to run through the analysis at:

I noticed that after I ran the filter command, the fasta file that resulted had sequence IDs that were missing the first letter. E.g. in the original fasta file, we had IDs like:


In the fasta file that I get after running the filter command (i.e. in mothur: filter.seqs(fasta=DOK03.aligned)), the above IDs appear as:


I was working with the sequence that I downloaded from the link given on the webpage above. Specifically:

(This is just a report to let you know about this.)
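For anyone who wants to check their own files for the same symptom, here is a minimal Python sketch. It is not part of mothur: the function names and the toy IDs are made up for illustration, and it assumes the filtered file keeps the sequences in the same order as the original.

```python
def read_ids(fasta_text):
    """Return the sequence IDs (first word after '>') found in FASTA text."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            ids.append(parts[0] if parts else "")
    return ids


def find_truncated_ids(original_ids, filtered_ids):
    """Pair IDs by position and keep the pairs where the filtered ID
    is the original ID with its first character missing.

    Assumption: both lists are in the same order.
    """
    return [
        (orig, filt)
        for orig, filt in zip(original_ids, filtered_ids)
        if orig != filt and orig[1:] == filt
    ]


# Hypothetical data, not taken from the real DOK03 files.
original = ">ABC123\nACGT\n>XYZ9\nGGCC\n"
filtered = ">BC123\nA-GT\n>XYZ9\nGG-C\n"
print(find_truncated_ids(read_ids(original), read_ids(filtered)))
```

An empty list would mean no ID lost exactly its first character.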



I am also experiencing this problem with the aligner, but only when using MPI. I have been working around this by prepending a non-space character to the identifiers prior to alignment. See the thread "accession numbers truncated" for details.

On Linux, I use the vi editor to do a simple substitution that replaces ‘>’ with ‘>_’ at the beginning of the FASTA headers:

vi sequences.fasta
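If you would rather script the workaround than edit by hand, a rough Python equivalent might look like the sketch below. The function name and the default "_" prefix are my own choices, not anything mothur provides. The effect should be the same as a vi substitution along the lines of :%s/^>/>_/.

```python
def prefix_fasta_ids(fasta_text, prefix="_"):
    """Insert `prefix` right after '>' on every FASTA header line.

    Sequence lines are passed through unchanged, and line endings
    are preserved.
    """
    lines = fasta_text.splitlines(keepends=True)
    return "".join(
        ">" + prefix + line[1:] if line.startswith(">") else line
        for line in lines
    )


# Hypothetical example input, not from the real data set.
example = ">seq1 some description\nACGT\n>seq2\nGGCC\n"
print(prefix_fasta_ids(example), end="")
```

To use it on a file, read the file contents into a string, pass it through prefix_fasta_ids, and write the result to a new file so the original stays untouched. If the first character does get dropped, only the extra "_" is lost.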

Thanks for reporting this bug. The fix will be part of 1.13.0.