truncated seq IDs after filter?

Hello,

I was trying to run through the analysis at:

http://www.mothur.org/wiki/Agricultural_soil_community_analysis

I noticed that after I ran the filter command, the fasta file that resulted had sequence IDs that were missing the first letter. E.g. in the original fasta file, we had IDs like:

DQ829627
DQ829626

In the fasta file that I get after running the filter command (i.e. in mothur: filter.seqs(fasta=DOK03.aligned)), the above IDs appear as:

Q829627
Q829626
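A quick way to confirm this pattern across a whole file is to compare the IDs before and after filtering. The following is only a rough Python sketch, not part of mothur; the function names read_fasta_ids and find_truncated_ids are made up for illustration, and it assumes the only damage is a missing first character:

```python
def read_fasta_ids(lines):
    """Collect the ID (first word after '>') from each FASTA header line."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip a bare '>' with no ID
                ids.append(fields[0])
    return ids


def find_truncated_ids(original_ids, filtered_ids):
    """Return (original, truncated) pairs where the filtered file holds the
    original ID minus its first character instead of the full ID."""
    filtered = set(filtered_ids)
    return [
        (seq_id, seq_id[1:])
        for seq_id in original_ids
        if seq_id not in filtered and seq_id[1:] in filtered
    ]


# The two IDs from this report:
pairs = find_truncated_ids(["DQ829627", "DQ829626"], ["Q829627", "Q829626"])
for original, truncated in pairs:
    print(original, "->", truncated)
```

read_fasta_ids accepts any iterable of lines, so an open file handle for the original and for the filtered fasta file can be passed to it directly.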

I was working with the sequence that I downloaded from the link given on the webpage above. Specifically:

http://www.mothur.org/w/images/c/c4/DOK03.zip

(This is just a report to let you know about this.)

cheers,

Bela

I am also experiencing this problem with the aligner, but only when using MPI. I have been working around it by prepending a non-space character to the identifiers prior to alignment. See the thread "accession numbers truncated" for details.

On Linux, I use the vi editor to replace the ‘>’ at the beginning of each FASTA header with ‘>_’:

vi sequences.fasta
:%s/^>/>_/g
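For anyone who would rather script the same workaround than edit the file by hand, here is a minimal Python sketch of the substitution. The names prefix_fasta_ids and prefix_fasta_file are hypothetical, and the underscore prefix is simply the one used in the vi command above; any non-space character should serve the same purpose.

```python
def prefix_fasta_ids(lines, prefix="_"):
    """Yield FASTA lines with `prefix` inserted right after the leading '>'.

    Sequence lines (anything not starting with '>') pass through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            yield ">" + prefix + line[1:]
        else:
            yield line


def prefix_fasta_file(src_path, dst_path, prefix="_"):
    """Write a copy of the FASTA file at src_path with prefixed header IDs."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(prefix_fasta_ids(src, prefix))
```

Calling prefix_fasta_file("sequences.fasta", "sequences.prefixed.fasta") would leave the original file untouched, unlike the in-place vi edit; the output filename here is only an example.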

Thanks for reporting this bug. The fix will be part of 1.13.0.