truncated seq IDs after filter?

btiwari · June 23, 2010, 3:02pm

Hello,

I was trying to run through the analysis at:

http://www.mothur.org/wiki/Agricultural_soil_community_analysis

I noticed that after I ran the filter command, the fasta file that resulted had sequence IDs that were missing the first letter. E.g. in the original fasta file, we had IDs like:

DQ829627
DQ829626

In the fasta file that I get after running the filter command (i.e. in mothur: filter.seqs(fasta=DOK03.aligned)), the above IDs appear as:

Q829627
Q829626
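
A quick way to confirm the symptom across a whole file is to compare the IDs before and after filtering. The following is only a minimal Python sketch, assuming plain-text FASTA input; the function names `read_ids` and `truncated_ids` are made up for illustration and are not part of mothur:

```python
def read_ids(lines):
    """Return sequence IDs (first token after '>') in file order."""
    return [
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    ]


def truncated_ids(original_ids, filtered_ids):
    """Pair each original ID with a filtered ID that lost its first character."""
    filtered = set(filtered_ids)
    return [
        (orig, orig[1:])
        for orig in original_ids
        # Report only IDs that are missing as-is but present minus one character.
        if orig not in filtered and orig[1:] in filtered
    ]
```

To use it, read the header lines of the input file and of the filtered output with `read_ids` and pass both lists to `truncated_ids`; an empty result means no ID lost its first letter.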

I was working with the sequence that I downloaded from the link given on the webpage above. Specifically:

http://www.mothur.org/w/images/c/c4/DOK03.zip

(This is just a report to let you know about the problem.)

cheers,

Bela

ctparker · August 11, 2010, 1:35pm

I am also experiencing this problem with the aligner, but only when using MPI. I have been working around it by prepending a non-space character to the identifiers prior to alignment. See the thread "accession numbers truncated" for details.

On Linux, I use the vi editor to do a simple substitution that replaces '>' with '>_' at the beginning of the FASTA headers:

vi sequences.fasta
:%s/^>/>_/g
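
For files that are too large to edit comfortably in vi, the same workaround can be scripted. This is a minimal Python sketch, assuming plain-text FASTA; the function names and the file names in the usage note are hypothetical:

```python
def prefix_fasta_headers(lines, prefix="_"):
    """Yield FASTA lines with `prefix` inserted right after the leading '>'."""
    for line in lines:
        if line.startswith(">"):
            # Header line: keep '>' first, then the sacrificial prefix character.
            yield ">" + prefix + line[1:]
        else:
            # Sequence line: pass through unchanged.
            yield line


def prefix_fasta_file(src_path, dst_path, prefix="_"):
    """Write a copy of the FASTA file at src_path with prefixed headers."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(prefix_fasta_headers(src, prefix))
```

For example, `prefix_fasta_file("sequences.fasta", "sequences.prefixed.fasta")` writes a copy whose headers start with '>_'. If the first character is then dropped, only the underscore is lost and the accession number stays intact.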

westcott · August 18, 2010, 10:16am

Thanks for reporting this bug. The fix will be part of 1.13.0.
