filter.seqs fails with processors option

Hi Pat & co.,

thanks for developing and maintaining mothur!

Today we ran into a problem with filter.seqs. We were filtering a medium-sized dataset of roughly 400,000 sequences, and I added the option processors=4 on a 4-core machine. After a while mothur failed:

88400
88472
Sequences are not all the same length, please correct.

The alignment had been produced by mothur beforehand and was valid, as was the filter. The same command worked fine without the processors option.

best regards
Thomas

What version of mothur are you using?

Oh sorry, I forgot to mention that we’re using version 1.11.0.

Best,
Thomas

Could you send your logfile, fasta file and filter file to mothur.bugs@gmail.com?

Sure. The mail with the download links to the data has just been sent. It also includes a description of a similar bug in classify.seqs.

best,
Thomas

Hi Thomas,
Thanks for sending your files. I was able to track down the problem. When mothur uses multiple processors, we divide the file into pieces and each processor works through one piece. We store the starting position in the file for each process. With 4 processes, one of those starting positions was too large to fit in the data type we used. This will be changed for version 1.12.0. Thanks for your help in tracking down the bug.
-Sarah
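
For readers who hit the same symptom, here is a rough sketch of the kind of failure Sarah describes. It is not mothur's code. It assumes the starting positions were kept in a signed 32-bit integer (the thread does not name the data type), uses a hypothetical 3 GiB file size, and splits the file evenly without aligning the pieces to sequence boundaries. The helper names chunk_starts and as_int32 are made up for illustration.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def chunk_starts(file_size, processors):
    """Evenly spaced byte offsets, one per process.

    Simplification: a real splitter would move each offset to the
    start of the next sequence record; this sketch does not.
    """
    chunk = file_size // processors
    return [i * chunk for i in range(processors)]


def as_int32(value):
    """Return the value as it reads back from a signed 32-bit integer."""
    return ctypes.c_int32(value).value


file_size = 3 * 1024**3  # hypothetical 3 GiB alignment file (assumption)
starts = chunk_starts(file_size, 4)
stored = [as_int32(s) for s in starts]

for proc, (true_pos, kept) in enumerate(zip(starts, stored)):
    status = "ok" if true_pos == kept else "corrupted"
    print(proc, true_pos, kept, status)
```

Under these assumptions the first three offsets survive, but the fourth is larger than INT32_MAX and reads back as a different number, so that process would start from the wrong place in the file. That would be consistent with the run succeeding without the processors option, where no offsets are needed. Storing offsets in a 64-bit type avoids the problem for files of this size.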

Hi Sarah,

Great that you could find and fix the bug so quickly. Thanks a lot!

Best regards
Thomas