mothur

align.seqs and summary.seqs issue


#1

After aligning against the Silva alignment file, which I had modified with my oligos, I ran the count.seqs and summary.seqs commands and received the following error messages from summary.seqs:

[ERROR]: ‘M03580_84_000000000-BWWYH_1_2118_27365_10822’ is not in your name or count file, please correct.
[ERROR]: Your count file contains 3612909 unique sequences, but your fasta file contains 2. File mismatch detected, quitting command.

Here are the commands I used:

align.seqs(fasta=oil_sed.trim.contigs.good.unique.fasta, reference=silva.nr_v132.pcr.align, flip=T)
count.seqs(name=oil_sed.trim.contigs.good.names, group=oil_sed.contigs.good.groups)
summary.seqs(fasta=oil_sed.trim.contigs.good.unique.align, count=oil_sed.trim.contigs.good.count_table, processors=24)

#2

Did the align.seqs command complete? I suspect it did not, perhaps due to the size of the reference combined with the number of processors: the more processors you use, the more memory is required. If the align.seqs command did complete, could you send your logfile and input files to mothur.bugs@gmail.com?
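One quick way to see whether the alignment output is complete is to count the sequences in the .align file and compare that number with what the count file reports. The snippet below is only a minimal sketch in Python, not part of mothur; the function name `count_fasta_records` is made up for illustration, and it assumes the .align file is ordinary FASTA where every record starts with a `>` header line.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA-style file (e.g. a .align file).

    Assumption: each record begins with a header line starting with '>'.
    """
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Example call (file name taken from the commands in post #1):
# print(count_fasta_records("oil_sed.trim.contigs.good.unique.align"))
```

If the number printed is far smaller than the number of unique sequences expected, the alignment output was probably truncated.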


#3

The align.seqs command did complete. When I posted my issue, I had designated 4 processors for the align.seqs command and 24 processors for summary.seqs. Since then, I tried running align.seqs with only 1 processor, because I have had problems in the past when using multiple. When I ran the align.seqs command again, I didn’t include the processors parameter at all, thinking that it would default to 1, but instead it defaulted to 56. That run also completed, but I got the same error. I will send my logfile and input files to the email you provided. Thanks!


#4

Update on issue:

Sarah Westcott ran align.seqs and summary.seqs with my files and was able to get the alignment file and summary file without a hitch.

The issue, as it turns out, was that I was using the count_table file in summary.seqs after alignment. This revealed that the problem actually lay in my .groups file output from make.contigs, which had 16 million groups instead of the expected 129. It would seem I must now troubleshoot the make.contigs step.
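For anyone who hits the same thing, a check like the one below would have caught the bad .groups file early. It is only a rough sketch in Python, not a mothur command; the function name `count_groups` is made up for illustration, and it assumes the usual two-column layout of a group file (sequence name, then group name, separated by whitespace).

```python
from collections import Counter


def count_groups(path):
    """Return a Counter mapping each group name to its number of sequences.

    Assumption: each line holds a sequence name followed by a group name,
    separated by whitespace (the usual mothur group file layout).
    """
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            counts[parts[1]] += 1
    return counts


# Example call (file name taken from the count.seqs command in post #1):
# counts = count_groups("oil_sed.contigs.good.groups")
# print(len(counts))            # number of distinct groups
# print(counts.most_common(5))  # a few group names, to see what they look like
```

If the number of distinct groups is close to the number of sequences rather than the number of samples, the group column most likely holds something other than sample names.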