Greetings,
Using the latest MOTHUR (v.1.9.0) I keep running into a problem with the screen.seqs command.
Whenever I try to cull sequences that do not meet certain criteria, I get the message “Your groupfile does not include the sequence … please correct”.
Indeed, when I check the resulting “good.group” file, those sequences have not been culled, although many other sequences that did not meet the criteria were culled and ended up in the “bad.group” file as they should.
Similar things happen when I use the screen.seqs command on aligned FASTA files: I get the message “Your namefile does not include the sequence…”
Could this be a bug or is it my input files?
I was wondering if anybody has suggestions.
Best,
Guus Roeselers
Example:
mothur > screen.seqs(fasta=BATCH1.fasta, group=BATCH1.group, minlength=200, maxlength=280, maxhomop=10)
Your groupfile does not include the sequence GCVP7QP02B34BT please correct.
Your groupfile does not include the sequence GCVP7QP02B9IAU please correct.
Your groupfile does not include the sequence GCVP7QP02BQ1MQ please correct.
Your groupfile does not include the sequence GCVP7QP02BZIZ3 please correct.
Your groupfile does not include the sequence GCVP7QP06HE78K please correct.
From lines 14646 through 178567 of your groups file, all of the sequence names begin with a “>” character. If you remove all of these characters, screen.seqs runs without a problem. It looks like these sequences were generated using 454. Although it isn’t a problem to make your own groups file for 454 data, sometimes it might just be easier to use the trim.seqs command and have us do it for you :).
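If you would rather fix the existing groups file than regenerate it, the stray “>” characters can be stripped with a small script. This is only a rough sketch, not part of mothur: it assumes the groups file has one sequence per line with the sequence name in the first column, and the function names and the output file name are made up for illustration.

```python
def strip_leading_gt(lines):
    """Return the lines with a leading '>' removed from each sequence name."""
    cleaned = []
    for line in lines:
        # Only the very first character is touched; the rest of the line
        # (name, separator, group label) is left exactly as it was.
        cleaned.append(line[1:] if line.startswith(">") else line)
    return cleaned


def fix_group_file(src_path, dst_path):
    """Hypothetical helper: read a groups file and write a cleaned copy."""
    with open(src_path) as src:
        fixed = strip_leading_gt(src.readlines())
    with open(dst_path, "w") as dst:
        dst.writelines(fixed)
```

Calling something like fix_group_file("BATCH1.group", "BATCH1.fixed.group") would write a cleaned copy and leave the original untouched, so you can compare the two before rerunning screen.seqs.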
If you have opened your group file and the sequence does appear to be in the file, it could be an extra spacing issue or perhaps a duplicate line issue. To troubleshoot it, you could try the following:
mothur > set.dir(debug=t)
mothur > get.groups(group=yourGroupFile, groups=OneOfYourGroups) - this will force a read of your group file with extra output from mothur
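Outside of mothur, you could also compare the names in the two files directly. The following is a minimal sketch under a few assumptions: sequence names in the fasta file are the text after “>” up to the first whitespace, the group file has the sequence name in its first whitespace-separated column, and all function names are invented for this example.

```python
from collections import Counter


def fasta_names(lines):
    """Sequence names from FASTA header lines (text after '>' up to whitespace)."""
    names = []
    for line in lines:
        if line.startswith(">") and line[1:].strip():
            names.append(line[1:].split()[0])
    return names


def group_names(lines):
    """First whitespace-separated column of every non-blank group-file line."""
    return [line.split()[0] for line in lines if line.strip()]


def compare(fasta_lines, group_lines):
    """Return (names in the fasta but not the group file, duplicated group names)."""
    in_fasta = fasta_names(fasta_lines)
    in_group = group_names(group_lines)
    missing = sorted(set(in_fasta) - set(in_group))
    duplicates = sorted(n for n, c in Counter(in_group).items() if c > 1)
    return missing, duplicates
```

Reading both files with readlines() and passing the lists to compare() gives two lists to inspect. A name that shows up as missing even though it looks present in the group file usually points to a stray character, such as a leading “>”, or to a spacing problem.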
Thanks for your reply. It seems to give me a huge list of these sequences that are not in my groups file. I tried to start from the very beginning again and now I’m getting stuck at the summary.seqs command after trim.seqs. It tells me: