Followed the SOP to this point and got another error message. It seems like I lost some sequences in the name file compared to the group file.
I have no idea which step might have gone wrong. Please help.
[ERROR]: Your name file contains 113588 valid sequences, and your groupfile contains 115543, please correct.
[ERROR]: process 0 only processed 1 of 5 groups assigned to it, quitting.
/******************************************/
Running command: unique.seqs(fasta=s3.shhh.trim.unique.good.filter.unique.precluster.fasta, name=s3.shhh.trim.unique.good.filter.unique.precluster.names)
[ERROR]: s3.shhh.trim.unique.good.filter.unique.precluster.fasta is blank, aborting.
Using s3.shhh.trim.unique.good.filter.unique.fasta as input file for the fasta parameter.
[ERROR]: s3.shhh.trim.unique.good.filter.unique.precluster.names is blank, aborting.
I started doing this a while back (before mothur used count tables) so I don’t know if it’s necessary anymore, although if you’ve encountered this problem then this might be a helpful addition to your pipeline. Typically I’ll check that there are actually sequence names in the bad_seqs.accnos file before I run remove.seqs because sometimes it’s not necessary. And if you’re working on Windows you’ll need to use ‘find’ instead of ‘grep’.
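If you would rather not depend on grep or find, the same check can be sketched in a few lines of Python. This is only an illustration: `count_accnos_names` is a made-up helper name, and it assumes the accnos file lists one sequence name per line.

```python
from pathlib import Path


def count_accnos_names(path):
    """Count the non-blank lines (sequence names) in an accnos file.

    Hypothetical helper, not part of mothur. A missing file is treated
    the same as an empty one.
    """
    accnos = Path(path)
    if not accnos.exists():
        return 0
    with accnos.open() as handle:
        return sum(1 for line in handle if line.strip())
```

If `count_accnos_names("bad_seqs.accnos")` comes back as 0, there is nothing to remove and the remove.seqs step can be skipped.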
When the number of bases in the aligned sequence falls below 50% of the original number of bases and flip=t, mothur will try to align the reverse complement of the sequence. When both the forward and the reverse alignments lose more than 50% of the bases (a poor alignment), mothur reports the sequence in the file. You can easily remove these sequences by setting the minlength parameter in the screen.seqs command. You can also adjust the flip threshold (the 50% number) with the threshold parameter of the align.seqs command. For example, threshold=0.60 would set it to 60%.
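To make the threshold rule concrete, here is a rough sketch of the decision as described above. It is not mothur's code: the function names and return labels are invented for illustration, and it assumes you already know the base counts before and after alignment.

```python
def is_poor_alignment(original_bases, aligned_bases, threshold=0.50):
    """True when the aligned sequence keeps fewer than `threshold` of the original bases."""
    return aligned_bases < threshold * original_bases


def classify_alignment(original_bases, forward_aligned, reverse_aligned, threshold=0.50):
    """Illustrative version of the flip logic (assumes flip=t).

    Returns "forward" if the forward alignment is acceptable, "reverse" if
    only the reverse complement aligns acceptably, and "reported" if both
    are poor. These labels are hypothetical, not mothur output.
    """
    if not is_poor_alignment(original_bases, forward_aligned, threshold):
        return "forward"
    if not is_poor_alignment(original_bases, reverse_aligned, threshold):
        return "reverse"
    return "reported"
```

With a 100-base read, a forward alignment of 55 bases passes at the default threshold but counts as poor at threshold=0.60, which is why raising the threshold makes the flip more sensitive.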
mothur > list.seqs(group=s3.shhh.good.groups) - list the sequences in your group file
mothur > get.seqs(fasta=s3.shhh.trim.unique.good.filter.unique.fasta, name=s3.shhh.trim.unique.good.filter.names, dups=false) - select only the names in the group file.
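If you want to see exactly which sequences account for the mismatch in the error message, a small script can compare the two files directly. This is a sketch outside of mothur with made-up function names. It assumes the usual layouts: a name file with a representative name followed by a comma-separated list of all the names it represents, and a group file with a sequence name followed by its group.

```python
def names_in_name_file(path):
    """Collect every sequence name listed in the second column of a name file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                names.update(n for n in fields[1].split(",") if n)
    return names


def names_in_group_file(path):
    """Collect every sequence name listed in the first column of a group file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                names.add(fields[0])
    return names


def compare_name_and_group(name_path, group_path):
    """Return (only_in_group, only_in_names) as sorted lists of sequence names."""
    in_names = names_in_name_file(name_path)
    in_groups = names_in_group_file(group_path)
    return sorted(in_groups - in_names), sorted(in_names - in_groups)
```

The first list holds the names that are in the group file but missing from the name file, which is the direction the error in this thread points to.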