I’m trying to run chimera.slayer() for comparison with a new chimera-finding algorithm I’m working on. It appears chimera.slayer() uses blast search by default, contrary to the wiki manpage that says “Choices [for search=] are distance, blast, and kmer, default distance”. Or, is the ‘distance’ method also blast-based?
What are the recommended default parameters? It’s not clear to me whether the mismatch is a bug in the code or in the documentation. If blast is the intended default, is there a particular NCBI version I should install, or should I just go with the latest blast+ binaries?
It seems to work ok with search=kmer.
If I try search=distance, I get this error: “distance is not a valid search”.
Example commands & errors follow. I’m using v.1.36.1 under Ubuntu Linux [sic, despite the .exe’s in the error messages].
mothur "#chimera.slayer(fasta=silva.gold.align, template=silva.gold.align)"
[ERROR]: /home/bob/bin/blast/bin/formatdb file does not exist. mothur requires formatdb.exe.
[ERROR]: /home/bob/bin/blast/bin/blastall file does not exist. mothur requires blastall.exe.
[ERROR]: /home/bob/bin/blast/bin/megablast file does not exist. mothur requires megablast.exe.
mothur "#chimera.slayer(fasta=silva.gold.align, template=silva.gold.align, search=distance)"
Error: distance is not a valid search.
The search options for chimera.slayer are blast and kmer; sorry for any confusion caused by past documentation. By default, mothur looks for the blast executables it needs in the location of mothur’s executable. Looking at your output, it appears mothur is located in /home/bob/bin/. You can either use the blastlocation parameter to tell mothur where blast is installed on your machine, or move blast to mothur’s location.
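For anyone scripting these runs, here is a minimal Python sketch of how the command line could be assembled. It only builds the argument list; the helper name chimera_slayer_argv and the example path /opt/blast/bin/ are hypothetical, and the only mothur parameters used are the ones mentioned in this thread (fasta, template, search, blastlocation).

```python
# Search methods accepted by chimera.slayer, per the reply above.
VALID_SEARCH = ("blast", "kmer")


def chimera_slayer_argv(fasta, template, search=None, blastlocation=None):
    """Build the argv list for a one-off chimera.slayer run.

    Hypothetical helper: it only formats the command, it does not run mothur.
    """
    if search is not None and search not in VALID_SEARCH:
        raise ValueError(
            f"{search} is not a valid search; choose one of {VALID_SEARCH}"
        )
    params = [f"fasta={fasta}", f"template={template}"]
    if search is not None:
        params.append(f"search={search}")
    if blastlocation is not None:
        # Directory that holds the blast executables (assumed path in examples).
        params.append(f"blastlocation={blastlocation}")
    # mothur takes a command-line batch string that starts with '#'.
    return ["mothur", "#chimera.slayer(" + ", ".join(params) + ")"]
```

The resulting list can be handed to subprocess.run(argv, check=True), which avoids shell quoting problems around the # and parentheses.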
That worked great, thanks!
Follow-up question – can you suggest how to implement “leave-one-out” (l-o-o) false positive testing for the chimera.slayer() command? In the UCHIME paper we tested for false positives by doing a l-o-o test with each sequence in the gold database as query and the remaining sequences as the ref db. The -self option of UCHIME supports l-o-o by telling UCHIME to ignore a hit to a database sequence with the same label as the query, so the whole test can be done in one run without modifying the database explicitly.
Unless mothur has a similar option that I missed (which would be great!), it will be very expensive to implement because I’ll have to make 5,000 different reference databases by deleting one sequence at a time from silva.gold.align. Is there an option I missed which works like -self, or can you point me where in the source code I could hack one for myself?
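For reference, the brute-force route described above could be sketched in Python roughly as follows. This is not a mothur option and not how UCHIME's -self works internally; it simply writes one query file and one reduced reference file per sequence. All function and file names here (read_fasta, write_leave_one_out_sets, query_N.fasta, ref_N.fasta) are made up for illustration.

```python
from pathlib import Path


def read_fasta(path):
    """Return a list of (label, sequence) pairs from a FASTA file."""
    records = []
    label, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if label is not None:
                    records.append((label, "".join(chunks)))
                label, chunks = line[1:], []
            elif line and label is not None:
                chunks.append(line)
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


def write_fasta(records, path):
    """Write (label, sequence) pairs, one sequence per line."""
    with open(path, "w") as handle:
        for label, seq in records:
            handle.write(f">{label}\n{seq}\n")


def leave_one_out(records):
    """Yield (query, reference) where reference is every record except the query."""
    for i, query in enumerate(records):
        yield query, records[:i] + records[i + 1:]


def write_leave_one_out_sets(fasta_path, out_dir):
    """Write query_N.fasta / ref_N.fasta pairs and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pairs = []
    for i, (query, reference) in enumerate(leave_one_out(read_fasta(fasta_path))):
        query_path = out_dir / f"query_{i}.fasta"
        ref_path = out_dir / f"ref_{i}.fasta"
        write_fasta([query], query_path)
        write_fasta(reference, ref_path)
        pairs.append((query_path, ref_path))
    return pairs
```

Each pair would then be passed as fasta= and template= to a separate chimera.slayer run. Writing every reference file up front costs disk space that grows with the square of the database size, so in practice it may be better to write, run, and delete one pair at a time.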