Hello! I have a pretty massive dataset that I am working with, and I suspect that the dist.seqs step will take many hours to run. I want to run this step as a batch job; however, I have mothur installed in a conda environment. Is there a way to run a mothur batch job from within a conda environment?
Absolutely. mothur is available in conda. With the environment active, you’d run mothur from the command line, either by feeding it a batch file or by passing it the command you want to run:
(my_env) $ mothur "#dist.seqs(fasta=my_sequences.fasta)"
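For the batch-file route, here is a minimal sketch of a script you could hand to a job scheduler. The environment name `my_env` and the file name `dist.batch` are placeholders, and it assumes `conda run` is available in your conda install; activating the environment first and calling `mothur dist.batch` directly should work just as well.

```shell
#!/usr/bin/env bash
# Sketch: write a one-line mothur batch file, then run it inside a conda env.
# ENV_NAME and BATCH_FILE are hypothetical names; change them to match your setup.
ENV_NAME="my_env"
BATCH_FILE="dist.batch"

# A mothur batch file is plain text with one mothur command per line.
cat > "$BATCH_FILE" <<'EOF'
dist.seqs(fasta=my_sequences.fasta)
EOF

if command -v conda >/dev/null 2>&1; then
    # conda run executes the command inside the named environment
    # without needing an interactive "conda activate".
    if ! conda run -n "$ENV_NAME" mothur "$BATCH_FILE"; then
        echo "mothur did not finish cleanly; check the env name and input files" >&2
    fi
else
    echo "conda not found; activate your environment and run: mothur $BATCH_FILE" >&2
fi
```

Because the batch file is just a list of commands, you can add further mothur steps to it, one per line, and they will run in order.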
Also, we rarely run dist.seqs and instead use cluster.split, giving it our fasta and taxonomy files. This significantly reduces the number of distances you’d have to calculate and then allows you to parallelize the clustering step.
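As a rough sketch of what that could look like as a batch file: the file names below are placeholders, and the count file plus the `taxlevel` and `cutoff` values follow the pattern used in the mothur MiSeq SOP, so treat them as assumptions to adjust for your own data.

```shell
#!/usr/bin/env bash
# Sketch: batch file that calls cluster.split instead of dist.seqs.
# All file names and parameter values here are illustrative assumptions.
BATCH_FILE="cluster.batch"

cat > "$BATCH_FILE" <<'EOF'
cluster.split(fasta=my_sequences.fasta, count=my_sequences.count_table, taxonomy=my_sequences.taxonomy, taxlevel=4, cutoff=0.03)
EOF

# Run it only if mothur is on the PATH (for example, inside the activated conda env).
if command -v mothur >/dev/null 2>&1; then
    mothur "$BATCH_FILE" || echo "mothur did not finish cleanly; check the input files" >&2
else
    echo "mothur not found; activate your conda environment and run: mothur $BATCH_FILE" >&2
fi
```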
Pat