This is about the mothur workflow around the cluster.split command in a PacBio SOP.
The command is cluster.split(fasta= ...precluster.pick.pick.fasta,
count= .. denovo.vsearch.pick.pick.count_table,
splitmethod=classify, taxlevel=4, cutoff=0.03)
I expect to get three output files (dist, list, sensspec), created in that order.
All good so far?
So my question is: do I need to wait for the sens.spec() part of cluster.split()
to complete? Once I have the *.list file, can I just use it to move on to
make.shared(), classify.otu(), etc.?
The sens.spec() part takes two hours or so to run to completion.
The make.shared(), classify.otu, count.groups steps all together take less than
Is that KOSHER?
Also, when I run the same workflow on the same data with different processor counts, I see some variation in the sensspec output. Is that normal? How much variation should I see between runs? I found this while benchmarking mothur for a new server. It has been a while since I used mothur (I last used the MPI version).
What is actually happening (analysis-wise) in sens.spec()?
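
To make that last question concrete, here is a minimal Python sketch of the kind of calculation the sensspec columns (tp, tn, fp, fn, sensitivity, specificity, mcc) seem to imply. It assumes that every pair of sequences is checked against the distance cutoff and against the OTU assignments in the list file. This is only an assumption for illustration, not mothur's actual code: the function name, the toy data, the use of <= at the cutoff, and the treatment of missing pairs as "far apart" are all made up here.

```python
from itertools import combinations
from math import sqrt


def sens_spec(distances, otu_of, cutoff):
    """Pairwise confusion matrix for an OTU assignment (illustrative only).

    distances: dict mapping frozenset({a, b}) -> distance; pairs that are
               missing are assumed to be farther apart than the cutoff.
    otu_of:    dict mapping sequence name -> OTU label.
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        close = distances.get(frozenset((a, b)), 1.0) <= cutoff
        same_otu = otu_of[a] == otu_of[b]
        if close and same_otu:
            tp += 1  # similar sequences, clustered together
        elif close:
            fn += 1  # similar sequences, split across OTUs
        elif same_otu:
            fp += 1  # dissimilar sequences, clustered together
        else:
            tn += 1  # dissimilar sequences, kept apart

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "specificity": specificity,
            "mcc": mcc}


# Toy data (hypothetical): four sequences, two OTUs, cutoff 0.03.
toy_distances = {
    frozenset(("A", "B")): 0.01,
    frozenset(("A", "C")): 0.02,
    frozenset(("B", "C")): 0.05,
}
toy_otus = {"A": "Otu1", "B": "Otu1", "C": "Otu1", "D": "Otu2"}
result = sens_spec(toy_distances, toy_otus, cutoff=0.03)
print(result)
```

If that is roughly what happens, the step only scores the clustering that is already in the list file, which is why I am asking whether the downstream commands can start without it.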