Possible to Combine Cluster.Split Output Files

This is a bit of an unorthodox question, but I am curious whether there is a workaround.

I am running cluster.split with commands identical to these (with my file names substituted).

cluster.split(fasta=final.fasta, name=final.names, taxonomy=final.taxonomy, taxlevel=4, cluster=f, processors=8)
cluster.split(file=final.file, processors=2)

I was able to generate the distance matrix (it’s huge, but I got it) and the Files file. However, I am running into RAM issues on the second cluster.split command: it generates about 75% of the .opti_mcc.list files before failing. It seems that I can pass the Files file through multiple times with different samples included, so I could potentially generate all the .opti_mcc.list files for all my samples.

My question is how I could then generate the “stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.opti_mcc.list” file that you use as input to make.shared. Is that possible? It seems that some step at the end of cluster.split concatenates the opti_mcc.list files for each sample into one grouped opti_mcc.list, and I am trying to figure out how to do this once I have generated all the individual files through separate runs of the second cluster.split command.

Why not use dist.seqs with cutoff=0.03 and opticlust?

In a previous thread, the suggestion was to split the cluster.split commands for processing. I just can’t seem to get the second cluster.split command to run in one pass. I can only get it to run piecemeal, generating the individual sample .opti_mcc.list files.

You would have to do it manually or write a script to concatenate the individual list files.
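
A minimal sketch of such a script in Python, not a confirmed equivalent of what cluster.split does internally. It assumes each list file is tab-separated with a label column (for example 0.03), an OTU count, and then one column per OTU holding comma-separated sequence names, optionally preceded by a header row that starts with "label". The function names and the Otu naming in the header are made up for illustration. Check the result against a small dataset that mothur can process in one run before relying on it.

```python
def read_list_file(path):
    """Return {label: [otu, ...]} for one list file.

    Assumed layout (tab-separated): label, numOtus, then one column per OTU.
    A header row whose first field is "label" is skipped if present.
    """
    otus_by_label = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2 or fields[0] == "label":
                continue
            # fields[1] is the OTU count; it is recomputed on output.
            otus_by_label[fields[0]] = [f for f in fields[2:] if f]
    return otus_by_label


def merge_list_files(paths, label):
    """Concatenate the OTUs found at `label` across all input files."""
    merged = []
    seen = set()
    for path in paths:
        otus = read_list_file(path)
        if label not in otus:
            raise ValueError(f"{path} has no row for label {label!r}")
        for otu in otus[label]:
            # A sequence should belong to exactly one OTU overall.
            for name in otu.split(","):
                if name in seen:
                    raise ValueError(f"{name} appears more than once")
                seen.add(name)
            merged.append(otu)
    return merged


def write_list_file(path, label, otus):
    """Write one merged row plus a header with zero-padded OTU names."""
    width = len(str(len(otus)))
    names = [f"Otu{i:0{width}d}" for i in range(1, len(otus) + 1)]
    with open(path, "w") as handle:
        handle.write("\t".join(["label", "numOtus", *names]) + "\n")
        handle.write("\t".join([label, str(len(otus)), *otus]) + "\n")
```

To use it, collect the paths of the individual .opti_mcc.list files from every run, call merge_list_files(paths, "0.03") with whatever label your files actually contain, and pass the result to write_list_file. The duplicate check is there because files left over from overlapping runs would otherwise put the same sequence into two OTUs.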

Is there a way to concatenate the files in mothur? I looked at a few options but none of them seemed appropriate.

Sorry but there isn’t. The commands assume that you run them all the way through with cluster.split.

You likely need to reduce the number of processors. You need more RAM than the size of your largest dist file multiplied by the number of processors requested. So if you have a 50 GB dist file and want to use 8 processors, you need 400 GB of RAM.
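
The rule of thumb above can be turned into a quick calculation. This is only a sketch of that arithmetic: the function names are made up for illustration, and the 128 GB machine is a hypothetical example, not a figure from this thread.

```python
def ram_needed_gb(largest_dist_gb, processors):
    """RAM that the run is estimated to exceed: largest dist file x processors."""
    return largest_dist_gb * processors


def max_processors(largest_dist_gb, ram_gb):
    """Largest processor count whose estimate stays strictly below ram_gb.

    Returns 0 when even a single processor would not fit.
    """
    p = int(ram_gb // largest_dist_gb)
    if p * largest_dist_gb >= ram_gb:
        p -= 1
    return max(p, 0)


print(ram_needed_gb(50, 8))     # 400
print(max_processors(50, 128))  # 2
```

With a 50 GB dist file on a hypothetical 128 GB machine, this suggests processors=2 at most. Treat the result as an upper bound on the processor count, since other processes also use memory.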