cluster.split issues

Hi,
I’m trying to run the cluster.split command.
Here’s what I ran:
cluster.split(fasta=concatene.unique.pick.filter.filter.pick.pick.align,count=concatene.pick.pick.pick.count_table,taxonomy=taxo.pick.taxonomy,cutoff=0.03,taxlevel=7,processors=8)
After several days, I get: “process stopped”.
To understand the issue, I turned on the debug flag (set.dir(debug=t)) and reran cluster.split.
This is what I get:
Using 8 processors.
Using splitmethod fasta.
Splitting the file…
unknown unknown
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1102_26846_19171’ tax = ‘Bacteria;Firmicutes;Clostridia;Clostridiales;Ruminococcaceae;Ruminococcus;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1109_25868_16050’ tax = ‘Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_2102_3589_19726’ tax = ‘Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_2102_3575_19737’ tax = ‘Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_2106_19764_11438’ tax = ‘Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_2108_28397_18710’ tax = ‘Bacteria;Firmicutes;Clostridia;Clostridiales;Lachnospiraceae;Clostridium_XlVa;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_2111_25747_8206’ tax = ‘Bacteria;Fibrobacteres;Fibrobacteria;Fibrobacterales;Fibrobacteraceae;Fibrobacter;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1102_7611_17082’ tax = ‘Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1102_13444_19027’ tax = ‘Bacteria;Proteobacteria;Gammaproteobacteria;Orbales;Orbaceae;Frischella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1103_17682_4347’ tax = ‘Bacteria;Proteobacteria;Gammaproteobacteria;Orbales;Orbaceae;Frischella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1104_8058_4857’ tax = ‘Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Spirochaetaceae;Treponema;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1104_11178_14114’ tax = ‘Bacteria;Proteobacteria;Gammaproteobacteria;Orbales;Orbaceae;Frischella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1104_14068_17084’ tax = ‘Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1105_18959_7255’ tax = ‘Bacteria;Firmicutes;Negativicutes;Selenomonadales;Veillonellaceae;Megasphaera;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1105_20151_10243’ tax = ‘Bacteria;Firmicutes;Clostridia;Clostridiales;Ruminococcaceae;Sporobacter;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1105_25647_18364’ tax = ‘Bacteria;Fibrobacteres;Fibrobacteria;Fibrobacterales;Fibrobacteraceae;Fibrobacter;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1105_14195_20522’ tax = ‘Bacteria;Proteobacteria;Alphaproteobacteria;Rhodospirillales;Acetobacteraceae;Acetobacter;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1106_17146_10681’ tax = ‘Bacteria;Firmicutes;Erysipelotrichia;Erysipelotrichales;Erysipelotrichaceae;Sharpea;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1106_6859_23273’ tax = ‘Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1107_16075_5859’ tax = ‘Bacteria;Proteobacteria;Gammaproteobacteria;Orbales;Orbaceae;Frischella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1107_11916_15994’ tax = ‘Bacteria;Proteobacteria;Gammaproteobacteria;Orbales;Orbaceae;Frischella;’
[DEBUG]: name = ‘HWI-M01323_171_000000000-AA4RL_1_1108_13261_1693’ tax = ‘Bacteria;Firmicutes;Negativicutes;Selenomonadales;Veillonellaceae;Megasphaera;’

I checked my count_table and my taxonomy file for extra spaces at the ends of lines, but found none.

count_table:
HWI-M01323_171_000000000-AA4RL_1_1105_14195_20522 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
HWI-M01323_171_000000000-AA4RL_1_1106_17146_10681 3 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
HWI-M01323_171_000000000-AA4RL_1_1106_6859_23273 2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0

taxonomy file:
HWI-M01323_171_000000000-AA4RL_1_1105_14195_20522 Bacteria(100);Proteobacteria(100);Alphaproteobacteria(100);Rhodospirillales(100);Acetobacteraceae(100);Acetobacter(100);
HWI-M01323_171_000000000-AA4RL_1_1106_17146_10681 Bacteria(100);Firmicutes(100);Erysipelotrichia(100);Erysipelotrichales(100);Erysipelotrichaceae(100);Sharpea(100);
HWI-M01323_171_000000000-AA4RL_1_1106_6859_23273 Bacteria(100);Bacteroidetes(100);Bacteroidia(100);Bacteroidales(100);Prevotellaceae(100);Prevotella(100);

Thanks

I suspect you’re running out of RAM. How many unique sequences do you have? What region are you sequencing? How do you have 7 taxonomy levels?
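Both numbers can be read straight from the files. Below is a minimal Python sketch (not part of mothur) that counts the unique sequences in a count_table and tallies how many taxonomy levels each sequence has. The function names are made up for illustration, and the parsing assumes whitespace-separated columns, an optional header row starting with Representative_Sequence, and optional comment lines starting with #.

```python
from collections import Counter


def count_unique_sequences(count_table_path):
    """Count data rows in a count_table (one row per unique sequence)."""
    n_unique = 0
    with open(count_table_path) as handle:
        for line in handle:
            fields = line.split()
            # Assumption: skip blank lines, '#' comment lines, and the header row.
            if not fields or fields[0].startswith("#"):
                continue
            if fields[0] == "Representative_Sequence":
                continue
            n_unique += 1
    return n_unique


def taxonomy_depths(taxonomy_path):
    """Return a Counter mapping number of taxonomy levels -> number of sequences."""
    depths = Counter()
    with open(taxonomy_path) as handle:
        for line in handle:
            fields = line.split(None, 1)  # sequence name, then the taxonomy string
            if len(fields) < 2:
                continue
            # Levels are ';'-separated; the trailing ';' leaves an empty token to drop.
            levels = [level for level in fields[1].strip().split(";") if level]
            depths[len(levels)] += 1
    return depths


if __name__ == "__main__":
    import sys

    # Usage: python check_inputs.py <count_table> <taxonomy>
    print("unique sequences:", count_unique_sequences(sys.argv[1]))
    print("levels per sequence:", dict(taxonomy_depths(sys.argv[2])))
```

If every sequence reports 6 levels, as in the taxonomy lines quoted above, then taxlevel=7 asks for a level the file does not contain.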

A few suggestions…
0. You need to use cutoff=0.20 if you want 0.03 data.
1. Try processors=2 or 4.
2. If you are using levels 5 or 6, you might try classic=T in cluster.split.
3. If you aren’t sequencing the V4 region, you probably want to read this: http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/
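
To see why the number of unique sequences matters for RAM, here is a rough back-of-the-envelope sketch in Python. It only counts the pairwise comparisons for N unique sequences, N*(N-1)/2, and multiplies by a made-up storage cost per distance. The 8 bytes per distance is an assumption for illustration, not a figure from mothur, and the result is an upper bound because only distances under the cutoff are kept and cluster.split works on one taxonomic group at a time.

```python
def pairwise_distances(n_unique):
    """Number of distinct sequence pairs among n_unique sequences."""
    return n_unique * (n_unique - 1) // 2


def matrix_size_gb(n_unique, bytes_per_distance=8):
    """Upper-bound size in GB if every pairwise distance were stored.

    bytes_per_distance is a hypothetical value chosen for illustration.
    """
    return pairwise_distances(n_unique) * bytes_per_distance / 1e9


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        print(n, pairwise_distances(n), round(matrix_size_gb(n), 1), "GB")
```

The count grows with the square of N, so ten times as many unique sequences means roughly a hundred times as many distances.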