I'm running the MiSeq Mothur protocol (iMac, macOS Sierra, Mothur 1.38, oodles of disk space) on a set of ~80 sputum samples from patients with cystic fibrosis. Everything was going great until I hit the cluster command, which gives me this error message:
[ERROR]: HWI-HWI-M04771_47_000000000-AUJLE_1_1105_19621_21354 is not in your count table. Please correct.
I then tried the cluster.split command and had the exact same error message. Both commands appear to execute properly until the error message hits. The dist file is about 207.6 GB; the count table is about 19 MB.
Let me add: the sequence noted in the error message indeed is not in the count table. Would it be as simple as adding it, along with values of ‘0’ for each of the samples?
Where there is an identifier for the sequence followed by the total count (here, 1) and the count for each sample.
The error message I received was:
[ERROR]: HWI-HWI-M04771_47_000000000-AUJLE_1_1105_19621_21354 is not in your count table. Please correct.
Notice the doubled “HWI-HWI”? I checked the count table and there are no entries with “HWI-HWI”. So now I’m wondering if the issue isn’t a missing sequence but a bug somewhere.
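A quick way to check this outside of Mothur is a small Python sketch like the one below. It is only a sketch under stated assumptions: that the count table is plain tab-delimited text with one header row and the sequence name in the first column, and that the function names (`read_count_table_names`, `collapse_doubled_prefix`, `diagnose`) are made up for illustration and are not part of Mothur. It reports whether a name from the error message is in the count table as-is, or only after a repeated leading prefix such as "HWI-HWI-" is collapsed to "HWI-".

```python
import csv
import re


def read_count_table_names(path):
    """Return the set of sequence names found in the first column.

    Assumes a plain tab-delimited count table with a single header row.
    """
    names = set()
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        for row in reader:
            if row:
                names.add(row[0])
    return names


def collapse_doubled_prefix(name):
    """Collapse a repeated leading prefix, e.g. 'HWI-HWI-M0' becomes 'HWI-M0'.

    Names without a repeated prefix are returned unchanged.
    """
    return re.sub(r"^([^-_]+-)\1", r"\1", name)


def diagnose(name, names):
    """Say whether `name` is present, present only once the prefix is fixed, or missing."""
    if name in names:
        return "present"
    fixed = collapse_doubled_prefix(name)
    if fixed != name and fixed in names:
        return "present without doubled prefix: " + fixed
    return "missing"
```

For example, `diagnose("HWI-HWI-M04771_47_000000000-AUJLE_1_1105_19621_21354", read_count_table_names("my.count_table"))`, where `my.count_table` is a placeholder file name. If the answer is "present without doubled prefix", the names were probably altered somewhere upstream of the dist file, and the sequence is not actually missing.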
One quick further question: how long does it take for cluster.split to work with very large dist files? It generated the various smaller dist and temp files fairly quickly but now has been sitting for quite a while with no (obvious) activity. How long do I give it?
Unfortunately, you can’t know that until you’ve done it. Generally, when clustering >100 samples on 4 processors (my server has 512 GB of RAM), it takes between 12 hours and 3 days for the whole SOP to run.
Hi, I have the same problem. I am running cluster.split after following the exact protocol from the MiSeq SOP; the only difference is that I’m using 18S. This is the command:
So cluster.split took just under 2 days (2 processors, 24 GB) for me. Perhaps in the future the command could have some sort of progress indicator to let the user know that it’s working and that Mothur hasn’t crashed. Just a thought.