pre.cluster command not working

nd1091 · January 21, 2015, 4:38pm

Hello,

I am using qual and fasta files as raw input to analyze data from an Ion Torrent PGM. I am having trouble running the pre.cluster command. The log file says:

mothur > pre.cluster(fasta=nihar.trim.unique.good.filter.unique.fasta, name=nihar.trim.unique.good.filter.names, group=nihar.good.groups, diffs=2)

Using 2 processors.

[ERROR]: Your name file contains 1435574 valid sequences, and your groupfile contains 3317132, please correct.
[ERROR]: process 0 only processed 1 of 59 groups assigned to it, quitting.

/******************************************/
Running command: unique.seqs(fasta=nihar.trim.unique.good.filter.unique.precluster.fasta, name=nihar.trim.unique.good.filter.unique.precluster.names)
[ERROR]: nihar.trim.unique.good.filter.unique.precluster.fasta is blank, aborting.
Using nihar.trim.unique.good.filter.unique.fasta as input file for the fasta parameter.
[ERROR]: nihar.trim.unique.good.filter.unique.precluster.names is blank, aborting.
/******************************************/

If I run this step without a group file, it works fine:

mothur > pre.cluster(fasta=nihar.trim.unique.good.filter.unique.fasta, name=nihar.trim.unique.good.filter.names, diffs=2)

Using 1 processors.
1055535 534969 520566
Total number of sequences before precluster was 1055535.
pre.cluster removed 520566 sequences.

It took 95640 secs to cluster 1055535 sequences.

Output File Names:
nihar.trim.unique.good.filter.unique.precluster.fasta
nihar.trim.unique.good.filter.unique.precluster.names
nihar.trim.unique.good.filter.unique.precluster.map

But now I have a problem, because make.shared needs a group file:

mothur > make.shared(list=final.an.list, group=final.groups, label=0.03)
Unable to open final.groups
You need to provide a groupfile or countfile if you are going to use the list format.
[ERROR]: did not complete make.shared.

mothur > make.shared(list=final.an.lis, label=0.03)
Unable to open final.an.lis
Using final.an.list as input file for the list parameter.
You need to provide a groupfile or countfile if you are going to use the list format.
[ERROR]: did not complete make.shared.

I do not know what to do in this situation. It would be very helpful if anyone has an idea how to solve this problem.
Thanks,
Nihar

westcott · January 22, 2015, 8:12pm

It looks like you may have forgotten to include the group file in one of the commands before pre.cluster, so the group file contains extra sequences that are no longer in your names file. No worries, mothur can help :). You can use the list.seqs (http://www.mothur.org/wiki/List.seqs) and get.seqs (http://www.mothur.org/wiki/Get.seqs) commands to select the sequences you want.

mothur > list.seqs(name=nihar.trim.unique.good.filter.names) - lists all the sequences in your names file
mothur > get.seqs(accnos=current, group=nihar.good.groups) - selects those sequences from the group file
mothur > pre.cluster(fasta=nihar.trim.unique.good.filter.unique.fasta, name=current, group=current, diffs=2)
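To see the mismatch outside mothur, here is a minimal Python sketch of the same idea: collect every sequence name in the names file, then keep only the group-file rows for those names. It is not part of mothur. It assumes the names file has two whitespace-separated columns (a representative name, then a comma-separated list of all names it stands for) and the group file has two columns (sequence name, group). The function names read_names and filter_groups and the sample data are made up for illustration.

```python
def read_names(lines):
    """Collect every sequence name listed in a names file (assumed two-column format)."""
    names = set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        names.add(parts[0])                 # representative sequence
        names.update(parts[1].split(","))   # all sequences it represents
    return names


def filter_groups(group_lines, keep):
    """Return (name, group) pairs for group-file rows whose name is in `keep`."""
    kept = []
    for line in group_lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] in keep:
            kept.append((parts[0], parts[1]))
    return kept


if __name__ == "__main__":
    # Tiny made-up example: the group file has one sequence (seqD)
    # that no longer appears in the names file.
    names_file = ["seqA\tseqA,seqB", "seqC\tseqC"]
    group_file = ["seqA\tsample1", "seqB\tsample1",
                  "seqC\tsample2", "seqD\tsample2"]

    keep = read_names(names_file)
    kept = filter_groups(group_file, keep)
    print(len(keep), "names;", len(group_file), "group rows;", len(kept), "rows kept")
```

If the number of names and the number of group rows differ, as in the error message above, the group file needs to be trimmed before pre.cluster will accept it. For real data, the list.seqs and get.seqs commands above are the way to do it.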

nd1091 · January 27, 2015, 3:47pm

Thanks, it worked fine. I followed the 454 SOP. Since I did not have a mock community, I skipped the “Error Analysis” part. I am now at the dist.seqs step, and it has been running for more than 3 days. It is currently writing two output files, final.dist and final.dist0.temp, and each is already over 70 GB. The SOP says that if this file is larger than 100 GB, something is wrong. Last time I completed this step with a 50 GB final.dist output file. Should I stop now? I can’t see where it went wrong.

westcott · January 28, 2015, 2:38pm

You may be interested in this, http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix%3F/.
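As a rough illustration of why the file grows so quickly: the number of pairwise distances scales with the square of the number of unique sequences. The sketch below only counts pairs. The sequence counts are hypothetical, not taken from this dataset, and the count is an upper bound, since a distance cutoff in dist.seqs would reduce how many distances are actually written.

```python
def pairwise_count(n_unique):
    """Number of distinct sequence pairs among n_unique sequences: n * (n - 1) / 2."""
    return n_unique * (n_unique - 1) // 2


if __name__ == "__main__":
    # Hypothetical numbers of unique sequences, for illustration only.
    for n in (10_000, 100_000, 500_000):
        print(f"{n:>8} unique sequences -> {pairwise_count(n):,} possible pairs")
```

Halving the number of unique sequences cuts the number of pairs to roughly a quarter, which is why reducing unique sequences before dist.seqs matters so much.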
