pre.cluster problem

davevanh · September 23, 2014, 4:45pm

I’m getting the following error when I run the pre.cluster command:

Your name file contains 40766 valid sequences, and your groupfile contains 81532, please correct.

The command was executed as follows:

pre.cluster(fasta=MDVExhaustive_Mothur_Mod.unique.good.filter.unique.fasta, name=MDVExhaustive_Mothur_Mod.unique.good.filter.unique.accnos, group=GroupFile_Unique.group, diffs=2, processors=8)

I created my name file by using the list.seqs command with my fasta:

list.seqs(fasta=MDVExhaustive_Mothur_Mod.unique.good.filter.unique.fasta)

Output File Names:
MDVExhaustive_Mothur_Mod.unique.good.filter.unique.accnos

I then used R to create my group file from my name file.

My fasta, name, and group files all have the same number of sequences as verified using grep:

grep -o '’ MDVExhaustive_Mothur_Mod.unique.good.filter.unique.accnos | wc -l
(81,532)

grep -o '_' GroupFile_Unique.group | wc -l
(81,532)

grep -o '>' MDVExhaustive_Mothur_Mod.unique.good.filter.unique.fasta | wc -l
(81,532)

The output lists all of the missing names; however, when I check my names file, those names are actually present. The names it reports as missing are distributed throughout my names file; they are not in a single cluster. Any help would be greatly appreciated!

Dave
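Matching counts do not by themselves show that the same names appear in every file. A minimal Python sketch of a name-level comparison follows. It assumes the sequence name is the first whitespace-delimited token after `>` in the fasta and the first field on each line of the other file. The helper names are hypothetical and are not part of mothur.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA header lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip a bare ">" with no name
                names.add(fields[0])
    return names


def first_column(lines):
    """Collect the first whitespace-delimited field of each non-empty line."""
    return {line.split()[0] for line in lines if line.strip()}


def compare_names(fasta_lines, table_lines):
    """Return (names only in the fasta, names only in the other file)."""
    in_fasta = fasta_names(fasta_lines)
    in_table = first_column(table_lines)
    return sorted(in_fasta - in_table), sorted(in_table - in_fasta)


# Small in-memory example; real use would pass open file handles instead.
fasta = [">seq1 some description\n", "ACGT\n", ">seq2\n", "ACGT\n"]
groups = ["seq1\tA\n", "seq3\tB\n"]
only_fasta, only_groups = compare_names(fasta, groups)
print(only_fasta, only_groups)
```

Two empty lists would mean both files contain exactly the same set of names.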

pschloss · September 25, 2014, 5:33pm

Dave-

You appear to be giving pre.cluster your accnos file, not a names file. Can you try again?

pat
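For context, an accnos file lists one sequence name per line, whereas a mothur names file has two tab-separated columns: a representative sequence name, then a comma-separated list of every sequence it represents. A rough Python sketch for telling the two apart, under that assumption, is below. The function name and the returned labels are made up for illustration.

```python
def describe_table(lines):
    """Guess whether lines look like an accnos file or a names file.

    Returns (kind, rows, represented), where `represented` is the total
    number of names listed in the second column of a names-like file.
    """
    column_counts = set()
    rows = 0
    represented = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        fields = line.rstrip("\n").split("\t")
        rows += 1
        column_counts.add(len(fields))
        if len(fields) >= 2:
            represented += len([n for n in fields[1].split(",") if n])
    if column_counts == {1}:
        kind = "accnos"
    elif column_counts == {2}:
        kind = "names"
    else:
        kind = "unknown"
    return kind, rows, represented


print(describe_table(["seqA\tseqA,seqB\n", "seqC\tseqC\n"]))
print(describe_table(["seqA\n", "seqB\n"]))
```

If the file passed as `name=` comes back as "accnos", it is not in the format pre.cluster expects for a names file.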

guangbin1980 · October 13, 2014, 2:02am

I ran into the same problem with the pre.cluster command. The output is shown below:
"Processing group MID1.archaea:
Error: diffs is greater than your sequence length.

[ERROR]: Your name file contains 0 valid sequences, and your groupfile contains 423, please correct.

[ERROR]: Your name file contains 0 valid sequences, and your groupfile contains 564, please correct.

[ERROR]: Your name file contains 0 valid sequences, and your groupfile contains 564, please correct.
[ERROR]: process 0 only processed 1 of 3 groups assigned to it, quitting.
[ERROR]: process 1 only processed 1 of 3 groups assigned to it, quitting.
[ERROR]: process 2 only processed 1 of 3 groups assigned to it, quitting. "
The command I used was “pre.cluster(fasta=njsys.shhh.trim.unique.good.filter.unique.fasta, name=njsys.shhh.trim.unique.good.filter.names, group=njsys.shhh.good.groups, diffs=2)”, which is the same as in the SOP, but my sample contains both bacterial and archaeal sequences.
We used a mixed reference file that includes silva.bacteria.fasta and silva.archaea.fasta.

pschloss · October 20, 2014, 9:26pm

"Processing group MID1.archaea:
Error: diffs is greater than your sequence length.

That’s your problem. I suspect this comes from your screen.seqs command. If you could start a new thread and post summary.seqs output before and after screen.seqs as well as what you are doing for screen.seqs that would be great.

Pat
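As a quick check alongside summary.seqs, the sketch below flags sequences in a fasta file that have fewer bases than `diffs`. How mothur measures length for this error is not stated in the thread, so counting characters other than `-` and `.` is an assumption here, and the function name is hypothetical.

```python
def short_sequences(fasta_lines, diffs):
    """Return {name: ungapped_length} for sequences shorter than `diffs`."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            # Assumption: '-' and '.' are alignment gap/padding characters.
            lengths[name] += sum(1 for ch in line if ch not in "-.")
    return {n: length for n, length in lengths.items() if length < diffs}


example = [">a\n", "--A-\n", ">b\n", "AC-GT\n"]
print(short_sequences(example, diffs=2))
```

An unexpectedly large result on the filtered alignment would point back to the screening and filtering steps, which is where Pat suggests looking.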
