mysterious seqID from cluster.split

k9power · April 30, 2013, 4:24am

Hello,

cluster.split gave an error message after splitting the distance file:
“Error: Sequence ‘05’ was not found in the names file, please correct”

Here are the commands I’ve executed:

unique.seqs(fasta=seqAll.fasta)
align.seqs(fasta=seqAll.unique.fasta, reference=silva.bacteria.fasta, flip=t)
filter.seqs(fasta=seqAll.unique.align, vertical=T)
chimera.uchime(fasta=seqAll.unique.filter.fasta, name=seqAll.names, processors=2)
remove.seqs(fasta=seqAll.unique.filter.fasta, group=groupAll.groups, accnos=seqAll.unique.filter.uchime.accnos, name=seqAll.names)
dist.seqs(fasta=seqAll.unique.filter.pick.fasta, cutoff=0.1, processors=3)
cluster.split(column=seqAll.unique.filter.pick.dist, name=seqAll.pick.names, large=T, method=average, cutoff=0.05)

Output I saw when running cluster.split:

Using 1 processors.
Using splitmethod distance.
Splitting the file...
It took 62636 seconds to split the distance file.

Reading seqAll.unique.filter.pick.dist.2.temp
********************#****#****#****#****#****#****#****#****#****#****#
Reading matrix:     ||||||AAError: Sequence '05' was not found in the names file, please correct

I checked, and there is no sequence ‘05’ in the distance file. All my sequence IDs start with “gnl”.
I just emailed seqAll.unique.filter.pick.dist.2.temp and seqAll.pick.names.2.temp to mothur.bugs.

What could be the problem?
Thanks.

pschloss · April 30, 2013, 11:40am

I suspect you have a stray space or tab character somewhere in your files. Did you open and modify any of these text files?
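
One way to check for stray whitespace without opening the files in an editor is a short script. This is only a sketch, not part of mothur: it assumes the names file has the usual two tab-separated columns (a representative name, then a comma-separated list of names), and the function name find_whitespace_problems is made up for illustration.

```python
def find_whitespace_problems(lines):
    """Return (line_number, reason) pairs for lines that break the
    assumed layout: two tab-separated columns containing no spaces."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n").rstrip("\r")
        if not line:
            continue  # ignore completely empty lines
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                (number, f"expected 2 tab-separated columns, found {len(fields)}")
            )
            continue
        if any(field != field.strip() or " " in field for field in fields):
            problems.append((number, "space inside or around a column"))
        elif any(name == "" for name in [fields[0]] + fields[1].split(",")):
            problems.append((number, "empty sequence name"))
    return problems
```

It accepts any iterable of lines, so an open file handle for seqAll.pick.names can be passed in directly. An empty result only means that this particular layout check passed, not that the file is otherwise valid.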

k9power · April 30, 2013, 4:10pm

Nope, I didn’t touch any of the generated files.

westcott · May 1, 2013, 12:53pm

It appears that an error occurred while the split distance files were being written out. I am not sure what caused it, but I can see the malformed line (the third one below) in the distance file you sent:

gnl|SRA|SRR050533.2965.4 gnl|SRA|SRR050467.6375.4 0.02417
gnl|SRA|SRR050533.2965.4 gnl|SRA|SRR050478.5215.4 0.05405
05 |SRA|SRR050511.1115.4 0.05405
…

Can you try cluster.split(large=f, …)? Or try splitting by taxonomy? I may be able to track down the splitting issue if you send me the full distance and names files. I can’t reproduce it with our test data.
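
A truncated line like the third one above can also be located before mothur reaches it. The following is a rough sketch and not mothur's own logic: it assumes a column-format distance file (two names and a distance, whitespace-separated) and assumes that every name in it should appear in the first column of the names file. The helper names load_names and find_bad_distance_lines are hypothetical.

```python
def load_names(names_lines):
    """Collect the first-column (representative) names of a names file."""
    return {line.split("\t")[0] for line in names_lines if line.strip()}


def find_bad_distance_lines(dist_lines, known_names):
    """Return (line_number, reason, text) for suspicious distance lines."""
    bad = []
    for number, raw in enumerate(dist_lines, start=1):
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            bad.append((number, "expected 3 columns", raw.rstrip()))
            continue
        first, second, distance = fields
        unknown = [name for name in (first, second) if name not in known_names]
        if unknown:
            bad.append((number, "unknown name(s): " + ", ".join(unknown), raw.rstrip()))
            continue
        try:
            float(distance)
        except ValueError:
            bad.append((number, "distance is not a number", raw.rstrip()))
    return bad
```

Both functions take any iterable of lines, so they can be run on open handles for seqAll.pick.names.2.temp and seqAll.unique.filter.pick.dist.2.temp. Reading the distance file line by line keeps memory use low even for a very large file.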

k9power · May 2, 2013, 1:52pm

Thanks, Sarah.
I have sent the files to you.

westcott · May 3, 2013, 6:50pm

Thanks for sending your files. I have them running now, and will let you know what I find.

westcott · May 6, 2013, 12:10pm

I was not able to reproduce the issue. Were you able to run the cluster.split command by taxonomy or with large=f?
