Segmentation Fault in Cluster

bbowman · March 7, 2014, 1:01am

I’m trying to improve my pipeline for analyzing full-length 16S sequences from PacBio CCS, but after some recent changes I’ve started to get segmentation faults at the clustering step and I’m not sure why…

"""
mothur > align.seqs(fasta=Chun_0010.filter.fasta, flip=t, reference=/home/UNIXHOME/bbowman/data/references/16S/silva.both.align, processors=16)
mothur > summary.seqs(fasta=Chun_0010.filter.align)
mothur > screen.seqs(start=5251, fasta=Chun_0010.filter.align, end=38908, processors=16)
mothur > filter.seqs(fasta=Chun_0010.filter.good.align, vertical=T, processors=16)
mothur > unique.seqs(fasta=Chun_0010.filter.good.filter.fasta)
mothur > pre.cluster(diffs=4, fasta=Chun_0010.filter.good.filter.unique.fasta, name=Chun_0010.filter.good.filter.names)
mothur > chimera.uchime(fasta=Chun_0010.filter.good.filter.unique.precluster.fasta, processors=16, reference=/home/UNIXHOME/bbowman/data/references/16S/silva.gold.align)
mothur > remove.seqs(fasta=Chun_0010.filter.good.filter.unique.precluster.fasta, accnos=Chun_0010.filter.good.filter.unique.precluster.uchime.accnos)
mothur > dist.seqs(output=lt, fasta=Chun_0010.filter.good.filter.unique.precluster.pick.fasta, calc=onegap, processors=16, countends=F)
mothur > cluster(phylip=Chun_0010.filter.good.filter.unique.precluster.pick.phylip.dist, name=Chun_0010.filter.good.filter.unique.precluster.names, method=average)
"""

The process consistently reaches the clustering step, but then segfaults while reading in the matrix. I’ve repeated this with both the version of mothur I had installed by default (v1.30) and the most recent stable build (v1.33.2), and observed the same behavior.

I can send the raw data file and my logfiles if needed (the fasta is ~3 MB zipped).

-Brett

pschloss · March 7, 2014, 7:32pm

How big is the distance matrix? This generally happens when people run out of RAM, which can happen with a lot of high error rate data (cough pacbio cough). You might try the cluster.split approach.
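
As a rough way to put a number on the size question, the sketch below estimates how many bytes are needed just to hold the pairwise distances for a given number of sequences. It is only back-of-the-envelope arithmetic: the 4 bytes per distance is an assumption, the function names are made up for illustration, and mothur's actual in-memory representation during clustering may differ from this.

```python
def lower_triangle_bytes(n_seqs: int, bytes_per_distance: int = 4) -> int:
    """Bytes needed to hold the n*(n-1)/2 pairwise distances of n sequences.

    bytes_per_distance=4 is an assumption (single-precision float); it is not
    taken from mothur's source code.
    """
    if n_seqs < 0:
        raise ValueError("n_seqs must be non-negative")
    return n_seqs * (n_seqs - 1) // 2 * bytes_per_distance


def count_sequences(phylip_path: str) -> int:
    """Read the sequence count from the first line of a phylip distance file."""
    with open(phylip_path) as handle:
        return int(handle.readline().split()[0])


if __name__ == "__main__":
    # Hypothetical sequence counts, only to show how quickly the total grows.
    for n in (1_000, 10_000, 100_000):
        gib = lower_triangle_bytes(n) / 2**30
        print(f"{n:>7} sequences: about {gib:.2f} GiB of distances")
```

The point of the estimate is that the requirement grows with the square of the number of unique sequences, so noisy reads that fail to collapse in unique.seqs and pre.cluster inflate it quickly.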

bbowman · March 8, 2014, 2:44am

My biggest distance matrix is ~500MB, but I’m seeing this intermittently in matrices <100MB in size as well, so I suspect the cause is something other than the size.
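
If size is not the cause, one thing that can be checked outside of mothur is whether the lower-triangle distance file is well formed and whether its sequence names agree with the names file. In the commands posted above, remove.seqs is given only the fasta and accnos files, while cluster is given the names file written by pre.cluster, so the two may not list the same sequences. Whether that has anything to do with the crash is not established in this thread. The sketch below is a minimal check with hypothetical function names; it assumes a phylip lower-triangle file (sequence count on the first line, then one row per sequence holding its name followed by its distances to all earlier sequences) and a two-column names file.

```python
def read_phylip_lt(path):
    """Return the sequence names of a lower-triangle phylip distance file.

    Raises ValueError if a row has the wrong number of distances or if the
    row count disagrees with the count on the first line.
    """
    names = []
    with open(path) as handle:
        expected = int(handle.readline().split()[0])
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # tolerate blank lines
            n_distances = len(fields) - 1
            if n_distances != len(names):
                raise ValueError(
                    f"row {len(names) + 1} ({fields[0]}) has {n_distances} "
                    f"distances, expected {len(names)}"
                )
            names.append(fields[0])
    if len(names) != expected:
        raise ValueError(f"header says {expected} sequences, found {len(names)}")
    return names


def read_name_representatives(path):
    """Return the first-column (representative) names of a names file."""
    representatives = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                representatives.add(fields[0])
    return representatives


def compare_names(dist_path, names_path):
    """Return (names only in the matrix, names only in the names file)."""
    in_matrix = set(read_phylip_lt(dist_path))
    in_names = read_name_representatives(names_path)
    return in_matrix - in_names, in_names - in_matrix
```

Calling compare_names on the .phylip.dist and .names files passed to cluster returns two sets; both are empty when the files list exactly the same representative sequences.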

Saxphile · April 23, 2014, 10:23pm

I’m seeing the same thing with a 15 MB dist file. Nearest neighbor fails 100% of the time, and average neighbor fails some of the time. Mothur 1.32 clusters the same dist file without any problem, so it must be a bug.

pschloss · April 24, 2014, 10:15am

If you guys could compress the distance matrix and name/count file and post them on Google Drive or email them to us, we can take a look. Getting the exact command you are running would be helpful too.

Saxphile · May 2, 2014, 4:21am

Email sent.
