Dear Mothur friends:
We are trying to analyze 16S data (Illumina V3+V4, 97 samples, 20 GB of data) in batch mode, using a batch file modified from the stability.batch shown in the MiSeq SOP, on an Ubuntu server with dual CPUs and 380 GB of RAM. We have already reduced the number of processors used in the analysis to 28, but the run still gets killed partway through (I am not sure exactly where, but probably in cluster.split). I will try reducing the number of processors or the taxlevel further to see how it goes. To save time, I would like to reuse the data generated by the commands that completed before the one that got killed, rather than rerun the batch file from the start. How should I modify the batch file to do so? Any suggestions for overcoming this obstacle would be highly appreciated. The batch file and the terminal output from the killed step are shown below.
Sincerely,
Jrhau
REFERENCE_LOCATION=/media/mpiu/a93b0b36-e288-45ef-b21f-acc26e4b0af9/Bacteria-16S-Ref
ALIGNREF=silva.full_v138.fasta
TAXONREF_FASTA=trainset9_032012.pds.fasta
TAXONREF_TAX=trainset9_032012.pds.tax
CONTAMINENTS=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota
LOGNAME=20201026-trial2
DATA=/media/mpiu/a93b0b36-e288-45ef-b21f-acc26e4b0af9/tooth-16S/20201026-trial2
TYPE=fastq
PROC=28
#batch commands
set.logfile(name=$LOGNAME)
make.file(inputdir=$DATA, type=$TYPE, prefix=stability)
make.contigs(file=current, processors=$PROC)
screen.seqs(fasta=current, group=current, maxambig=0, maxlength=500)
unique.seqs()
count.seqs(name=current, group=current)
align.seqs(fasta=current, reference=$REFERENCE_LOCATION/$ALIGNREF)
# screen.seqs(fasta=current, count=current, start=6000, end=26000, maxhomop=8)
screen.seqs(fasta=current, count=current, start=6388, end=25316, maxhomop=8)
filter.seqs(fasta=current, vertical=T, trump=.)
unique.seqs(fasta=current, count=current)
pre.cluster(fasta=current, count=current, diffs=2)
chimera.vsearch(fasta=current, count=current, dereplicate=t)
remove.seqs(fasta=current, accnos=current)
classify.seqs(fasta=current, count=current, reference=$REFERENCE_LOCATION/$TAXONREF_FASTA, taxonomy=$REFERENCE_LOCATION/$TAXONREF_TAX, cutoff=80)
remove.lineage(fasta=current, count=current, taxonomy=current, taxon=$CONTAMINENTS)
remove.groups(count=current, fasta=current, taxonomy=current, groups=Mock)
cluster.split(fasta=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=4, cutoff=0.15)
make.shared(list=current, count=current, label=0.03)
classify.otu(list=current, count=current, taxonomy=current, label=0.03)
phylotype(taxonomy=current)
make.shared(list=current, count=current, label=1)
classify.otu(list=current, count=current, taxonomy=current, label=1)
Clustering /media/mpiu/a93b0b36-e288-45ef-b21f-acc26e4b0af9/tooth-16S/20201026-trial2/stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.fasta.8.dist
tp tn fp fn sensitivity specificity ppv npv fdr accuracy mcc f1score
1.14933e+08 2.70999e+06 1.24432e+06 2.34844e+06 0.979976 0.685326 0.989289 0.535738 0.989289 0.970366 0.591017 0.984611
tp tn fp fn sensitivity specificity ppv npv fdr accuracy mcc f1score
1.52984e+08 5.97733e+08 1.02162e+07 3.72478e+07 0.804198 0.983196 0.937401 0.94134 0.937401 0.940535 0.831814 0.865705
Clustering /media/mpiu/a93b0b36-e288-45ef-b21f-acc26e4b0af9/tooth-16S/20201026-trial2/stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.fasta.9.dist
Clustering /media/mpiu/a93b0b36-e288-45ef-b21f-acc26e4b0af9/tooth-16S/20201026-trial2/stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.fasta.15.dist
Clustering /media/mpiu/a93b0b36-e288-45ef-b21f-acc26e4b0af9/tooth-16S/20201026-trial2/stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.fasta.17.dist
tp tn fp fn sensitivity specificity ppv npv fdr accuracy mcc f1score
2.65246e+08 1.57723e+08 8.99204e+06 1.41069e+08 0.652808 0.946063 0.967211 0.527869 0.967211 0.738127 0.544508 0.779501
Killed
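For anyone reading the metric rows in the output above, here is a minimal Python sketch that recomputes some of the columns from the four counts in a row. It assumes the columns follow the standard confusion-matrix definitions (sensitivity = tp / (tp + fn), and so on). That is my assumption and not something confirmed from the mothur source. The function name `cluster_metrics` is made up for illustration, and the counts are copied from the second metrics row in the log.

```python
import math


def cluster_metrics(tp, tn, fp, fn):
    """Recompute standard confusion-matrix statistics from the four counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    f1score = 2 * tp / (2 * tp + fp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "mcc": mcc,
        "f1score": f1score,
    }


# Counts copied from the second metrics row in the terminal output
# (values are rounded in the log, so results are approximate).
row = cluster_metrics(tp=1.52984e8, tn=5.97733e8, fp=1.02162e7, fn=3.72478e7)
for name, value in row.items():
    print(f"{name}\t{value:.6f}")
```

Because the log prints the counts to six significant figures, the recomputed values should only be expected to match the printed columns to about four decimal places.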