Dear Pat,
I am a new mothur user. I know you have answered this type of question before, but I really need some suggestions for a problem with my sequencing run. In summary, I sequenced the V3-V4 region of the 16S rRNA gene (soil DNA) on an Illumina MiSeq (paired-end reads).
Initially I had 6.8 million reads (6842673).
After the unique.seqs run I had 1917220 unique sequences.
Then I aligned my primer pair (338F and S4) with a 16S rRNA reference sequence, and this customised alignment was aligned back against silva.bacteria.fasta, which reduced the number of columns from 50,000 to 17011…
After the pre.cluster run, 776405 unique sequences remained.
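To put the counts above side by side, here is a small Python sketch (not part of mothur) that expresses each step as a percentage of the initial read count. The step labels and the helper name `retention` are only illustrative. Note that it compares unique sequences against total reads, so it shows how far the dataset was collapsed, not how many reads were discarded.

```python
# Counts reported in this post; the labels are only descriptive.
counts = [
    ("raw reads", 6842673),
    ("after unique.seqs", 1917220),
    ("after pre.cluster", 776405),
]


def retention(steps):
    """Return (label, count, percent of the first step's count) for each step."""
    start = steps[0][1]
    return [(label, n, 100.0 * n / start) for label, n in steps]


table = retention(counts)
for label, n, pct in table:
    print(f"{label:<20}{n:>10,}{pct:>8.1f}%")
```

Even after pre.cluster, the number of unique sequences is still roughly a tenth of the raw read count.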
Then chimera.uchime ran for nearly three days. I followed the MiSeq SOP for the rest of the analysis, except for assessing the error rate, since I did not sequence a mock community.
Then the cluster.split command ran for 8 days but ended in failure. The cluster.split command was as follows:
mothur > cluster.split(fasta=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta, count=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, taxonomy=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, splitmethod=classify, taxlevel=4, cutoff=0.15, processors=10)
The top part of the log file:
mothur > cluster.split(fasta=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta, count=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, taxonomy=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, splitmethod=classify, taxlevel=4, cutoff=0.15, processors=10)
Using 10 processors.
Using splitmethod fasta.
Splitting the file…
/******************************************/
Running command: dist.seqs(fasta=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.0.temp, processors=10, cutoff=0.155)
Using 10 processors.
/******************************************/
Output File Names:
Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.0.dist
It took 12521 seconds to calculate the distances for 19342 sequences.
/******************************************/
Running command: dist.seqs(fasta=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.1.temp, processors=10, cutoff=0.155)
Using 10 processors.
/******************************************/
Output File Names:
Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.1.dist
It took 561 seconds to calculate the distances for 23947 sequences.
/******************************************/
Running command: dist.seqs(fasta=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.2.temp, processors=10, cutoff=0.155)
Using 10 processors.
/******************************************/
Output File Names:
Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.2.dist
It took 143836 seconds to calculate the distances for 60467 sequences.
/******************************************/
Running command: dist.seqs(fasta=Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.3.temp, processors=10, cutoff=0.155)
Using 10 processors.
/******************************************/
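As a rough way to read the timings above, the following Python sketch (not part of mothur) pulls the "It took ... seconds" lines out of the log and computes how many pairwise comparisons each group implies, n(n-1)/2. This is only an upper bound on the work, and it does not model how mothur applies the cutoff or stores the distances; the function name `summarize` is made up for this example.

```python
import re

# Timing lines copied from the dist.seqs output above.
LOG = """\
It took 12521 seconds to calculate the distances for 19342 sequences.
It took 561 seconds to calculate the distances for 23947 sequences.
It took 143836 seconds to calculate the distances for 60467 sequences.
"""

PATTERN = re.compile(
    r"It took (\d+) seconds to calculate the distances for (\d+) sequences"
)


def summarize(log_text):
    """Return one dict per timing line: sequences, possible pairs, hours."""
    rows = []
    for seconds, n_seqs in PATTERN.findall(log_text):
        seconds, n_seqs = int(seconds), int(n_seqs)
        # Every sequence is compared with every other one at most once.
        pairs = n_seqs * (n_seqs - 1) // 2
        rows.append(
            {"sequences": n_seqs, "pairs": pairs, "hours": seconds / 3600}
        )
    return rows


rows = summarize(LOG)
for row in rows:
    print(
        f"{row['sequences']:>6} sequences  "
        f"{row['pairs']:>13,} pairs  {row['hours']:6.1f} h"
    )
```

The group with 60467 sequences alone implies about 1.8 billion possible pairs and took about 40 hours, so a single large group dominates the run time.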
Then it ended as follows:
Clustering Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.0.dist
********************#****#****#****#****#****#****#****#****#****#****#
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||
***********************************************************************
Clustering Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.14.dist
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||
Clustering Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.8.dist
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||
Clustering Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.10.dist
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||
Clustering Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.11.dist
Cutoff was 0.155 changed cutoff to 0.06
********************###########
Reading matrix: |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Clustering Dilhani.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta.2.dist
Cutoff was 0.155 changed cutoff to 0.06
Cutoff was 0.155 changed cutoff to 0.06
Cutoff was 0.155 changed cutoff to 0.06
Cutoff was 0.155 changed cutoff to 0.06
[ERROR]: Could not open 24778.temp
My question: I ran this on a server at my university. The error resulted from running out of memory, but 64 GB was available to this process, which should be more than enough. I know V3-V4 reads are not ideal, but I still have to analyse my data. Any suggestions you can give would be really appreciated.
Thank you.
Regards,
Dilhanide