Hello,
I’m working with 18S sequences from Illumina sequencing; only the R1 reads are of good quality. I used the fastq.info command to extract the fasta sequences and then ran all the commands in the mothur tutorial except make.file and make.contigs.
When I reach the pre.cluster step, sequences are removed, but the output fasta file is empty.
I increased the RAM to 150 GB and tried setting the allowed nucleotide differences (diffs) to 2, 3, and then 4. I also varied the number of processors between 10 and 64.
I’ve also run the chimera.vsearch command, but it has been running for about 10 hours.
Could you please help me?
Thanks
mothur > summary.seqs(fasta=current, count=current)
Using 18S_R1.good.unique.good.filter.count_table as input file for the count parameter.
Using 18S_R1.good.unique.good.filter.unique.fasta as input file for the fasta parameter.
Using 64 processors.
             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      737  196     0       3        1
2.5%-tile:   1      828  264     0       4        66656
25%-tile:    1      828  274     0       4        666558
Median:      1      828  275     0       5        1333115
75%-tile:    1      828  279     0       5        1999672
97.5%-tile:  1      828  280     0       6        2599574
Maximum:     3      828  282     0       8        2666229
Mean:        1      827  275     0       4
# of unique seqs: 1153321
total # of seqs: 2666229
It took 75 secs to summarize 2666229 sequences.
Output File Names:
18S_R1.good.unique.good.filter.unique.summary
mothur > pre.cluster(fasta=18S_R1.good.unique.good.filter.unique.fasta, count=18S_R1.good.unique.good.filter.count_table, diffs= 3)
Using 10 processors.
When using running without group information mothur can only use 1 processor, continuing.
0 1026717 126604
1000 712377 440944
Total number of sequences before precluster was 1153321.
pre.cluster removed 484995 sequences.
/******************************************/
[WARNING]: 18S_R1.good.unique.good.filter.unique.fasta does not contain any sequence from the .accnos file.
Selected 0 sequences from 18S_R1.good.unique.good.filter.unique.fasta.
Output File Names:
18S_R1.good.unique.good.filter.unique.precluster.fasta
/******************************************/
Done.
It took 1455 secs to cluster 1153321 sequences.
Using 10 processors.
Output File Names:
18S_R1.good.unique.good.filter.unique.precluster.fasta
18S_R1.good.unique.good.filter.unique.precluster.count_table
18S_R1.good.unique.good.filter.unique.precluster.map
vsearch v2.15.2_linux_x86_64, 1007.8GB RAM, 64 cores
https://github.com/torognes/vsearch
Fatal error: Unable to read from file (18S_R1.good.unique.good.filter.unique.precluster.temp)
mothur > chimera.vsearch(fasta=current, count=current, dereplicate=t)
Using 18S_R1.good.unique.good.filter.unique.precluster.count_table as input file for the count parameter.
Using 18S_R1.good.unique.good.filter.unique.precluster.fasta as input file for the fasta parameter.
[ERROR]: 18S_R1.good.unique.good.filter.unique.precluster.fasta is blank, aborting.
Using 18S_R1.good.unique.good.filter.unique.precluster.fasta as input file for the fasta parameter.
Using 10 processors.
Unable to open vsearch. Trying mothur's executable directory vsearch.
Unable to open vsearch.
vsearch file does not exist. Checking path...
Found vsearch in your path, using /beegfs/data/hgbaguidi/miniconda3/envs/mothur148/bin//vsearch
Using vsearch version v2.15.2.
Checking sequences from 18S_R1.good.unique.good.filter.unique.precluster.fasta ...
When using template=self, mothur can only use 1 processor, continuing.
[ERROR]: 18S_R1.good.unique.good.filter.unique.precluster.fasta is blank. Please correct.
It took 2 secs to check your sequences. 0 chimeras were found.
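
In case it helps with the diagnosis, here is a minimal Python sketch (not part of mothur) for checking whether the sequence names in a fasta file and a count_table agree, since the warning above says the fasta file "does not contain any sequence from the .accnos file". The function names are hypothetical. It assumes the count_table is tab-separated, with the sequence name in the first column, a header row starting with `Representative_Sequence`, and optional comment lines starting with `#`. From the log, pre.cluster should have left 1153321 - 484995 = 668326 unique sequences, so a count of 0 in the precluster fasta file points to a problem when that file was written.

```python
import sys


def fasta_names(path):
    """Return the sequence names in a FASTA file (text after '>' up to the first whitespace)."""
    names = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.append(parts[0])
    return names


def count_table_names(path):
    """Return the first-column names of a count_table.

    Assumption: tab-separated file, header row starting with
    'Representative_Sequence', comment lines starting with '#'.
    """
    names = []
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            first = line.split("\t")[0].strip()
            if first == "Representative_Sequence":
                continue
            names.append(first)
    return names


def compare(fasta_path, count_path):
    """Count the names in each file and the names present in both."""
    fasta = set(fasta_names(fasta_path))
    table = set(count_table_names(count_path))
    return {
        "fasta": len(fasta),
        "count_table": len(table),
        "shared": len(fasta & table),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Numbers taken from the pre.cluster log above.
    expected = 1153321 - 484995  # 668326 unique sequences expected after pre.cluster
    result = compare(sys.argv[1], sys.argv[2])
    print("expected after pre.cluster:", expected)
    print(result)
```

It would be run with the fasta file as the first argument and the count_table as the second. If `shared` is 0 for the input files given to pre.cluster, the names in the two files do not match; if it is 0 only for the precluster output files, the input files are consistent and the output fasta file was written empty.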