Hi everybody,
I am working with mothur 1.48.0. When I run the pre.cluster command, all my sequences are removed.
My script:
make.contigs(file=fastq.files.table, oligos=miseq.LBE.16S.515_928.oligos.table, pdiffs=0, maxambig=0, maxhomop=8, processors=20) ## 220620 rename=T no function mothur 148 , moved up maxhomop=8 from screen.seqs below for consistency
rename.seqs(fasta=current)## 220620 removed group=current,
summary.seqs(fasta=current)
unique.seqs(fasta=current)
count.seqs(count=current)
summary.seqs(count=current)
align.seqs(fasta=current, reference=generic.LBE.16S.515_928.database.align)
summary.seqs(fasta=current, count=current)
screen.seqs(fasta=current, count=current, summary=current, start=180, end=13978)
summary.seqs(fasta=current, count=current)
filter.seqs(fasta=current, vertical=T, trump=.)
unique.seqs(fasta=current, count=current)
pre.cluster(fasta=fastq.files.trim.contigs.renamed.unique.good.filter.unique.fasta, count=fastq.files.trim.contigs.renamed.unique.good.filter.count_table, diffs=4, processors=1)
Logfile:
mothur > make.contigs(file=fastq.files.table, oligos=miseq.LBE.16S.515_928.oligos.table, pdiffs=0, maxambig=0, maxhomop=8, processors=20) ## 220620 rename=T no function mothur 148 , moved up maxhomop=8 from screen.seqs below for consistency
Using 20 processors.
>>>>> Processing file pair ABwet_EN_1_24_05_OJ_ErT_GG_V4&GGACGG&L001&R1.fastq - ABwet_EN_1_24_05_OJ_ErT_GG_V4&GGACGG&L001&R2.fastq (files 1 of 2) <<<<<
[WARNING]: your oligos file does not contain any group names. mothur will not create a groupfile.
Making contigs...
Done.
It took 4 secs to assemble 50815 reads.
>>>>> Processing file pair ABwet_En2_24_05_OJ_ErT_GG_V4&AGAGGG&L001&R1.fastq - ABwet_En2_24_05_OJ_ErT_GG_V4&AGAGGG&L001&R2.fastq (files 2 of 2) <<<<<
[WARNING]: your oligos file does not contain any group names. mothur will not create a groupfile.
Making contigs...
Done.
It took 4 secs to assemble 44531 reads.
Group count:
ABwet_EN_1_24_05_OJ_ErT_GG_V4 39119
ABwet_En2_24_05_OJ_ErT_GG_V4 33549
Total of all groups is 72668
It took 8 secs to process 95346 sequences.
Output File Names:
fastq.files.trim.contigs.fasta
fastq.files.scrap.contigs.fasta
fastq.files.contigs_report
fastq.files.contigs.count_table
mothur > rename.seqs(fasta=current)## 220620 removed group=current,
Using fastq.files.trim.contigs.fasta as input file for the fasta parameter.
Output File Names:
fastq.files.trim.contigs.renamed.fasta
fastq.files.trim.contigs.renamed_map
mothur > summary.seqs(fasta=current)
Using fastq.files.trim.contigs.renamed.fasta as input file for the fasta parameter.
Using 20 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 308 308 0 3 1
2.5%-tile: 1 373 373 0 4 1817
25%-tile: 1 375 375 0 4 18168
Median: 1 376 376 0 5 36335
75%-tile: 1 377 377 0 5 54502
97.5%-tile: 1 377 377 0 6 70852
Maximum: 1 461 461 0 8 72668
Mean: 1 375 375 0 4
# of Seqs: 72668
It took 1 secs to summarize 72668 sequences.
Output File Names:
fastq.files.trim.contigs.renamed.summary
mothur > unique.seqs(fasta=current)
Using fastq.files.trim.contigs.renamed.fasta as input file for the fasta parameter.
72668 25475
Output File Names:
fastq.files.trim.contigs.renamed.unique.fasta
fastq.files.trim.contigs.renamed.count_table
mothur > count.seqs(count=current) ##220620 removed name=current
Using fastq.files.trim.contigs.renamed.count_table as input file for the count parameter.
Output File Names:
fastq.files.trim.contigs.renamed.sparse.count_table
mothur > summary.seqs(count=current)
Using fastq.files.trim.contigs.renamed.sparse.count_table as input file for the count parameter.
Using fastq.files.trim.contigs.renamed.unique.fasta as input file for the fasta parameter.
Using 20 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 308 308 0 3 1
2.5%-tile: 1 373 373 0 4 1817
25%-tile: 1 375 375 0 4 18168
Median: 1 376 376 0 5 36335
75%-tile: 1 377 377 0 5 54502
97.5%-tile: 1 377 377 0 6 70852
Maximum: 1 461 461 0 8 72668
Mean: 1 375 375 0 4
# of unique seqs: 25475
total # of seqs: 72668
It took 0 secs to summarize 72668 sequences.
Output File Names:
fastq.files.trim.contigs.renamed.unique.summary
mothur > align.seqs(fasta=current, reference=generic.LBE.16S.515_928.database.align)
Using fastq.files.trim.contigs.renamed.unique.fasta as input file for the fasta parameter.
Unable to open generic.LBE.16S.515_928.database.align. Trying MOTHUR_FILES directory /mnt/sequencing/mothurdb/generic.LBE.16S.515_928.database.align.
Using 20 processors.
Reading in the /mnt/sequencing/mothurdb/generic.LBE.16S.515_928.database.align template sequences... DONE.
It took 48 to read 213119 sequences.
Aligning sequences from fastq.files.trim.contigs.renamed.unique.fasta ...
It took 40 secs to align 25475 sequences.
It took 41 seconds to align 25475 sequences.
Output File Names:
fastq.files.trim.contigs.renamed.unique.align
fastq.files.trim.contigs.renamed.unique.align_report
mothur > summary.seqs(fasta=current, count=current)
Using fastq.files.trim.contigs.renamed.sparse.count_table as input file for the count parameter.
Using fastq.files.trim.contigs.renamed.unique.align as input file for the fasta parameter.
Using 20 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 173 5790 231 0 3 1
2.5%-tile: 180 13978 373 0 4 1817
25%-tile: 180 13978 375 0 4 18168
Median: 180 13978 376 0 5 36335
75%-tile: 180 13978 377 0 5 54502
97.5%-tile: 180 13978 377 0 6 70852
Maximum: 8110 14744 459 0 8 72668
Mean: 180 13976 375 0 4
# of unique seqs: 25475
total # of seqs: 72668
It took 0 secs to summarize 72668 sequences.
Output File Names:
fastq.files.trim.contigs.renamed.unique.summary
mothur > screen.seqs(fasta=current, count=current, summary=current, start=180, end=13978)
Using fastq.files.trim.contigs.renamed.sparse.count_table as input file for the count parameter.
Using fastq.files.trim.contigs.renamed.unique.align as input file for the fasta parameter.
Using fastq.files.trim.contigs.renamed.unique.summary as input file for the summary parameter.
Using 20 processors.
It took 2 secs to screen 25475 sequences, removed 92.
/******************************************/
Running command: remove.seqs(accnos=fastq.files.trim.contigs.renamed.unique.bad.accnos.temp, count=fastq.files.trim.contigs.renamed.sparse.count_table)
Removed 118 sequences from fastq.files.trim.contigs.renamed.sparse.count_table.
Output File Names:
fastq.files.trim.contigs.renamed.sparse.pick.count_table
/******************************************/
Output File Names:
fastq.files.trim.contigs.renamed.unique.good.summary
fastq.files.trim.contigs.renamed.unique.good.align
fastq.files.trim.contigs.renamed.unique.bad.accnos
fastq.files.trim.contigs.renamed.sparse.good.count_table
It took 2 secs to screen 25475 sequences.
mothur > summary.seqs(fasta=current, count=current)
Using fastq.files.trim.contigs.renamed.sparse.good.count_table as input file for the count parameter.
Using fastq.files.trim.contigs.renamed.unique.good.align as input file for the fasta parameter.
Using 20 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 173 13978 362 0 3 1
2.5%-tile: 180 13978 373 0 4 1814
25%-tile: 180 13978 375 0 4 18138
Median: 180 13978 376 0 5 36276
75%-tile: 180 13978 377 0 5 54413
97.5%-tile: 180 13978 377 0 6 70737
Maximum: 180 14744 381 0 8 72550
Mean: 179 13978 375 0 4
# of unique seqs: 25383
total # of seqs: 72550
It took 1 secs to summarize 72550 sequences.
Output File Names:
fastq.files.trim.contigs.renamed.unique.good.summary
mothur > filter.seqs(fasta=current, vertical=T, trump=.)
Using fastq.files.trim.contigs.renamed.unique.good.align as input file for the fasta parameter.
Using 20 processors.
Creating Filter...
It took 1 secs to create filter for 25383 sequences.
Running Filter...
It took 0 secs to filter 25383 sequences.
Length of filtered alignment: 542
Number of columns removed: 14202
Length of the original alignment: 14744
Number of sequences used to construct filter: 25383
Output File Names:
fastq.filter
fastq.files.trim.contigs.renamed.unique.good.filter.fasta
mothur > unique.seqs(fasta=current, count=current)
Using fastq.files.trim.contigs.renamed.sparse.good.count_table as input file for the count parameter.
Using fastq.files.trim.contigs.renamed.unique.good.filter.fasta as input file for the fasta parameter.
25383 25383
Output File Names:
fastq.files.trim.contigs.renamed.unique.good.filter.unique.fasta
fastq.files.trim.contigs.renamed.unique.good.filter.count_table
mothur > pre.cluster(fasta=fastq.files.trim.contigs.renamed.unique.good.filter.unique.fasta, count=fastq.files.trim.contigs.renamed.unique.good.filter.count_table, diffs=4, processors=1)
Using 1 processors.
25383 3027 22356
Total number of sequences before precluster was 25383.
pre.cluster removed 22356 sequences.
/******************************************/
[WARNING]: fastq.files.trim.contigs.renamed.unique.good.filter.unique.fasta does not contain any sequence from the .accnos file.
Selected 0 sequences from fastq.files.trim.contigs.renamed.unique.good.filter.unique.fasta.
Output File Names:
fastq.files.trim.contigs.renamed.unique.good.filter.unique.precluster.fasta
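I don’t understand why all the sequences are removed: the log shows 3027 sequences left after pre.cluster, yet 0 sequences are selected from the fasta file.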
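Since the warning says the fasta file does not contain any sequence from the .accnos file, one thing that could be checked is whether the sequence names in the fasta file and in the count table passed to pre.cluster actually match. Below is a minimal sketch in Python (not part of the mothur run) for that comparison. The function names are hypothetical, and the count table layout is an assumption: tab-separated, with the sequence name in the first column, a header row starting with `Representative_Sequence`, and any lines starting with `#` treated as comments.

```python
def read_fasta_names(path):
    """Return the set of sequence names (first token after '>') in a fasta file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def read_count_table_names(path):
    """Return the set of names found in the first column of a count table.

    Assumptions: tab-separated columns, lines starting with '#' are
    comments, and the header row starts with 'Representative_Sequence'.
    """
    names = set()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            first = line.split("\t")[0]
            if first == "Representative_Sequence":
                continue
            names.add(first)
    return names


def compare_names(fasta_path, count_path):
    """Count names shared between the two files and names unique to each."""
    fasta_names = read_fasta_names(fasta_path)
    count_names = read_count_table_names(count_path)
    return {
        "shared": len(fasta_names & count_names),
        "fasta_only": len(fasta_names - count_names),
        "count_only": len(count_names - fasta_names),
    }
```

Calling `compare_names("fastq.files.trim.contigs.renamed.unique.good.filter.unique.fasta", "fastq.files.trim.contigs.renamed.unique.good.filter.count_table")` would show how many names the two inputs have in common. A `shared` value of 0 would suggest a naming mismatch between the files, while a full overlap would point elsewhere.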
Thank you for your help