I am not sure whether this is a bug, an error on my part, or a problem with the sequences, but when I run the second screen.seqs command (the one after align.seqs), mothur ends up removing all the sequences from almost all of my groups; in the log below, every group except FY1 and FY30 is removed. I am pasting the log file output. What am I doing wrong? I am by no means an expert on this, so any help will be much appreciated! (This was done on version 1.40.5.)
Windows version
Running 64Bit Version
mothur v.1.40.5
Last updated: 06/19/2018
by
Patrick D. Schloss
Department of Microbiology & Immunology
University of Michigan
http://www.mothur.org
When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.
Distributed under the GNU General Public License
Type 'help()' for information on the commands that are available
For questions and analysis support, please visit our forum at https://www.mothur.org/forum
Type 'quit()' to exit program
Interactive Mode
mothur >
set.current(processors=2)
Using 2 processors.
Current files saved by mothur:
processors=2
Output File Names:
current_files.summary
mothur >
trim.seqs(fasta=022018JLR799F_full.fasta, qfile=022018JLR799F_full.qual, qaverage=25, oligos=022018JLR799F.oligos, pdiffs=2, bdiffs=2) #remove bad quality sequences
Using 2 processors.
It took 1037 secs to trim 3172122 sequences.
Group count:
FY1 158765
FY10 54654
FY11 42572
FY12 55120
FY13 80105
FY14 76652
FY15 92484
FY16 119454
FY17 91767
FY18 102996
FY19 115765
FY2 136254
FY20 120967
FY21 93268
FY22 78479
FY23 49173
FY24 68671
FY25 77354
FY26 120844
FY27 77314
FY28 81887
FY29 75021
FY3 118030
FY30 85722
FY31 51704
FY32 42009
FY33 44813
FY34 146708
FY35 112787
FY36 105011
FY4 109096
FY5 87765
FY6 78356
FY7 100061
FY8 90265
FY9 28040
Total of all groups is 3169933
Output File Names:
022018JLR799F_full.trim.fasta
022018JLR799F_full.scrap.fasta
022018JLR799F_full.trim.qual
022018JLR799F_full.scrap.qual
022018JLR799F_full.groups
mothur >
summary.seqs(fasta=current)
Using 022018JLR799F_full.trim.fasta as input file for the fasta parameter.
Using 2 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 8 8 0 2 1
2.5%-tile: 1 375 375 0 4 79249
25%-tile: 1 390 390 0 5 792484
Median: 1 391 391 0 5 1584967
75%-tile: 1 393 393 0 5 2377450
97.5%-tile: 1 400 400 0 6 3090685
Maximum: 1 543 543 31 142 3169933
Mean: 1 389 389 0 5
# of Seqs: 3169933
It took 133 secs to summarize 3169933 sequences.
Output File Names:
022018JLR799F_full.trim.summary
mothur >
screen.seqs(fasta=current, group=current, maxambig=0, optimize=minlength-maxlength, maxhomop=6, criteria=90) #remove too long/short sequences
Using 022018JLR799F_full.trim.fasta as input file for the fasta parameter.
Using 022018JLR799F_full.groups as input file for the group parameter.
Using 2 processors.
Optimizing minlength to 386.
Optimizing maxlength to 394.
It took 44 secs to screen 3169933 sequences, removed 523707.
/******************************************/
Running command: remove.seqs(accnos=022018JLR799F_full.trim.bad.accnos, group=022018JLR799F_full.groups)
Removed 523707 sequences from your group file.
Output File Names:
022018JLR799F_full.pick.groups
/******************************************/
Output File Names:
022018JLR799F_full.trim.good.fasta
022018JLR799F_full.trim.bad.accnos
022018JLR799F_full.good.groups
It took 128 secs to screen 3169933 sequences.
mothur >
summary.seqs(fasta=current)
Using 022018JLR799F_full.trim.good.fasta as input file for the fasta parameter.
Using 2 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 386 386 0 3 1
2.5%-tile: 1 386 386 0 4 66156
25%-tile: 1 390 390 0 5 661557
Median: 1 391 391 0 5 1323114
75%-tile: 1 392 392 0 5 1984670
97.5%-tile: 1 394 394 0 6 2580071
Maximum: 1 394 394 0 6 2646226
Mean: 1 390 390 0 5
# of Seqs: 2646226
It took 117 secs to summarize 2646226 sequences.
Output File Names:
022018JLR799F_full.trim.good.summary
mothur >
unique.seqs(fasta=current) #makes operation faster, since it ignores duplicate/identical sequence for calculation purposes
Using 022018JLR799F_full.trim.good.fasta as input file for the fasta parameter.
2646226 1678409
Output File Names:
022018JLR799F_full.trim.good.names
022018JLR799F_full.trim.good.unique.fasta
mothur >
count.seqs(name=current, group=current)
Using 022018JLR799F_full.good.groups as input file for the group parameter.
Using 022018JLR799F_full.trim.good.names as input file for the name parameter.
It took 51 secs to create a table for 2646226 sequences.
Total number of sequences: 2646226
Output File Names:
022018JLR799F_full.trim.good.count_table
mothur >
summary.seqs(count=current)
Using 022018JLR799F_full.trim.good.count_table as input file for the count parameter.
Using 022018JLR799F_full.trim.good.unique.fasta as input file for the fasta parameter.
Using 2 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 386 386 0 3 1
2.5%-tile: 1 386 386 0 4 66156
25%-tile: 1 390 390 0 5 661557
Median: 1 391 391 0 5 1323114
75%-tile: 1 392 392 0 5 1984670
97.5%-tile: 1 394 394 0 6 2580071
Maximum: 1 394 394 0 6 2646226
Mean: 1 390 390 0 5
# of unique seqs: 1678409
total # of seqs: 2646226
It took 51 secs to summarize 2646226 sequences.
Output File Names:
022018JLR799F_full.trim.good.unique.summary
mothur >
align.seqs(fasta=current, reference=silva.bacteria.fasta) #align sequences to the reference database, but use the correct one.
Using 022018JLR799F_full.trim.good.unique.fasta as input file for the fasta parameter.
Using 2 processors.
Reading in the silva.bacteria.fasta template sequences... DONE.
It took 24 to read 14956 sequences.
Aligning sequences from 022018JLR799F_full.trim.good.unique.fasta ...
It took 8643 secs to align 1678409 sequences.
[WARNING]: 33 of your sequences generated alignments that eliminated too many bases, a list is provided in 022018JLR799F_full.trim.good.unique.flip.accnos.
[NOTE]: 31 of your sequences were reversed to produce a better alignment.
Output File Names:
022018JLR799F_full.trim.good.unique.align
022018JLR799F_full.trim.good.unique.align.report
022018JLR799F_full.trim.good.unique.flip.accnos
mothur >
summary.seqs(fasta=current, count=current)
Using 022018JLR799F_full.trim.good.count_table as input file for the count parameter.
Using 022018JLR799F_full.trim.good.unique.align as input file for the fasta parameter.
Using 2 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1044 1058 7 0 3 1
2.5%-tile: 25292 37810 386 0 4 66156
25%-tile: 25292 37810 390 0 5 661557
Median: 25292 37811 391 0 5 1323114
75%-tile: 25292 37811 392 0 5 1984670
97.5%-tile: 25293 37811 394 0 6 2580071
Maximum: 25503 39358 394 0 6 2646226
Mean: 25291 37810 390 0 5
# of unique seqs: 1678409
total # of seqs: 2646226
It took 4653 secs to summarize 2646226 sequences.
Output File Names:
022018JLR799F_full.trim.good.unique.summary
mothur >
screen.seqs(fasta=current, count=current, summary=current, optimize=start-end-minlength)
Using 022018JLR799F_full.trim.good.count_table as input file for the count parameter.
Using 022018JLR799F_full.trim.good.unique.align as input file for the fasta parameter.
Using 022018JLR799F_full.trim.good.unique.summary as input file for the summary parameter.
Using 2 processors.
Optimizing start to 25292.
Optimizing end to 37810.
Optimizing minlength to 388.
It took 5554 secs to screen 1678409 sequences, removed 236446.
/******************************************/
Running command: remove.seqs(accnos=022018JLR799F_full.trim.good.unique.bad.accnos, count=022018JLR799F_full.trim.good.count_table)
Removing group: FY10 because all sequences have been removed.
Removing group: FY11 because all sequences have been removed.
Removing group: FY12 because all sequences have been removed.
Removing group: FY13 because all sequences have been removed.
Removing group: FY14 because all sequences have been removed.
Removing group: FY15 because all sequences have been removed.
Removing group: FY16 because all sequences have been removed.
Removing group: FY17 because all sequences have been removed.
Removing group: FY18 because all sequences have been removed.
Removing group: FY19 because all sequences have been removed.
Removing group: FY2 because all sequences have been removed.
Removing group: FY20 because all sequences have been removed.
Removing group: FY21 because all sequences have been removed.
Removing group: FY22 because all sequences have been removed.
Removing group: FY23 because all sequences have been removed.
Removing group: FY24 because all sequences have been removed.
Removing group: FY25 because all sequences have been removed.
Removing group: FY26 because all sequences have been removed.
Removing group: FY27 because all sequences have been removed.
Removing group: FY28 because all sequences have been removed.
Removing group: FY29 because all sequences have been removed.
Removing group: FY3 because all sequences have been removed.
Removing group: FY31 because all sequences have been removed.
Removing group: FY32 because all sequences have been removed.
Removing group: FY33 because all sequences have been removed.
Removing group: FY34 because all sequences have been removed.
Removing group: FY35 because all sequences have been removed.
Removing group: FY36 because all sequences have been removed.
Removing group: FY4 because all sequences have been removed.
Removing group: FY5 because all sequences have been removed.
Removing group: FY6 because all sequences have been removed.
Removing group: FY7 because all sequences have been removed.
Removing group: FY8 because all sequences have been removed.
Removing group: FY9 because all sequences have been removed.
Removed 263754 sequences from your count file.
Output File Names:
022018JLR799F_full.trim.good.pick.count_table
/******************************************/
Output File Names:
022018JLR799F_full.trim.good.unique.good.summary
022018JLR799F_full.trim.good.unique.good.align
022018JLR799F_full.trim.good.unique.bad.accnos
022018JLR799F_full.trim.good.good.count_table
It took 6717 secs to screen 1678409 sequences.
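One way to see which groups still hold sequences after this step is to total each group's column in the resulting count table, outside of mothur. The following is only a minimal sketch: it assumes the count table is in the full, tab-separated layout (a header row starting with Representative_Sequence and total, followed by one column per group), and the helper name `group_totals` and the toy rows are made up for illustration, not taken from my data.

```python
def group_totals(lines):
    """Sum each group's column in a full-format, tab-separated count table.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumes columns are: Representative_Sequence, total, then one per group.
    """
    rows = (ln.rstrip("\n").split("\t") for ln in lines if ln.strip())
    header = next(rows)
    groups = header[2:]                # skip the name and total columns
    totals = dict.fromkeys(groups, 0)
    for row in rows:
        for group, value in zip(groups, row[2:]):
            totals[group] += int(value)
    return totals


# Hypothetical toy table (invented counts, only two groups shown).
sample = [
    "Representative_Sequence\ttotal\tFY1\tFY30\n",
    "seqA\t3\t2\t1\n",
    "seqB\t1\t0\t1\n",
]
totals = group_totals(sample)
print(totals)
```

On the real data the same function could be given an open file handle, for example `group_totals(open("022018JLR799F_full.trim.good.good.count_table"))`, and groups whose total is 0 (or that are missing from the header) would be the ones that lost all their sequences.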
mothur >
summary.seqs(fasta=current, count=current)
Using 022018JLR799F_full.trim.good.good.count_table as input file for the count parameter.
Using 022018JLR799F_full.trim.good.unique.good.align as input file for the fasta parameter.
Using 2 processors.
[ERROR]: 'M02542_54_000000000-BL95D_1_1104_26786_19902' is not in your name or count file, please correct.
[ERROR]: 'M02542_54_000000000-BL95D_1_2115_18215_15631' is not in your name or count file, please correct.
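The errors above say that some names in the fasta file are not in the count file. A rough way to list every such name outside of mothur is sketched below. It assumes fasta headers start with ">" and that the sequence name is the first whitespace-delimited token, and that the count table is tab-separated with a single header row; the function names and the toy records are hypothetical.

```python
def fasta_names(lines):
    """Yield the sequence name from each fasta header line (lines starting with '>')."""
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:                  # ignore a bare '>' with no name
                yield parts[0]


def count_table_names(lines):
    """Return the set of names in the first column of a count table, skipping the header."""
    rows = iter(lines)
    next(rows, None)                   # drop the header row
    return {ln.split("\t")[0] for ln in rows if ln.strip()}


def missing_from_count(fasta_lines, count_lines):
    """List fasta names, in file order, that do not appear in the count table."""
    known = count_table_names(count_lines)
    return [name for name in fasta_names(fasta_lines) if name not in known]


# Hypothetical toy input: seqB is in the fasta but not in the count table.
fasta = [">seqA\n", "ACGT\n", ">seqB extra text\n", "ACGA\n", ">seqC\n", "ACGG\n"]
count = ["Representative_Sequence\ttotal\tFY1\n", "seqA\t2\t2\n", "seqC\t1\t1\n"]
missing = missing_from_count(fasta, count)
print(missing)
```

Both functions take any iterable of lines, so on the real files they could be passed open file handles for 022018JLR799F_full.trim.good.unique.good.align and 022018JLR799F_full.trim.good.good.count_table without loading the whole alignment into memory at once.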