I have 16S sequences generated on a NextSeq 2000 flow cell, and when I got to unique.seqs I was left with zero sequences. Is there a different set of commands I can use, or are the sequences not compatible with mothur?
Can you post the commands you are running? If you could run summary.seqs with the fasta and count files as you go through the steps, that would be helpful for diagnosing the problem.
Pat
I followed the sequence of commands right up to pre.cluster, and the NextSeq sequences are lost.
Can you post the output of running summary.seqs with the fasta and count files that are input to pre.cluster? What region are you sequencing? If it isn’t V4, then what region are you using? What is the length of the NextSeq reads?
Pat
Group count:
12SF15 111702
155SF15 121590
15SF15 88110
16SF15 106479
204SF15 128771
210SF15 157412
42SF15 89930
45SF15 78127
58SF15 129955
61SF15 119243
68SF15 101621
74SF15 148117
MockZymoPosP2C 4948
Total of all groups is 1386005
It took 641 secs to process 1386005 sequences.
Output File Names:
stability.trim.contigs.fasta
stability.scrap.contigs.fasta
stability.contigs_report
stability.contigs.count_table
mothur >
summary.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 37 37 0 4 34651
25%-tile: 1 440 440 0 4 346502
Median: 1 460 460 0 5 693003
75%-tile: 1 465 465 0 6 1039504
97.5%-tile: 1 466 466 2 7 1351355
Maximum: 1 602 602 61 301 1386005
Mean: 1 428 428 0 5
# of unique seqs: 1386005
total # of seqs: 1386005
It took 26 secs to summarize 1386005 sequences.
Output File Names:
stability.trim.contigs.summary
mothur >
screen.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table, maxambig=0, maxlength=275, maxhomop=8)
Using 8 processors.
It took 6 secs to screen 1386005 sequences, removed 1303009.
/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.bad.accnos.temp, count=stability.contigs.count_table)
Removed 1303009 sequences from stability.contigs.count_table.
Output File Names:
stability.contigs.pick.count_table
/******************************************/
Output File Names:
stability.trim.contigs.good.fasta
stability.trim.contigs.bad.accnos
stability.contigs.good.count_table
It took 26 secs to screen 1386005 sequences.
mothur >
unique.seqs(fasta=stability.trim.contigs.good.fasta, count=stability.contigs.good.count_table)
82996 4233
Output File Names:
stability.trim.contigs.good.unique.fasta
stability.trim.contigs.good.count_table
The sequences are from the V3/V4 region.
The rest of the commands:
mothur >
summary.seqs(count=stability.trim.contigs.good.count_table)
Using stability.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 37 37 0 3 2075
25%-tile: 1 37 37 0 4 20750
Median: 1 37 37 0 4 41499
75%-tile: 1 38 38 0 5 62248
97.5%-tile: 1 253 253 0 6 80922
Maximum: 1 257 257 0 8 82996
Mean: 1 48 48 0 4
# of unique seqs: 4233
total # of seqs: 82996
It took 0 secs to summarize 82996 sequences.
Output File Names:
stability.trim.contigs.good.unique.summary
mothur >
align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)
Using 8 processors.
Reading in the silva.v4.fasta template sequences... DONE.
It took 2 to read 14956 sequences.
Aligning sequences from stability.trim.contigs.good.unique.fasta ...
It took 1 secs to align 4233 sequences.
[WARNING]: 53 of your sequences generated alignments that eliminated too many bases, a list is provided in stability.trim.contigs.good.unique.flip.accnos.
[NOTE]: 38 of your sequences were reversed to produce a better alignment.
It took 1 seconds to align 4233 sequences.
Output File Names:
stability.trim.contigs.good.unique.align
stability.trim.contigs.good.unique.align_report
stability.trim.contigs.good.unique.flip.accnos
mothur >
summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 1961 2 0 2 1
2.5%-tile: 1968 11550 37 0 3 2075
25%-tile: 10655 13422 37 0 4 20750
Median: 10657 13422 37 0 4 41499
75%-tile: 10657 13422 38 0 5 62248
97.5%-tile: 10659 13422 253 0 6 80922
Maximum: 13422 13424 257 0 8 82996
Mean: 10200 13319 48 0 4
# of unique seqs: 4233
total # of seqs: 82996
It took 1 secs to summarize 82996 sequences.
Output File Names:
stability.trim.contigs.good.unique.summary
mothur >
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, start=1968, end=13422)
Using 8 processors.
It took 0 secs to screen 4233 sequences, removed 4233.
/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.bad.accnos.temp, count=stability.trim.contigs.good.count_table)
Removed 82996 sequences from stability.trim.contigs.good.count_table.
[WARNING]: stability.trim.contigs.good.count_table contains only sequences from the .accnos file.
Output File Names:
stability.trim.contigs.good.pick.count_table
Hi - the problem is in your screen.seqs command, where you set a maximum length of 275…
screen.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table, maxambig=0, maxlength=275, maxhomop=8)
Nearly all of your “good” sequences are longer than 275 nucleotides. I suspect you want to set minlength=400 rather than maxlength=275.
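To make the length problem concrete, here is a rough Python sketch (not mothur code) that applies a length window to the NBases percentiles from the first summary.seqs table. The helper name `passes_length` is hypothetical, and the 400 to 466 window is only an assumption for illustration (400 from the suggestion above, 466 from the 97.5%-tile row); the real cutoffs should come from your own summary.seqs output.

```python
# NBases values copied from the first summary.seqs table
# (minimum, 2.5%, 25%, median, 75%, 97.5%, maximum).
nbases_percentiles = [35, 37, 440, 460, 465, 466, 602]


def passes_length(nbases, minlength=None, maxlength=None):
    """Return True if a contig of this length survives a length screen.

    Mimics the idea behind screen.seqs minlength/maxlength: anything
    shorter than minlength or longer than maxlength is removed.
    """
    if minlength is not None and nbases < minlength:
        return False
    if maxlength is not None and nbases > maxlength:
        return False
    return True


# Setting used in the log (maxlength=275): only the very short contigs survive.
kept_original = [n for n in nbases_percentiles if passes_length(n, maxlength=275)]

# Hypothetical window for this amplicon (an assumption, not a tested setting).
kept_window = [
    n for n in nbases_percentiles
    if passes_length(n, minlength=400, maxlength=466)
]

# Fraction of contigs the original screen.seqs call removed, from the log.
fraction_removed = 1303009 / 1386005

print(kept_original)
print(kept_window)
print(f"{fraction_removed:.1%} of contigs removed by maxlength=275")
```

The point is only that maxlength=275 keeps the 35 to 37 nt junk and throws away the 440 to 466 nt contigs, which matches the later summary.seqs table where the median length is 37.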
Also, you then go on to use the v4 silva file, which won’t cover the v3 region of your v3/v4 reads. This causes other nucleotides and sequences to get dropped, leaving you with nothing left over.
Here’s what I’d suggest…
- Read this blogpost on the overall problem with sequencing a region like v4-v5 and using the 2x300 nt chemistry
- Read this tutorial on how to customize the silva database and min/max length settings for a non-V4 region
Let me know if you have any questions
Pat
I did, and it still didn’t work. I also extended the reference alignment to cover the V3 and V4 regions.
Can you post the output from running summary.seqs after you adjusted the screen.seqs parameters?
Pat
I just finished the run up to the point where we start preparing for analysis. I have attached the entire run and what the count_table looks like.
mothur >
make.file(inputdir=., type=fastq, prefix=stability)
Setting input directories to:
C:\Users\glory\Desktop\project\Mothur.win\mothur\
Output File Names:
C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.files
mothur >
make.contigs(file=stability.files)
Using 8 processors.
>>>>> Processing file pair 100SF16_S41_R1_001.fastq - 100SF16_S41_R2_001.fastq (files 1 of 23) <<<<<
Making contigs...
Done.
It took 60 secs to assemble 128466 reads.
>>>>> Processing file pair 110SF16_S40_R1_001.fastq - 110SF16_S40_R2_001.fastq (files 2 of 23) <<<<<
Making contigs...
Done.
It took 54 secs to assemble 116020 reads.
>>>>> Processing file pair 12SF15_S26_R1_001.fastq - 12SF15_S26_R2_001.fastq (files 3 of 23) <<<<<
Making contigs...
Done.
It took 51 secs to assemble 111702 reads.
>>>>> Processing file pair 148SF16_S43_R1_001.fastq - 148SF16_S43_R2_001.fastq (files 4 of 23) <<<<<
Making contigs...
Done.
It took 63 secs to assemble 174406 reads.
>>>>> Processing file pair 155SF15_S29_R1_001.fastq - 155SF15_S29_R2_001.fastq (files 5 of 23) <<<<<
Making contigs...
Done.
It took 55 secs to assemble 121590 reads.
>>>>> Processing file pair 15SF15_S33_R1_001.fastq - 15SF15_S33_R2_001.fastq (files 6 of 23) <<<<<
Making contigs...
Done.
It took 43 secs to assemble 88110 reads.
>>>>> Processing file pair 16SF15_S36_R1_001.fastq - 16SF15_S36_R2_001.fastq (files 7 of 23) <<<<<
Making contigs...
Done.
It took 50 secs to assemble 106479 reads.
>>>>> Processing file pair 188SF16_S38_R1_001.fastq - 188SF16_S38_R2_001.fastq (files 8 of 23) <<<<<
Making contigs...
Done.
It took 64 secs to assemble 143694 reads.
>>>>> Processing file pair 204SF15_S31_R1_001.fastq - 204SF15_S31_R2_001.fastq (files 9 of 23) <<<<<
Making contigs...
Done.
It took 53 secs to assemble 128771 reads.
>>>>> Processing file pair 210SF15_S30_R1_001.fastq - 210SF15_S30_R2_001.fastq (files 10 of 23) <<<<<
Making contigs...
Done.
It took 77 secs to assemble 157412 reads.
>>>>> Processing file pair 25SF16_S42_R1_001.fastq - 25SF16_S42_R2_001.fastq (files 11 of 23) <<<<<
Making contigs...
Done.
It took 49 secs to assemble 121092 reads.
>>>>> Processing file pair 2SF16_S44_R1_001.fastq - 2SF16_S44_R2_001.fastq (files 12 of 23) <<<<<
Making contigs...
Done.
It took 48 secs to assemble 161221 reads.
>>>>> Processing file pair 42SF15_S35_R1_001.fastq - 42SF15_S35_R2_001.fastq (files 13 of 23) <<<<<
Making contigs...
Done.
It took 42 secs to assemble 89930 reads.
>>>>> Processing file pair 45SF15_S27_R1_001.fastq - 45SF15_S27_R2_001.fastq (files 14 of 23) <<<<<
Making contigs...
Done.
It took 34 secs to assemble 78127 reads.
>>>>> Processing file pair 55SF16_S37_R1_001.fastq - 55SF16_S37_R2_001.fastq (files 15 of 23) <<<<<
Making contigs...
Done.
It took 44 secs to assemble 97201 reads.
>>>>> Processing file pair 58SF15_S25_R1_001.fastq - 58SF15_S25_R2_001.fastq (files 16 of 23) <<<<<
Making contigs...
Done.
It took 59 secs to assemble 129955 reads.
>>>>> Processing file pair 61SF15_S32_R1_001.fastq - 61SF15_S32_R2_001.fastq (files 17 of 23) <<<<<
Making contigs...
Done.
It took 52 secs to assemble 119243 reads.
>>>>> Processing file pair 68SF15_S34_R1_001.fastq - 68SF15_S34_R2_001.fastq (files 18 of 23) <<<<<
Making contigs...
Done.
It took 50 secs to assemble 101621 reads.
>>>>> Processing file pair 74SF15_S28_R1_001.fastq - 74SF15_S28_R2_001.fastq (files 19 of 23) <<<<<
Making contigs...
Done.
It took 59 secs to assemble 148117 reads.
>>>>> Processing file pair 76SF16_S39_R1_001.fastq - 76SF16_S39_R2_001.fastq (files 20 of 23) <<<<<
Making contigs...
Done.
It took 74 secs to assemble 165908 reads.
>>>>> Processing file pair 78SF16_S45_R1_001.fastq - 78SF16_S45_R2_001.fastq (files 21 of 23) <<<<<
Making contigs...
Done.
It took 25 secs to assemble 57605 reads.
>>>>> Processing file pair 91SF16_S46_R1_001.fastq - 91SF16_S46_R2_001.fastq (files 22 of 23) <<<<<
Making contigs...
Done.
It took 73 secs to assemble 157391 reads.
>>>>> Processing file pair MockZymoPosP2C_S95_L001_R1_001.fastq - MockZymoPosP2C_S95_L001_R2_001.fastq (files 23 of 23) <<<<<
Making contigs...
Done.
It took 2 secs to assemble 4948 reads.
Group count:
100SF16 128466
110SF16 116020
12SF15 111702
148SF16 174406
155SF15 121590
15SF15 88110
16SF15 106479
188SF16 143694
204SF15 128771
210SF15 157412
25SF16 121092
2SF16 161221
42SF15 89930
45SF15 78127
55SF16 97201
58SF15 129955
61SF15 119243
68SF15 101621
74SF15 148117
76SF16 165908
78SF16 57605
91SF16 157391
MockZymoPosP2C 4948
Total of all groups is 2709009
It took 1223 secs to process 2709009 sequences.
Output File Names:
stability.trim.contigs.fasta
stability.scrap.contigs.fasta
stability.contigs_report
stability.contigs.count_table
mothur >
summary.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 37 37 0 4 67726
25%-tile: 1 440 440 0 4 677253
Median: 1 440 440 0 5 1354505
75%-tile: 1 465 465 0 6 2031757
97.5%-tile: 1 466 466 2 7 2641284
Maximum: 1 602 602 61 301 2709009
Mean: 1 414 414 0 5
# of unique seqs: 2709009
total # of seqs: 2709009
It took 90 secs to summarize 2709009 sequences.
Output File Names:
stability.trim.contigs.summary
mothur >
screen.seqs(fasta=stability.trim.contigs.fasta, count=stability.contigs.count_table, maxambig=0, maxlength=466)
Using 8 processors.
It took 25 secs to screen 2709009 sequences, removed 480047.
/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.bad.accnos.temp, count=stability.contigs.count_table)
Removed 480047 sequences from stability.contigs.count_table.
Output File Names:
stability.contigs.pick.count_table
/******************************************/
Output File Names:
stability.trim.contigs.good.fasta
stability.trim.contigs.bad.accnos
stability.contigs.good.count_table
It took 97 secs to screen 2709009 sequences.
mothur >
unique.seqs(fasta=stability.trim.contigs.good.fasta, count=stability.contigs.good.count_table)
2228962 823669
Output File Names:
stability.trim.contigs.good.unique.fasta
stability.trim.contigs.good.count_table
mothur >
summary.seqs(count=stability.trim.contigs.good.count_table)
Using stability.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 37 37 0 3 55725
25%-tile: 1 440 440 0 4 557241
Median: 1 440 440 0 4 1114482
75%-tile: 1 465 465 0 6 1671722
97.5%-tile: 1 466 466 0 7 2173238
Maximum: 1 466 466 0 282 2228962
Mean: 1 407 407 0 5
# of unique seqs: 823669
total # of seqs: 2228962
It took 26 secs to summarize 2228962 sequences.
Output File Names:
stability.trim.contigs.good.unique.summary
mothur >
pcr.seqs(fasta=silva.bacteria.fasta, start=6388, end=25318, keepdots=F)
Using 8 processors.
[NOTE]: no sequences were bad, removing silva.bacteria.bad.accnos
It took 11 secs to screen 14956 sequences.
Output File Names:
silva.bacteria.pcr.fasta
mothur >
rename.file(input=silva.bacteria.pcr.fasta, new=silva.v3v4.fasta)
Current files saved by mothur:
accnos=stability.trim.contigs.bad.accnos
fasta=silva.bacteria.pcr.fasta
contigsreport=stability.contigs_report
count=stability.trim.contigs.good.count_table
processors=8
summary=stability.trim.contigs.good.unique.summary
file=C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.files
mothur >
align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v3v4.fasta)
Using 8 processors.
Reading in the silva.v3v4.fasta template sequences... DONE.
It took 9 to read 14956 sequences.
Aligning sequences from stability.trim.contigs.good.unique.fasta ...
It took 963 secs to align 823669 sequences.
[WARNING]: 1824 of your sequences generated alignments that eliminated too many bases, a list is provided in stability.trim.contigs.good.unique.flip.accnos.
[NOTE]: 416 of your sequences were reversed to produce a better alignment.
It took 968 seconds to align 823669 sequences.
Output File Names:
stability.trim.contigs.good.unique.align
stability.trim.contigs.good.unique.align_report
stability.trim.contigs.good.unique.flip.accnos
mothur >
summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 2 2 0 1 1
2.5%-tile: 1 18929 37 0 3 55725
25%-tile: 1 18929 440 0 4 557241
Median: 1 18929 440 0 4 1114482
75%-tile: 1 18929 465 0 6 1671722
97.5%-tile: 16164 18929 466 0 7 2173238
Maximum: 18929 18931 466 0 146 2228962
Mean: 1722 18916 407 0 4
# of unique seqs: 823669
total # of seqs: 2228962
It took 244 secs to summarize 2228962 sequences.
Output File Names:
stability.trim.contigs.good.unique.summary
mothur >
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, start=1, end=18931)
Using 8 processors.
It took 94 secs to screen 823669 sequences, removed 822450.
/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.bad.accnos.temp, count=stability.trim.contigs.good.count_table)
Removed 2227635 sequences from stability.trim.contigs.good.count_table.
Output File Names:
stability.trim.contigs.good.pick.count_table
/******************************************/
Output File Names:
stability.trim.contigs.good.unique.good.align
stability.trim.contigs.good.unique.bad.accnos
stability.trim.contigs.good.good.count_table
It took 224 secs to screen 823669 sequences.
mothur >
summary.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.good.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.align as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 18931 439 0 4 1
2.5%-tile: 1 18931 441 0 4 34
25%-tile: 1 18931 441 0 4 332
Median: 1 18931 441 0 5 664
75%-tile: 1 18931 461 0 6 996
97.5%-tile: 1 18931 466 0 29 1294
Maximum: 1 18931 466 0 141 1327
Mean: 1 18931 450 0 6
# of unique seqs: 1219
total # of seqs: 1327
It took 0 secs to summarize 1327 sequences.
Output File Names:
stability.trim.contigs.good.unique.good.summary
mothur >
filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)
Using 8 processors.
Creating Filter...
It took 0 secs to create filter for 1219 sequences.
Running Filter...
It took 0 secs to filter 1219 sequences.
Length of filtered alignment: 531
Number of columns removed: 18400
Length of the original alignment: 18931
Number of sequences used to construct filter: 1219
Output File Names:
stability.filter
stability.trim.contigs.good.unique.good.filter.fasta
mothur >
unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count=stability.trim.contigs.good.good.count_table)
1219 1215
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.fasta
stability.trim.contigs.good.unique.good.filter.count_table
mothur >
pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)
Using 8 processors.
/******************************************/
Splitting by sample:
Using 8 processors.
Selecting sequences for groups 100SF16-110SF16
Selecting sequences for groups 12SF15-148SF16
Selecting sequences for groups 61SF15-68SF15-74SF15
Selecting sequences for groups 25SF16-2SF16-42SF15
Selecting sequences for groups 188SF16-204SF15-210SF15
Selecting sequences for groups 45SF15-55SF16-58SF15
Selecting sequences for groups 155SF15-15SF15-16SF15
Selecting sequences for groups 76SF16-78SF16-91SF16
Selected 45 sequences from 12SF15.
Selected 50 sequences from 45SF15.
Selected 59 sequences from 25SF16.
Selected 46 sequences from 148SF16.
Selected 17 sequences from 2SF16.
Selected 57 sequences from 55SF16.
Selected 52 sequences from 42SF15.
Selected 96 sequences from 58SF15.
Selected 57 sequences from 100SF16.
Selected 59 sequences from 110SF16.
Selected 81 sequences from 188SF16.
Selected 56 sequences from 204SF15.
Selected 76 sequences from 210SF15.
Selected 55 sequences from 61SF15.
Selected 32 sequences from 68SF15.
Selected 69 sequences from 74SF15.
Selected 46 sequences from 155SF15.
Selected 41 sequences from 15SF15.
Selected 68 sequences from 16SF15.
Selected 101 sequences from 76SF16.
Selected 32 sequences from 78SF16.
Selected 75 sequences from 91SF16.
It took 0 seconds to split the dataset by sample.
/******************************************/
Processing group 12SF15:
12SF15 45 43 2
Total number of sequences before pre.cluster was 45.
pre.cluster removed 2 sequences.
It took 0 secs to cluster 45 sequences.
Processing group 148SF16:
148SF16 46 45 1
Total number of sequences before pre.cluster was 46.
pre.cluster removed 1 sequences.
It took 0 secs to cluster 46 sequences.
Processing group 155SF15:
155SF15 46 45 1
Total number of sequences before pre.cluster was 46.
pre.cluster removed 1 sequences.
It took 0 secs to cluster 46 sequences.
Processing group 15SF15:
15SF15 41 39 2
Total number of sequences before pre.cluster was 41.
pre.cluster removed 2 sequences.
It took 0 secs to cluster 41 sequences.
Processing group 16SF15:
16SF15 68 68 0
Total number of sequences before pre.cluster was 68.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 68 sequences.
Processing group 188SF16:
188SF16 81 77 4
Total number of sequences before pre.cluster was 81.
pre.cluster removed 4 sequences.
It took 0 secs to cluster 81 sequences.
Processing group 204SF15:
204SF15 56 56 0
Total number of sequences before pre.cluster was 56.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 56 sequences.
Processing group 210SF15:
Processing group 25SF16:
25SF16 59 59 0
210SF15 76 74 2
Total number of sequences before pre.cluster was 59.
Total number of sequences before pre.cluster was 76.
pre.cluster removed 0 sequences.
pre.cluster removed 2 sequences.
It took 0 secs to cluster 59 sequences.
It took 0 secs to cluster 76 sequences.
Processing group 2SF16:
2SF16 17 17 0
Total number of sequences before pre.cluster was 17.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 17 sequences.
Processing group 42SF15:
42SF15 52 49 3
Total number of sequences before pre.cluster was 52.
pre.cluster removed 3 sequences.
It took 0 secs to cluster 52 sequences.
Processing group 45SF15:
45SF15 50 49 1
Total number of sequences before pre.cluster was 50.
pre.cluster removed 1 sequences.
It took 0 secs to cluster 50 sequences.
Processing group 55SF16:
55SF16 57 53 4
Total number of sequences before pre.cluster was 57.
pre.cluster removed 4 sequences.
It took 0 secs to cluster 57 sequences.
Processing group 58SF15:
58SF15 96 90 6
Processing group 61SF15:
Total number of sequences before pre.cluster was 96.
pre.cluster removed 6 sequences.
It took 0 secs to cluster 96 sequences.
61SF15 55 55 0
Total number of sequences before pre.cluster was 55.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 55 sequences.
Processing group 68SF15:
68SF15 32 32 0
Total number of sequences before pre.cluster was 32.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 32 sequences.
Processing group 74SF15:
74SF15 69 69 0
Total number of sequences before pre.cluster was 69.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 69 sequences.
Processing group 76SF16:
76SF16 101 91 10
Total number of sequences before pre.cluster was 101.
pre.cluster removed 10 sequences.
It took 0 secs to cluster 101 sequences.
Processing group 78SF16:
78SF16 32 32 0
Total number of sequences before pre.cluster was 32.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 32 sequences.
Processing group 91SF16:
91SF16 75 62 13
Total number of sequences before pre.cluster was 75.
pre.cluster removed 13 sequences.
It took 0 secs to cluster 75 sequences.
Processing group 100SF16:
100SF16 57 56 1
Total number of sequences before pre.cluster was 57.
pre.cluster removed 1 sequences.
It took 0 secs to cluster 57 sequences.
Processing group 110SF16:
110SF16 59 59 0
Total number of sequences before pre.cluster was 59.
pre.cluster removed 0 sequences.
It took 0 secs to cluster 59 sequences.
Deconvoluting count table results...
It took 0 secs to merge 1220 sequences group data.
/******************************************/
Running get.seqs:
Selected 1176 sequences from stability.trim.contigs.good.unique.good.filter.unique.fasta.
/******************************************/
It took 1 secs to run pre.cluster.
Using 8 processors.
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta
stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.100SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.110SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.12SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.148SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.155SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.15SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.16SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.188SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.204SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.210SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.25SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.2SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.42SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.45SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.55SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.58SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.61SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.68SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.74SF15.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.76SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.78SF16.map
stability.trim.contigs.good.unique.good.filter.unique.precluster.91SF16.map
mothur >
chimera.vsearch(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)
Using 8 processors.
Checking sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta ...
/******************************************/
Splitting by sample:
Using 8 processors.
Selecting sequences for groups 12SF15-148SF16
Selecting sequences for groups 100SF16-110SF16
Selecting sequences for groups 25SF16-2SF16-42SF15
Selecting sequences for groups 155SF15-15SF15-16SF15
Selecting sequences for groups 45SF15-55SF16-58SF15
Selecting sequences for groups 61SF15-68SF15-74SF15
Selecting sequences for groups 76SF16-78SF16-91SF16
Selecting sequences for groups 188SF16-204SF15-210SF15
Selected 43 sequences from 12SF15.
Selected 45 sequences from 148SF16.
Selected 56 sequences from 100SF16.
Selected 59 sequences from 110SF16.
Selected 59 sequences from 25SF16.
Selected 17 sequences from 2SF16.
Selected 49 sequences from 42SF15.
Selected 45 sequences from 155SF15.
Selected 39 sequences from 15SF15.
Selected 68 sequences from 16SF15.
Selected 55 sequences from 61SF15.
Selected 32 sequences from 68SF15.
Selected 69 sequences from 74SF15.
Selected 91 sequences from 76SF16.
Selected 32 sequences from 78SF16.
Selected 49 sequences from 45SF15.
Selected 62 sequences from 91SF16.
Selected 53 sequences from 55SF16.
Selected 90 sequences from 58SF15.
Selected 77 sequences from 188SF16.
Selected 56 sequences from 204SF15.
Selected 74 sequences from 210SF15.
It took 0 seconds to split the dataset by sample.
/******************************************/
It took 1 secs to check 43 sequences from group 12SF15.
It took 1 secs to check 45 sequences from group 155SF15.
It took 2 secs to check 56 sequences from group 100SF16.
It took 2 secs to check 49 sequences from group 45SF15.
It took 2 secs to check 55 sequences from group 61SF15.
It took 2 secs to check 59 sequences from group 25SF16.
It took 2 secs to check 77 sequences from group 188SF16.
It took 2 secs to check 91 sequences from group 76SF16.
It took 0 secs to check 17 sequences from group 2SF16.
It took 1 secs to check 45 sequences from group 148SF16.
It took 1 secs to check 39 sequences from group 15SF15.
It took 1 secs to check 32 sequences from group 68SF15.
It took 1 secs to check 59 sequences from group 110SF16.
It took 1 secs to check 53 sequences from group 55SF16.
It took 1 secs to check 32 sequences from group 78SF16.
It took 1 secs to check 56 sequences from group 204SF15.
It took 1 secs to check 49 sequences from group 42SF15.
It took 2 secs to check 68 sequences from group 16SF15.
It took 1 secs to check 69 sequences from group 74SF15.
It took 1 secs to check 62 sequences from group 91SF16.
It took 1 secs to check 90 sequences from group 58SF15.
It took 1 secs to check 74 sequences from group 210SF15.
It took 4 secs to check 1220 sequences.
Removing chimeras from your input files:
/******************************************/
Running command: remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.accnos)
Removed 2 sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta.
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta
/******************************************/
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.chimeras
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.accnos
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta
mothur >
summary.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta as input file for the fasta parameter.
Using 8 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 531 439 0 4 1
2.5%-tile: 1 531 441 0 4 34
25%-tile: 1 531 441 0 4 332
Median: 1 531 441 0 5 663
75%-tile: 1 531 461 0 6 994
97.5%-tile: 1 531 466 0 24 1292
Maximum: 1 531 466 0 141 1325
Mean: 1 531 450 0 6
# of unique seqs: 1174
total # of seqs: 1325
It took 0 secs to summarize 1325 sequences.
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.summary
mothur >
classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax)
Using 8 processors.
Reading template taxonomy... DONE.
Reading template probabilities... DONE.
It took 12 seconds get probabilities.
Classifying sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta ...
It took 5 secs to classify 1174 sequences.
It took 0 secs to create the summary file for 1174 sequences.
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.taxonomy
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.tax.summary
mothur >
remove.lineage(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.taxonomy, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)
/******************************************/
Running command: remove.seqs(accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.accnos, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta)
Removed 4 sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.fasta.
Removed 4 sequences from stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.count_table.
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table
/******************************************/
Output File Names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.pick.taxonomy
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.accnos
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta
mothur >
get.groups(count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta, groups=MockZymoPosP2C)
Your file does NOT contain sequences from the groups you wish to get.
Selected 0 sequences from your count file.
Your file does NOT contain sequences from the groups you wish to get.
Selected 0 sequences from your fasta file.
Output File names:
stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.fasta
mothur >
seq.error(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table, reference=HMP_MOCK.v35.fasta, aligned=F)
[ERROR]: stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.fasta is blank, aborting.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.fasta as input file for the fasta parameter.
Unable to open stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table. Trying input directory C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table.
Unable to open C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table. Trying MOTHUR_FILES directory C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table.
Unable to open C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table. Trying mothur's executable directory C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table.
Unable to open C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table.
Unable to open stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.pick.count_table
[ERROR]: did not complete seq.error.
mothur >
remove.groups(count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.pick.taxonomy, groups=MockZymoPosP2C)
[ERROR]: MockZymoPosP2C is not in your count table. Please correct.
Removed 0 sequences from your count file.
mothur >
rename.file(fasta=current, count=current, taxonomy=current, prefix=final)
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.fasta as input file for the fasta parameter.
Using stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.pick.taxonomy as input file for the taxonomy parameter.
Current files saved by mothur:
accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pds.wang.accnos
fasta=final.fasta
taxonomy=final.taxonomy
contigsreport=stability.contigs_report
count=final.count_table
processors=8
summary=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.summary
file=C:\Users\glory\Desktop\project\Mothur.win\mothur\stability.files
mothur >
count.seqs(count=final.count_table, compress=f)
Output File Names:
final.full.count_table
Can you use end=18929 in the second screen.seqs step? You need the earliest desirable end position. Very few of your sequences end after that position, but you are requiring them to end at position 18931.
Pat
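As a rough illustration of why the end position matters, here is a minimal Python sketch that reads a summary file and reports what fraction of reads would survive a given end= cutoff. It assumes the tab-separated layout that summary.seqs writes (seqname, start, end, nbases, ambigs, polymer, numSeqs) and that screen.seqs keeps sequences ending at or after the chosen position. The rows in SUMMARY are invented for the example, and fraction_kept is a hypothetical helper, not part of mothur.

```python
# Invented excerpt of a summary.seqs output file (tab-separated).
# Column names follow mothur's *.summary layout; the values are made up.
SUMMARY = """seqname\tstart\tend\tnbases\tambigs\tpolymer\tnumSeqs
seq1\t1\t18929\t440\t0\t4\t10
seq2\t1\t18929\t441\t0\t5\t5
seq3\t1\t18931\t443\t0\t4\t1
seq4\t1\t18000\t400\t0\t4\t2
"""


def fraction_kept(summary_text, end):
    """Fraction of reads (weighted by numSeqs) whose alignment ends at or after `end`."""
    lines = summary_text.strip().splitlines()
    header = lines[0].split("\t")
    end_col = header.index("end")
    n_col = header.index("numSeqs")
    kept = 0
    total = 0
    for line in lines[1:]:
        fields = line.split("\t")
        n = int(fields[n_col])
        total += n
        # Assumption: a sequence passes the screen if it ends at or after the cutoff.
        if int(fields[end_col]) >= end:
            kept += n
    return kept / total


# Compare the two cutoffs discussed in the thread.
for candidate in (18929, 18931):
    print(candidate, round(fraction_kept(SUMMARY, candidate), 3))
```

With real data you would read the text from the summary file produced just before the second screen.seqs step instead of the inline string.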
I used it to the end, and the numbers in the taxonomy file and the count table are not the same. This set of sequences did not include a mock community, so I skipped those commands and then decided to combine the count and taxonomy files.
It seems weird that your sequence names and counts would be the same and that you are getting different classifications. In general, I strongly discourage using get.oturep and instead encourage people to use classify.otu to get a consensus classification for each OTU.
Pat
This is before classifying into OTUs.
This is final.taxonomy and final.count_table, not the cons.taxonomy file.
Unless you’re doing something special, I wouldn’t trust the results. It doesn’t make sense that columns a, b, and c would have the same values in each row. If you are getting an error message somewhere upstream of this point, you need to go back and rerun things in a way that doesn’t generate the error message.
Pat
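One way to check whether the taxonomy file and the count table really describe the same sequences, before pasting them side by side in a spreadsheet, is to compare the sequence names directly. The Python sketch below assumes a two-column taxonomy file (name, then taxonomy string) and an uncompressed count table such as the one written by count.seqs with compress=f, whose first line is a header. The file contents and the compare helper are invented for illustration.

```python
# Invented miniature stand-ins for final.taxonomy and final.full.count_table.
TAXONOMY = """seqA\tBacteria(100);Firmicutes(100);
seqB\tBacteria(100);Proteobacteria(98);
seqC\tBacteria(100);Bacteroidetes(95);
"""

COUNT_TABLE = """Representative_Sequence\ttotal\t12SF15\t15SF15
seqA\t12\t7\t5
seqB\t3\t3\t0
seqD\t1\t0\t1
"""


def taxonomy_names(text):
    """Sequence names from the first column of a taxonomy file."""
    return {line.split("\t")[0] for line in text.strip().splitlines()}


def count_names(text):
    """Sequence names from an uncompressed count table (header line skipped)."""
    rows = text.strip().splitlines()[1:]
    return {line.split("\t")[0] for line in rows}


def compare(tax_text, count_text):
    """Return (names only in the taxonomy file, names only in the count table)."""
    tax = taxonomy_names(tax_text)
    cnt = count_names(count_text)
    return sorted(tax - cnt), sorted(cnt - tax)


only_tax, only_count = compare(TAXONOMY, COUNT_TABLE)
print("only in taxonomy:", only_tax)
print("only in count table:", only_count)
```

If either list is non-empty, the two files came from different points in the pipeline, and combining them row by row will misalign names and counts.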
I combined the results, and the numbers in columns e and f, which are from the count table, are usually the same as those in columns h and i from the taxonomy file.
Also, I tried adding the MiSeq sequences and ran the same code, and it filtered out the MiSeq sequences. How can I properly analyse both sets of sequences?
You would need to process the NextSeq and MiSeq samples in parallel. It’s easiest to combine them at the make.contigs stage. They also need to be sequences from the same region. If you sequenced different regions with the two chemistries then you need to analyze them separately.
Pat
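For combining the two runs at the make.contigs stage, one option is to merge the two input files into a single one before running the pipeline. The Python sketch below assumes the three-column, tab-separated layout (group, forward fastq, reverse fastq) used by files such as stability.files. The fastq names, the MiSeq group name, and the combine_files helper are all invented for the example. It also reports group names that appear in both runs, since whether those should be pooled or renamed is a decision for the analyst.

```python
# Invented three-column file listings (group, forward fastq, reverse fastq).
NEXTSEQ = (
    "12SF15\t12SF15_R1.fastq\t12SF15_R2.fastq\n"
    "15SF15\t15SF15_R1.fastq\t15SF15_R2.fastq\n"
)
MISEQ = (
    "MS01\tMS01_R1.fastq\tMS01_R2.fastq\n"
    "15SF15\t15SF15_miseq_R1.fastq\t15SF15_miseq_R2.fastq\n"
)


def combine_files(*texts):
    """Concatenate file listings and report group names listed more than once."""
    rows = []
    seen = set()
    duplicates = []
    for text in texts:
        for line in text.strip().splitlines():
            fields = line.split("\t")
            if len(fields) != 3:
                raise ValueError("expected group, forward, reverse: " + line)
            group = fields[0]
            if group in seen and group not in duplicates:
                duplicates.append(group)
            seen.add(group)
            rows.append(line)
    return "\n".join(rows) + "\n", duplicates


combined, duplicates = combine_files(NEXTSEQ, MISEQ)
print(combined, end="")
print("groups listed more than once:", duplicates)
```

The combined text could then be saved as a single file and passed to make.contigs, so both runs go through the same screening and alignment settings.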