Hi Pat,
We sequenced the V1-V3 region with the 27F and 519R primers using paired-end reads. Samples were prepared with a two-step PCR process using dual-index barcoding.
I did not have the summary.seqs output from the supercomputer run, so I ran it again. Here is a summary of the run:
mothur > make.contigs(file=stability.files, processors=4)
It took 6354 secs to process 10168461 sequences.
Group count:
BA 654157
BB 510041
N307 182221
N307S 320335
N313 118397
N313S 266486
N347 259538
N347S 163188
N348 380320
N348S 171741
N354 214481
N354S 330223
N355 236763
N355S 289916
N357 222860
N357S 132251
N362 233396
N362S 107407
N366 239303
N366S 143779
N368 275386
N368S 156147
N392 323389
N392S 182912
P332 164230
P332S 252076
P338 143252
P338S 232200
P339 103601
P339S 284827
P341 137143
P341S 167345
P344 138911
P344S 344584
P345 193173
P345S 225123
P346 200814
P346S 194425
P349 144669
P349S 291013
P358 137791
P358S 267076
P365 112373
P365S 319198
Total of all groups is 10168461
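As a sanity check on the group counts, here is a minimal Python sketch (not part of the mothur run) that parses "name count" lines from a pasted log and sums them. The function name `parse_group_counts` is hypothetical, and only three of the 44 groups are copied in to keep it short.

```python
def parse_group_counts(log_text):
    """Parse '<group> <count>' lines from a pasted make.contigs log."""
    counts = {}
    for line in log_text.splitlines():
        parts = line.split()
        # Keep only lines made of exactly a group name and an integer.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


# Three groups copied from the log above; the full block has 44.
sample = """Group count:
BA 654157
BB 510041
N307 182221
Total of all groups is 10168461"""

counts = parse_group_counts(sample)
print(counts)
print(sum(counts.values()))  # sum of the three sample groups only
```

Run on the full block, the sum should match the "Total of all groups" line.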
mothur > summary.seqs(fasta=stability.trim.contigs.fasta)
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 529 529 0 4 254212
25%-tile: 1 554 554 0 5 2542116
Median: 1 557 557 1 5 5084231
75%-tile: 1 560 560 3 6 7626346
97.5%-tile: 1 586 586 9 7 9914250
Maximum: 1 602 602 75 300 10168461
Mean: 1 555.543 555.543 1.99183 5.31778
# of Seqs: 10168461
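To put the contig lengths in context, here is a rough Python sketch of how much the forward and reverse reads overlap. It assumes 2x300 reads, which is not stated in the post (the 602 maximum contig length only hints at it), and it ignores any trimming, so treat the numbers as an estimate. `expected_overlap` is a hypothetical helper.

```python
def expected_overlap(read_length, contig_length):
    """Bases covered by both reads when two reads of equal length form a contig."""
    return 2 * read_length - contig_length


READ_LENGTH = 300  # assumption: 2x300 chemistry, not stated in the post

# Contig lengths taken from the NBases column of the summary above.
for label, contig_length in [("2.5%-tile", 529), ("median", 557), ("97.5%-tile", 586)]:
    print(label, expected_overlap(READ_LENGTH, contig_length))
```

Under that assumption the median contig has only a few dozen overlapping bases, so most positions are covered by a single read.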
mothur > screen.seqs(fasta=current, group=current, maxambig=0, maxlength=600)
Using stability.trim.contigs.fasta as input file for the fasta parameter.
Using stability.contigs.groups as input file for the group parameter.
mothur > summary.seqs(fasta=current)
Using stability.trim.contigs.good.fasta as input file for the fasta parameter.
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 530 530 0 4 106145
25%-tile: 1 554 554 0 5 1061441
Median: 1 557 557 0 5 2122881
75%-tile: 1 560 560 0 5 3184321
97.5%-tile: 1 593 593 0 7 4139617
Maximum: 1 600 600 0 300 4245760
Mean: 1 555.959 555.959 0 5.23595
# of Seqs: 4245760
mothur > unique.seqs(fasta=stability.trim.contigs.good.fasta)
4245760 4141571
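The two numbers printed by unique.seqs are the total and the unique sequence counts. A small Python sketch of the arithmetic, using only those two values from the log:

```python
total_seqs = 4245760   # first number printed by unique.seqs
unique_seqs = 4141571  # second number printed by unique.seqs

duplicates = total_seqs - unique_seqs
fraction_unique = unique_seqs / total_seqs

print(duplicates)
print(round(fraction_unique, 3))
```

Almost every sequence is still unique at this stage, so dereplication removes very little here.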
Output File Names:
stability.trim.contigs.good.names
stability.trim.contigs.good.unique.fasta
mothur > count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)
Using 4 processors.
It took 61 secs to create a table for 4245760 sequences.
Total number of sequences: 4245760
mothur > summary.seqs(count=stability.trim.contigs.good.count_table)
Using stability.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 35 35 0 2 1
2.5%-tile: 1 530 530 0 4 106145
25%-tile: 1 554 554 0 5 1061441
Median: 1 557 557 0 5 2122881
75%-tile: 1 560 560 0 5 3184321
97.5%-tile: 1 593 593 0 7 4139617
Maximum: 1 600 600 0 300 4245760
Mean: 1 555.959 555.959 0 5.23595
# of unique seqs: 4141571
total # of seqs: 4245760
Output File Names:
stability.trim.contigs.good.unique.summary
It took 122 secs to summarize 4245760 sequences.
mothur > trim.seqs(fasta=stability.trim.contigs.good.unique.fasta, oligos=wooster.oligos)
mothur > align.seqs(fasta=stability.trim.contigs.good.unique.trim.fasta, reference=silva.v1.fasta)
mothur > summary.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.align as input file for the fasta parameter.
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: -1 -1 0 0 1 1
2.5%-tile: 2 12083 457 0 5 106145
25%-tile: 2 12083 488 0 5 1061441
Median: 2 12083 492 0 5 2122881
75%-tile: 2 12083 496 0 6 3184321
97.5%-tile: 12083 12083 534 0 231 4139617
Maximum: 12083 12083 534 0 231 4245760
Mean: 5.68305 9556.18 387.606 0 4.12603
# of unique seqs: 3277357
total # of seqs: 4245760
mothur > screen.seqs(fasta=current, count=current, summary=stability.trim.contigs.good.unique.summary, start=2, end=12083, maxhomop=8)
Using stability.trim.contigs.good.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.align as input file for the fasta parameter.
mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.trim.align, count=stability.trim.contigs.good.count_table)
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: -1 -1 0 0 1 1
2.5%-tile: 2 12083 457 0 5 106145
25%-tile: 2 12083 488 0 5 1061441
Median: 2 12083 492 0 5 2122881
75%-tile: 2 12083 496 0 6 3184321
97.5%-tile: 12083 12083 534 0 231 4139617
Maximum: 12083 12083 534 0 231 4245760
Mean: 5.68305 9556.18 387.606 0 4.12603
# of unique seqs: 3277357
total # of seqs: 4245760
Output File Names:
stability.trim.contigs.good.unique.trim.summary
It took 2314 secs to summarize 4245760 sequences.
mothur > screen.seqs(fasta=stability.trim.contigs.good.unique.trim.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.trim.summary, start=2, end=12083, maxhomop=8, processors=4)
Using 4 processors.
Output File Names:
stability.trim.contigs.good.unique.trim.good.summary
stability.trim.contigs.good.unique.trim.good.align
stability.trim.contigs.good.unique.trim.bad.accnos
stability.trim.contigs.good.good.count_table
It took 6329 secs to screen 3277357 sequences.
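For reference, this is how I understand the start, end and maxhomop criteria, written as a minimal Python sketch rather than mothur's actual implementation. The `SummaryRow` type, the `passes_screen` function and the sequence names are hypothetical; the column values mimic rows of the summary file.

```python
from typing import NamedTuple


class SummaryRow(NamedTuple):
    name: str
    start: int    # first alignment column covered by the sequence
    end: int      # last alignment column covered by the sequence
    polymer: int  # longest homopolymer run


def passes_screen(row, start=2, end=12083, maxhomop=8):
    """Keep sequences that start at or before `start`, end at or after `end`,
    and have no homopolymer longer than `maxhomop`."""
    return row.start <= start and row.end >= end and row.polymer <= maxhomop


rows = [
    SummaryRow("seq_ok", 2, 12083, 5),
    SummaryRow("seq_late_start", 12083, 12083, 5),
    SummaryRow("seq_unaligned", -1, -1, 1),
    SummaryRow("seq_long_homopolymer", 2, 12083, 231),
]

kept = [row.name for row in rows if passes_screen(row)]
print(kept)
```

The unaligned row (start and end of -1) fails on the end criterion, and the late-start row fails on the start criterion.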
mothur > summary.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.good.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.align as input file for the fasta parameter.
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 12083 422 0 3 1
2.5%-tile: 2 12083 459 0 5 100806
25%-tile: 2 12083 489 0 5 1008057
Median: 2 12083 492 0 5 2016114
75%-tile: 2 12083 497 0 6 3024171
97.5%-tile: 2 12083 534 0 8 3931422
Maximum: 2 12083 534 0 8 4032227
Mean: 1.56919 9480.35 383.293 0 4.06586
# of unique seqs: 3067203
total # of seqs: 4032227
Output File Names:
stability.trim.contigs.good.unique.trim.good.summary
It took 1489 secs to summarize 4032227 sequences.
mothur > filter.seqs(fasta=current)
Using stability.trim.contigs.good.unique.trim.good.align as input file for the fasta parameter.
Using 4 processors.
Creating Filter...
Running Filter...
Length of filtered alignment: 1744
Number of columns removed: 10339
Length of the original alignment: 12083
Number of sequences used to construct filter: 3067203
Output File Names:
stability.filter
stability.trim.contigs.good.unique.trim.good.filter.fasta
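To illustrate what the column counts above mean, here is a toy Python sketch of removing alignment columns in which every sequence has a gap. It assumes that filter.seqs with default settings does this (vertical=T); it is not mothur's code, and `remove_all_gap_columns` and the toy sequences are hypothetical.

```python
def remove_all_gap_columns(alignment):
    """Drop columns where every sequence has a gap character ('-' or '.')."""
    gap_chars = {"-", "."}
    length = len(next(iter(alignment.values())))
    keep = [
        i for i in range(length)
        if any(seq[i] not in gap_chars for seq in alignment.values())
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


toy = {"a": "..AC--GT..", "b": "..A---GT..", "c": "..ACT-G-.."}
filtered = remove_all_gap_columns(toy)
print(filtered)

# The same bookkeeping as in the log: original length minus removed columns.
print(12083 - 10339)
```

In the toy alignment, five of the ten columns are all-gap and get removed, in the same way that 10339 of the 12083 columns were removed in the run above.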
mothur > unique.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.good.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.fasta as input file for the fasta parameter.
3067203 2543453
mothur > pre.cluster(fasta=current, count=current, diffs=2)
Using stability.trim.contigs.good.unique.trim.good.filter.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.fasta as input file for the fasta parameter.
Using 4 processors.
Using 4 processors.
Using 4 processors.
Using 4 processors.
Processing group P344S:
Processing group N354S:
Processing group BA:
Processing group N392:
104871 86582 18289
Total number of sequences before pre.cluster was 104871.
pre.cluster removed 18289 sequences.
It took 2461 secs to cluster 104871 sequences.
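The three-number line printed by pre.cluster reads as sequences before, sequences remaining, and sequences removed, which matches the two sentences that follow it. A small Python sketch that parses the line and checks the arithmetic (`parse_precluster_line` is a hypothetical helper):

```python
def parse_precluster_line(line):
    """Split a 'before remaining removed' line and check it is self-consistent."""
    before, remaining, removed = (int(field) for field in line.split())
    if before - remaining != removed:
        raise ValueError("counts do not add up: " + line)
    return {"before": before, "remaining": remaining, "removed": removed}


result = parse_precluster_line("104871 86582 18289")
print(result)
print(round(result["removed"] / result["before"], 3))  # fraction merged away
```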
mothur > chimera.uchime(fasta=current, count=current, dereplicate=t)
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.fasta as input file for the fasta parameter.
mothur > remove.seqs(fasta=current, accnos=stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.uchime.accnos)
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.fasta as input file for the fasta parameter.
Removed 489152 sequences from your fasta file.
Output File Names:
stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.fasta
mothur > summary.seqs(fasta=current, count=current)
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.uchime.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.fasta as input file for the fasta parameter.
Using 4 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 1565 422 0 3 1
2.5%-tile: 1 1565 457 0 4 59597
25%-tile: 1 1565 488 0 5 595969
Median: 1 1565 491 0 5 1191938
75%-tile: 1 1565 493 0 5 1787906
97.5%-tile: 1 1565 497 0 6 2324278
Maximum: 1 1565 521 0 8 2383874
Mean: 1 1565 488.538 0 5.18417
# of unique seqs: 1702672
total # of seqs: 2383874
Output File Names:
stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.summary
It took 107 secs to summarize 2383874 sequences.
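To see where sequences are lost, here is a short Python sketch that tabulates the totals reported by the summary.seqs calls so far. The counts are copied from the log; the stage labels and the `retention` helper are my own shorthand, not mothur output.

```python
# (stage label, total number of sequences reported in the log)
stages = [
    ("make.contigs", 10168461),
    ("screen.seqs (maxambig=0, maxlength=600)", 4245760),
    ("screen.seqs (start=2, end=12083, maxhomop=8)", 4032227),
    ("after chimera removal", 2383874),
]


def retention(stages):
    """Return (label, count, fraction of previous stage, fraction of first stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, count / previous, count / first))
        previous = count
    return rows


for label, count, step, overall in retention(stages):
    print(f"{label}: {count} ({step:.1%} of previous stage, {overall:.1%} of contigs)")
```

The largest single drop is at the first screen.seqs, where fewer than half of the contigs pass maxambig=0 and maxlength=600.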
mothur > classify.seqs(fasta=current, count=current, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.uchime.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.fasta as input file for the fasta parameter.
It took 16986 secs to classify 1702672 sequences.
It took 69 secs to create the summary file for 1702672 sequences.
Output File Names:
stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.pds.wang.taxonomy
stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.pds.wang.tax.summary
mothur > remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.uchime.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.fasta as input file for the fasta parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.pds.wang.taxonomy as input file for the taxonomy parameter.
mothur > cluster.split(fasta=current, count=current, taxonomy=current, splitmethod=classify, taxlevel=4, cutoff=0.15)
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.uchime.pick.pick.count_table as input file for the count parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.pick.fasta as input file for the fasta parameter.
Using stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy as input file for the taxonomy parameter.
Using 4 processors.
Using splitmethod fasta.
Splitting the file...
/******************************************/
Running command: dist.seqs(fasta=stability.trim.contigs.good.unique.trim.good.filter.unique.precluster.pick.pick.fasta.0.temp, processors=4, cutoff=0.155)
Using 4 processors.
/******************************************/