Good day
I am working through my first 16S dataset (I normally work with fungi), following the MiSeq SOP. I have a question about the alignment and pcr.seqs steps. For the pcr.seqs command, I used the same start and end values as in the SOP, because I used the same reference file. I did, however, add the primers I used as an oligos file. Here is the code:
mothur > pcr.seqs(fasta=silva.bacteria.fasta, oligos=16S.oligos, keepdots=F, processors=8)
# I downloaded the silva file from mothur. It's just a general reference file for bacteria. I made the oligos file with just the forward and reverse primers, like this:
forward CCTACGGGNGGCWGCAG
reverse GACTACHVGGGTATCTAATCC
It took 36 secs to screen 14956 sequences.
Output File Names:
silva.bacteria.pcr.fasta
silva.bacteria.bad.accnos
silva.bacteria.scrap.pcr.fasta
mothur > rename.file(input=silva.bacteria.pcr.fasta, new=silva.v4.fasta)
mothur > summary.seqs(fasta=silva.v4.fasta)
Using 8 processors.
             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      17013  382     0       3        1
2.5%-tile:   2      17014  402     0       4        339
25%-tile:    2      17014  405     0       4        3390
Median:      2      17014  425     0       5        6779
75%-tile:    2      17014  427     0       5        10168
97.5%-tile:  2      17014  428     1       6        13219
Maximum:     2      17063  469     5       9        13557
Mean:        1      17014  417     0       4
# of Seqs: 13557
mothur > align.seqs(fasta=stability.paired.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)
It took 4159 secs to align 3144255 sequences.
[WARNING]: 712 of your sequences generated alignments that eliminated too many bases, a list is provided in stability.paired.trim.contigs.good.unique.flip.accnos.
[NOTE]: 305 of your sequences were reversed to produce a better alignment.
It took 4160 seconds to align 3144255 sequences.
Output File Names:
stability.paired.trim.contigs.good.unique.align
stability.paired.trim.contigs.good.unique.align.report
stability.paired.trim.contigs.good.unique.flip.accnos
mothur > summary.seqs(fasta=stability.paired.trim.contigs.good.unique.align, count=stability.paired.trim.contigs.good.count_table)
Using 4 processors.
             Start  End    NBases  Ambigs  Polymer  NumSeqs
Minimum:     0      0      0       0       1        1
2.5%-tile:   2      17014  402     0       4        138038
25%-tile:    2      17014  403     0       4        1380373
Median:      2      17014  411     0       5        2760746
75%-tile:    2      17014  427     0       5        4141118
97.5%-tile:  2      17014  428     0       6        5383453
Maximum:     17063  17063  454     0       118      5521490
Mean:        11     17012  414     0       4
# of unique seqs: 3144255
total # of seqs: 5521490
It took 4969 secs to summarize 5521490 sequences.
Output File Names:
stability.paired.trim.contigs.good.unique.summary
First, the warning message after align.seqs seems off. Second, in the summary.seqs output for the aligned sequences, the start and end positions do not correspond to the parameters set in pcr.seqs. Could this be because I added the oligos file? Should I remove the oligos file from pcr.seqs and instead remove the primers separately after the alignment?
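To see which sequences account for the odd Minimum and Maximum rows, the per-sequence summary file could be tallied by start and end position. This is only a minimal sketch in Python, not part of the SOP. It assumes the .summary file is tab-separated with the header columns seqname, start, end, nbases, ambigs, polymer and numSeqs; the helper name tally_positions and the example rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def tally_positions(handle):
    """Count reads per (start, end) pair in a summary.seqs-style table.

    Assumes a tab-separated file with 'start', 'end' and 'numSeqs' columns
    (an assumption about the .summary layout, check your own file header).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        # numSeqs is the number of reads each unique sequence represents
        counts[(int(row["start"]), int(row["end"]))] += int(row["numSeqs"])
    return counts


# Made-up rows for illustration only, not taken from the real run.
example = io.StringIO(
    "seqname\tstart\tend\tnbases\tambigs\tpolymer\tnumSeqs\n"
    "seqA\t2\t17014\t402\t0\t4\t10\n"
    "seqB\t2\t17014\t427\t0\t5\t3\n"
    "seqC\t0\t0\t0\t0\t1\t1\n"
)

counts = tally_positions(example)
# List the (start, end) pairs from most to least common
for (start, end), n in counts.most_common():
    print(start, end, n)
```

On the real data, the same function could be given the opened stability.paired.trim.contigs.good.unique.summary file instead of the example text. Pairs other than the dominant one would then point to the sequences that aligned poorly.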
Sorry if this is a basic question, but I haven’t done an alignment before, so I would appreciate any feedback on this.
Best
Nicolas