Creating a customized reference alignment for V1-V2

Hi,
I followed the MiSeq_SOP and this post to create a customized reference database for the V1-V2 region from Silva 132. Everything seems to work; however, my results are not as consistent as those in the MiSeq_SOP (see the summary.seqs output at the end of this post). I am especially worried about the minimum and maximum values. Do they indicate a problem? Do I need any additional steps before aligning my sequences to this reference?

I did:

  1. pcr.seqs(fasta=ecoli.16srrna.fasta, oligos= oligos_primers.txt)
    oligos_primers.txt looks as follows:
    forward GTTYGATYMTGGCTCAG
    reverse GCWGCCWCCCGTAGGWGT

  2. Prepare the reference head_silva.nr_v132.align used in the next step:
    Download the latest Silva release (132), “Full length sequences and taxonomy references”.
    The file is very large, so keep only the top of it with head, e.g. 29912 (the length of the Silva db included in the MiSeq_SOP).

  3. align.seqs(fasta=ecoli.16srrna.pcr.fasta, reference= head_silva.nr_v132.align)

  4. summary.seqs(fasta=ecoli.16srrna.pcr.align)
    Results: start position is 1046 and end is 6333

  5. pcr.seqs(fasta=silva.nr_v132.align, start=1046, end=6333, keepdots=F, processors=16)

  6. rename.file(input=silva.nr_v132.pcr.align, new=silva132.V12.fasta)

  7. summary.seqs(fasta=silva132.V12.fasta)

                 Start   End     NBases  Ambigs  Polymer NumSeqs
    Minimum:     1       3929    214     0       3       1
    2.5%-tile:   1       5286    273     0       3       5328
    25%-tile:    2       5286    299     0       4       53280
    Median:      2       5286    305     0       5       106560
    75%-tile:    2       5286    317     0       5       159840
    97.5%-tile:  2       5286    388     0       6       207792
    Maximum:     74      5286    1047    5       15      213119
    Mean:        1       5285    310     0       4
    # of Seqs:   213119

Many thanks, Joanna

Hi,

You might add a screen.seqs step like this…

screen.seqs(fasta=silva.nr_v132.pcr.align, start=2, end=5286)
rename.file(input=silva.nr_v132.pcr.good.align, new=silva132.V12.fasta)
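
In case it helps to see what that screen.seqs call filters on: as I understand it, start=2 drops reference sequences whose first base falls after column 2, and end=5286 drops those whose last base falls before column 5286. Here is a rough Python sketch of that rule on toy data. It is not mothur's code, and the names read_fasta, aligned_span and screen are made up for illustration.

```python
# Illustrative only: mimics the start/end rule of screen.seqs on an aligned FASTA.
# Assumption: both '.' and '-' count as gap characters in the alignment.
GAP_CHARS = {".", "-"}


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def aligned_span(seq):
    """Return 1-based (start, end) columns of the first and last base, or None if all gaps."""
    positions = [i for i, ch in enumerate(seq, start=1) if ch not in GAP_CHARS]
    if not positions:
        return None
    return positions[0], positions[-1]


def screen(records, start, end):
    """Keep records that start at or before `start` and end at or after `end`."""
    good, bad = [], []
    for name, seq in records:
        span = aligned_span(seq)
        if span is not None and span[0] <= start and span[1] >= end:
            good.append(name)
        else:
            bad.append(name)
    return good, bad


# Toy 10-column alignment, screened with start=2 and end=9.
toy = """>full
.ACGTACGT.
>late_start
...GTACGT.
>early_end
.ACGTA....
""".splitlines()

good, bad = screen(read_fasta(toy), start=2, end=9)
print("kept:", good)
print("removed:", bad)
```

On the toy input only `full` is kept; `late_start` begins at column 4 and `early_end` stops at column 6, so both are removed. With your real file, start=2 and end=5286 play the same role as 2 and 9 here.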
