Hi,
I followed the MiSeq_SOP and this post to create a customized reference db for the V1V2 region using Silva 132. Everything seems to work; however, my results are not as consistent as those in the MiSeq_SOP (see the table at the end of this post, which is the output of summary.seqs). I am especially worried about the minimum and maximum values. Does this indicate a problem? Do I need to do any additional steps before aligning my sequences against it?
I did:
- pcr.seqs(fasta=ecoli.16srrna.fasta, oligos=oligos_primers.txt)
  oligos_primers.txt looks as follows:
  forward GTTYGATYMTGGCTCAG
  reverse GCWGCCWCCCGTAGGWGT
- Downloaded the most recent version of Silva (132), “Full length sequences and taxonomy references”.
  This file is pretty big, so I took only the top of it with head, e.g. 29912 (this is the length of the Silva db included in the MiSeq_SOP).
- align.seqs(fasta=ecoli.16srrna.pcr.fasta, reference=head_silva.nr_v132.align)
- summary.seqs(fasta=ecoli.16srrna.pcr.align)
  Results: the start position is 1046 and the end position is 6333.
- pcr.seqs(fasta=silva.nr_v132.align, start=1046, end=6333, keepdots=F, processors=16)
- rename.file(input=silva.nr_v132.pcr.align, new=silva132.V12.fasta)
- summary.seqs(fasta=silva132.V12.fasta)
              Start  End   NBases  Ambigs  Polymer  NumSeqs
  Minimum:    1      3929  214     0       3        1
  2.5%-tile:  1      5286  273     0       3        5328
  25%-tile:   2      5286  299     0       4        53280
  Median:     2      5286  305     0       5        106560
  75%-tile:   2      5286  317     0       5        159840
  97.5%-tile: 2      5286  388     0       6        207792
  Maximum:    74     5286  1047    5       15       213119
  Mean:       1      5285  310     0       4
  # of Seqs:  213119
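To see which reference sequences are behind the extreme Start and End values, the per-sequence coordinates can be recomputed outside mothur. Below is a minimal Python sketch, not mothur's own code. It assumes that Start and End are the 1-based alignment columns of the first and last non-gap character, and that both "." and "-" count as gaps. The function names (read_fasta, aligned_span) and the toy sequences are made up for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

GAP_CHARS = ".-"  # assumption: both characters are treated as alignment gaps


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def aligned_span(aligned: str) -> Optional[Tuple[int, int, int]]:
    """Return (start, end, nbases) using 1-based columns, or None if all gaps."""
    positions = [i for i, ch in enumerate(aligned, start=1) if ch not in GAP_CHARS]
    if not positions:
        return None
    return positions[0], positions[-1], len(positions)


if __name__ == "__main__":
    # Toy aligned records, purely illustrative.
    toy = [
        ">seqA",
        ".-ACGT--AC..",
        ">seqB",
        "ACG---------",
    ]
    for seq_name, seq in read_fasta(toy):
        print(seq_name, aligned_span(seq))
```

On a real file, the same loop could be run over an open handle to silva132.V12.fasta, keeping only the records whose start is above 2 or whose end is below 5286, to list the sequences that do not span the whole region.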
Many thanks, Joanna