Dear all,
We sequenced the V3-V4 region using the v2 kit (2x250 bp) and the primers described by Klindworth et al. (2013). After sequencing I double-checked the reads: only the locus-specific part of each primer remains in the generated fastq files (F: CCTACGGGNGGCWGCAG and R: GGATTAGATACCCVHGTAGTC). The initial tails of the primers were removed automatically.
After running make.contigs, the summary for the samples was as follows:
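A check of this kind can be sketched in Python as below. This is only a minimal illustration, not the exact procedure I used: the helper names (`primer_to_regex`, `starts_with_primer`) and the example read are hypothetical, and the forward primer is copied as written above. The reverse primer could be tested the same way, although it may need to be reverse-complemented depending on the orientation of the reads.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def primer_to_regex(primer):
    """Compile a degenerate primer into a regex anchored at the read start."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer.upper()))

def starts_with_primer(read, primer):
    """Return True if the read begins with a sequence matching the primer."""
    return primer_to_regex(primer).match(read.upper()) is not None

FORWARD = "CCTACGGGNGGCWGCAG"  # forward primer as given in the post

# Hypothetical read, made up for illustration only
example_read = "CCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGG"
print(starts_with_primer(example_read, FORWARD))
```

Counting how many reads in a fastq file pass this test would show whether the locus-specific sequence is still present at the start of most reads.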
             Start  End  NBases  Ambigs  Polymer  NumSeqs
Minimum:     1      245  245     0       3        1
2.5%-tile:   1      400  400     0       5        24478
25%-tile:    1      440  440     4       7        244777
Median:      1      462  462     8       8        489553
75%-tile:    1      480  480     11      10       734329
97.5%-tile:  1      501  501     20      16       954628
Maximum:     1      502  502     63      217      979105
Mean:        1      460  460     8       9
# of unique seqs: 979105
total # of seqs: 979105
I’m not entirely sure whether I need to remove the locus-specific primer sequences, or whether the problem is that the reads do not overlap fully (or overlap only minimally).
How can I obtain a more reliable overlap in this case? And how can I fix this so that I get better coverage in the subsequent screening and filtering steps?
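To make the overlap concern concrete, here is a rough Python sketch of the overlap implied by the contig lengths in the summary above. It assumes that both reads are the full 250 bp with no trimming, which may not hold for every read; the function name `expected_overlap` is just for illustration.

```python
READ_LENGTH = 250  # bp per read, from the 2x250 kit; assumes untrimmed reads

def expected_overlap(contig_length, read_length=READ_LENGTH):
    """Overlap implied by a contig built from two reads of equal length.

    A value at or below zero means the two reads could not have
    overlapped under the full-length assumption.
    """
    return 2 * read_length - contig_length

# Contig lengths taken from the NBases column of the summary
contig_lengths = {
    "2.5%-tile": 400,
    "25%-tile": 440,
    "Median": 462,
    "75%-tile": 480,
    "97.5%-tile": 501,
}

for label, length in contig_lengths.items():
    print(f"{label}: contig {length} bp -> overlap {expected_overlap(length)} bp")
```

Under that assumption the median contig (462 bp) corresponds to an overlap of only 38 bp, and the longest contigs to essentially none.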
Thank you so much in advance!!