Overlapping V3-V4 using v2 2x250 bp

allan_santos · October 11, 2023, 1:22am

Dear all,
We sequenced the V3-V4 region using the v2 kit (2x250 bp) and the primers described by Klindworth et al. (2013). I double-checked after sequencing, and only the locus-specific sequences were retained in the generated fastq files (F: CCTACGGGNGGCWGCAG and R: GGATTAGATACCCVHGTAGTC). The initial tails of the primers were removed automatically.
After make.contigs, the samples gave the following summary:

                Start   End     NBases  Ambigs  Polymer NumSeqs
Minimum:        1       245     245     0       3       1
2.5%-tile:      1       400     400     0       5       24478
25%-tile:       1       440     440     4       7       244777
Median:         1       462     462     8       8       489553
75%-tile:       1       480     480     11      10      734329
97.5%-tile:     1       501     501     20      16      954628
Maximum:        1       502     502     63      217     979105
Mean:           1       460     460     8       9
# of unique seqs:       979105
total # of seqs:        979105

I’m not entirely sure whether I need to remove the locus-specific primer sequences, or whether we simply cannot get full (or even minimal) overlap.
How can I get the most reliable overlap in this case? How can I improve coverage for the subsequent screening and filtering steps?

Thank you so much in advance!!

allan_santos · October 11, 2023, 3:38pm

I’ve just checked, and the entire primer sequences are included in the resulting ‘trim.contigs.fasta’ file after make.contigs.
How can I solve that?

thank you

pschloss · October 18, 2023, 2:14pm

Hi Allan,

You need to include an oligos file to remove the primer sequences from the sequence in make.contigs. You aren’t going to be able to improve the amount of overlap between the reads since they are what they are.
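
As a rough illustration of why the overlap is fixed by the data, here is a back-of-the-envelope sketch in Python. It is not a mothur feature, and the function name is made up for this example. It assumes both reads are a full 250 nt and that the contig alignment has no gaps, so the overlap is approximately 250 + 250 minus the contig length. The contig lengths are the NBases values from the summary in the first post.

```python
READ_LENGTH = 250  # nominal read length of a 2x250 run (assumption: reads were not trimmed)


def approx_overlap(contig_length, read_length=READ_LENGTH):
    """Approximate overlap (nt) between the forward and reverse reads.

    Assumes two full-length reads and no gaps in the contig alignment.
    """
    return 2 * read_length - contig_length


# NBases values from the make.contigs summary in the first post
summary = {
    "2.5%-tile": 400,
    "25%-tile": 440,
    "Median": 462,
    "75%-tile": 480,
    "97.5%-tile": 501,
}

overlaps = {label: approx_overlap(n) for label, n in summary.items()}

for label, overlap in overlaps.items():
    print(f"{label}: ~{overlap} nt overlap")
```

Under these assumptions the median contig has only about 38 nt of overlap. A value at or below zero means the simplifying assumptions no longer hold for that contig, because the reads barely meet, if at all.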

In a subsequent step you need to use screen.seqs to remove any sequences that have an ambiguous base (maxambig=0) or that are longer than expected (perhaps maxlength=490?). You are likely to have most of your sequences removed because of these requirements.
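
To make the two criteria concrete, here is a minimal Python sketch of the filtering logic. It is not mothur's implementation. The function name is hypothetical, and it assumes that any character other than A, C, G, or T counts as ambiguous. The defaults mirror the values suggested above.

```python
def passes_screen(seq, maxambig=0, maxlength=490):
    """Return True if a contig would be kept under the two criteria.

    Assumption: every base other than A, C, G, or T counts as ambiguous.
    """
    ambiguous = sum(1 for base in seq.upper() if base not in "ACGT")
    return ambiguous <= maxambig and len(seq) <= maxlength


# Toy sequences for illustration only (not real data)
examples = {
    "clean_440": "ACGT" * 110,        # 440 nt, no ambiguous bases
    "one_N": "ACGT" * 100 + "N",      # 401 nt, one ambiguous base
    "too_long": "ACGT" * 125,         # 500 nt, longer than maxlength
}

kept = {name: passes_screen(seq) for name, seq in examples.items()}
print(kept)
```

In the summary from the first post, the 25%-tile of Ambigs is already 4, so at least three quarters of the contigs contain ambiguous bases and would fail maxambig=0.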

I’d encourage you to consult this blogpost on the topic of the effects of having minimal overlap between reads

Pat

allan_santos · October 18, 2023, 6:25pm

Hi Pat,
thank you for your reply.
Yes, I’ve read your post about the large distance matrix. However, we only have the 2x250 kit available, and the libraries were prepared after amplifying the V3-V4 region. Unfortunately, I was not involved in that step, so I could not choose a better alternative.

Regarding the last run, we found that the sequencing kit provided by Illumina was out of date, which surely contributed to the poor sequencing quality. I detected many ambiguous bases, and the Q scores were very low overall.
Now, we will repeat it using another kit.

Thanks,
