Sequence length

Probably not, but the convention in the field is to remove primers before analyzing the sequences.

Thanks for your help again Pat, just trying to get my head around this.

Surely if the primers were included for all 251bp (i.e. 231bp excl primers) sequences in make.contigs, the amplicon would measure 273bp not 292bp?

--20--{------------233------------}--20-- (273bp)

Or is it the case that

--35--{------------220------------}--35-- (290bp)

Here is a merged read with an amplicon length of 291bp (there are two N’s in the sequence as I haven’t run screen.seqs yet):

M01822_319_000000000-AG4CF_1_1101_16954_1171
GTGCCAGCCGCCGCGGTAATACATAGGATGCAAGCGTTATCCGGATTTACTGGGCGTAAAGCGAGCGCAGGCGGATTTACAAGTCTGATGTTAAAGACAACTGCTTAACGGTTGTTTGCATTGGAAACTGTAAGTCTAGAGTATAGTAGAGAGTTTTGGAACTCCATGTGGAGCGGTGGAATGCGTAGATATATGGAAGAACACCAGAGGCGAAGGCGAAAACTTAGGCTATAACTGACGCTTAGGCTCGAAAGTGTGGGNAGCAAATAGGATTAGATACCCCGGTAGTCN

I have looked at the make.contigs report file and, if I am understanding correctly, it reports the following:

Length = 291bp
Overlap length = 211 bp
Total primers = 40bp

Therefore, is the read length 251bp, but the merged read length 291bp (as the forward and reverse primers are included)?

What I don’t understand is that each primer length is 20bp, so should the amplicon not be 271bp?

I know I have to remove the primers, but just trying to understand this.

Any help would be greatly appreciated. Thank you

The V4 region within the primers is ~250 nt, so tacking on an additional 40 nt would get you to 290.

Pat

Thanks Pat

I got that, but both the forward and reverse sequences are 251bp in length (including the primer), therefore 231bp excluding.

The only way I can see the sequence extending to a 291bp amplicon is if there is a 211bp overlap, 20bp of non-primer sequence unique to each of the forward and reverse reads, and then the primers (20bp each). Therefore 211 + 20 + 20 + 20 + 20 = 291bp total.
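To sanity-check that arithmetic, here is a minimal Python sketch. It assumes, as in my working above, that each 251bp read includes its 20bp primer and that the overlap is the 211bp I read from the report file; the function and variable names are just mine for illustration, not anything from mothur.

```python
def merged_length(forward_len: int, reverse_len: int, overlap: int) -> int:
    """Length of a contig built from two reads that share `overlap` bases."""
    if overlap > min(forward_len, reverse_len):
        raise ValueError("overlap cannot exceed the shorter read")
    return forward_len + reverse_len - overlap


read_len = 251    # length of each read (assumed here to include the primer)
overlap = 211     # overlap length taken from the report file
primer_len = 20   # length of each primer

total = merged_length(read_len, read_len, overlap)

# Bases in each read that fall outside the overlap, split into
# primer and non-primer sequence.
unique_per_read = read_len - overlap               # 40
non_primer_unique = unique_per_read - primer_len   # 20

print(total)                                               # 291
print(overlap + 2 * non_primer_unique + 2 * primer_len)    # 291
```

Both ways of counting give 291bp, so the numbers are at least self-consistent under those assumptions.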

Would this make sense?

How come no oligos file is used during the make.contigs step in the MiSeq SOP? There’s no mention that the primer sequences were removed beforehand.

Using the Kozich (or Caporaso) protocol, custom sequencing primers are used that match the V4 target primers, so the primers themselves are not sequenced. The end of the primer is the beginning of the sequence.