The amplicon is likely ~292 nt long. make.contigs makes the contigs, so if you have two 250 nt fragments it will stagger them to optimize the sequence conservation in the overlapping area.
Correct, the only relationship that must hold for sequencing on MiSeq/HiSeq is that your amplicon or fragment needs to be longer than the read length. You could technically sequence full-length 16S on a 2x25 kit; you wouldn't get useful information out of that, but the machine would still give you gigs of >Q30 sequences. Illumina has decent videos on YouTube that may help you understand what's going on with paired-end sequencing.
For diverse amplicons you should be aiming for a lot of overlap. Read any of the posts about the subpar results people are getting with V1-V3 sequencing on MiSeq to understand why non-overlapping Q30 sequences are not good for amplicons.
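As a rough illustration of the overlap being discussed, here is a minimal Python sketch. It assumes both reads are full length and untrimmed, and the function name `expected_overlap` is made up for this example; it is not part of mothur. The ~1500 nt figure for full-length 16S is an approximation.

```python
def expected_overlap(read_length: int, amplicon_length: int) -> int:
    """Bases covered by both reads, assuming untrimmed, full-length reads."""
    if amplicon_length < read_length:
        # Each read would run off the end of the amplicon,
        # so every base of the amplicon is covered twice.
        return amplicon_length
    # Two reads start from opposite ends; whatever exceeds the
    # amplicon length is the region sequenced by both reads.
    return max(0, 2 * read_length - amplicon_length)


print(expected_overlap(250, 292))   # 292 nt amplicon, 2x250 reads -> 208
print(expected_overlap(251, 292))   # 292 nt amplicon, 2x251 reads -> 210
print(expected_overlap(25, 1500))   # ~full-length 16S on a 2x25 kit -> 0
```

So a 292 nt contig from 2x250 or 2x251 reads is perfectly possible: the two reads simply overlap by a bit more than 200 nt rather than over their whole length.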
Just wondering if anybody can help with the last question: have my seqs merged correctly if they're showing up as 292 bp long from 251 bp PE reads?
I suspect that your contigs include your barcodes and primers. If you follow our wet-lab SOP you don't get separate files for the barcodes or even sequence the primers.
Thanks Pat, so you don't think that make.contigs has staggered the reads in order to optimize the overlap in the V4 region, and hence the amplicon appears as 292 bp?
No, I've never seen a case where the majority of V4 contigs are anything but 250-255 nt. I suspect something else is going on here. Can you post the forward and reverse read for one of the sequences that assembles into a 290 nt contig?
So, in this case is mothur reading the merged amplicon as 292 bp because the first ~20 bp aren't overlapping, and so it is staggering the reads to focus on the sequence conservation in the overlapping area (i.e. 251 bp)?
Sorry, I don't know what you mean by "staggered".
Your first read starts with GTGCCAGCCGCCGCGGTAA (19 nt) and your second read starts with the reverse complement of your reverse primer: GGACTACACGGGTATCTAAT (20 nt). make.contigs will merge the two reads to assemble the contig, and if you give it an oligos file with your forward and reverse primers it will remove them to give you a product that is ~253 nt long. We don't analyze the primer region since that sequence comes from the primer, not the bacterial DNA.
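To make the arithmetic concrete (292 - 19 - 20 = 253), here is a small Python sketch of the trimming idea. It is not how mothur implements primer removal: it only does exact string matching, whereas real primers are usually degenerate. The two sequences are the read starts quoted above; the 253 nt insert is made up purely for illustration, and the function names are hypothetical.

```python
FWD_START = "GTGCCAGCCGCCGCGGTAA"    # start of read 1, as quoted above (19 nt)
REV_START = "GGACTACACGGGTATCTAAT"   # start of read 2, as quoted above (20 nt)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_primers(contig: str, fwd: str = FWD_START, rev: str = REV_START):
    """Strip the primer-derived ends from a contig in read-1 orientation.

    The contig is expected to begin with the start of read 1 and to end
    with the reverse complement of the start of read 2. Returns None if
    either end does not match exactly.
    """
    rev_rc = reverse_complement(rev)
    if contig.startswith(fwd) and contig.endswith(rev_rc):
        return contig[len(fwd):len(contig) - len(rev_rc)]
    return None


# Made-up 253 nt insert standing in for the real V4 sequence.
insert = "ACGT" * 63 + "A"
contig = FWD_START + insert + reverse_complement(REV_START)

print(len(contig))                 # 292
print(len(trim_primers(contig)))   # 253
```

The point is only that a 292 nt contig and a ~253 nt V4 product describe the same molecule, with and without the primer-derived ends.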
By "staggered" I am referring to your previous comment on this thread (below).
The amplicon is likely ~292 nt long. make.contigs makes the contigs, so if you have two 250 nt fragments it will stagger them to optimize the sequence conservation in the overlapping area.
Pat
If these reads are merged with the short primer reads still attached, will this cause classification issues?