Hi all - Illumina is making the MiSeq – according to their own words – “obsolete”. The MiSeq i100, its replacement, is one of Illumina’s “two-color” (2-channel) instruments. These instruments use just two dyes to identify the 4 bases (A/C/G/T). As a result, the absence of signal from both dyes is read as a G, which can lead to long stretches of poly-G calls with very high quality scores. My assumption is that only minimal changes to the current SOPs are required to analyze i100 data, given mothur’s ability to aggressively filter homopolymers, but I would very much welcome your thoughts.
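For anyone who wants to check how common the poly-G artifact is in their own reads before running the SOP, here is a minimal Python sketch. It assumes the reads are already available as plain strings (for example, parsed from a FASTQ file). The function names and the 10-base tail threshold are illustrative choices, not anything defined by mothur or Illumina.

```python
import re


def trailing_poly_g(seq: str) -> int:
    """Return the length of the run of G's at the 3' end of a read."""
    match = re.search(r"G+$", seq.upper())
    return len(match.group(0)) if match else 0


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest single-base run in the read."""
    longest = 0
    run = 0
    prev = ""
    for base in seq.upper():
        # Extend the current run if the base repeats, otherwise start a new run.
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


def flag_poly_g(reads, min_tail=10):
    """Return the reads whose 3' poly-G tail is at least min_tail bases long.

    min_tail=10 is an arbitrary, hypothetical threshold for illustration.
    """
    return [r for r in reads if trailing_poly_g(r) >= min_tail]
```

The value returned by `longest_homopolymer` is the kind of quantity a homopolymer cutoff (such as the `maxhomop` option of mothur's `screen.seqs`) filters on, so comparing its distribution between MiSeq and i100 runs would be one way to see whether the existing cutoff already removes most affected reads.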
Hey pad - thanks for your post. I had heard grumblings about this, but have yet to get any data. If anyone sequences a mock community with the i100 I’d love to take a look. But for now, I’d assume that the previous SOP should work well.
Pat
The other issue I have found with the two-color system is intense tag jumping. I would recommend doing a careful sanity check before going deep into the analysis.
I see - thank you for the heads-up. So far I have not seen this based on co-sequenced positive and negative controls. How did you reach the tag-jumping conclusion?
I realized that the top SV of each region (we were comparing regions) was also present in all the other regions, which should be impossible. Since we were comparing to replicates sequenced on a MiSeq, the problem was quite clear: the MiSeq data did not have the issue. It is important to have negative or mock controls there.
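The check described above can be sketched in a few lines of Python. This is only a rough illustration, assuming you already have per-group read counts for each SV as a nested dictionary; the function name, the group labels, and the counts are all made up for the example and are not from the paper or from mothur.

```python
def top_sv_leakage(counts):
    """For each group's most abundant SV, report where else it shows up.

    counts: {group: {sv_id: read_count}}
    Returns {group: (top_sv_id, {other_group: count_of_top_sv_there})}.
    """
    report = {}
    for group, svs in counts.items():
        if not svs:
            continue  # skip groups with no reads
        top = max(svs, key=svs.get)
        elsewhere = {
            other: other_svs[top]
            for other, other_svs in counts.items()
            if other != group and other_svs.get(top, 0) > 0
        }
        report[group] = (top, elsewhere)
    return report


# Made-up counts for illustration only: three amplicon regions on one run.
example = {
    "region_A": {"sv1": 900, "sv2": 50},
    "region_B": {"sv3": 800, "sv1": 12},
    "region_C": {"sv4": 700},
}
for group, (top, elsewhere) in top_sv_leakage(example).items():
    print(group, top, elsewhere)
```

If the dominant SV of one region turns up in regions where its primers could not have amplified it, that points to index misassignment or cross-contamination rather than biology. Running the same check on a MiSeq replicate or on negative and mock controls gives a baseline to compare against.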
I’ve been doing more work on this and had a question for @leocadio. I’m not sure I follow the mechanism for tag jumping with amplicon data. The Kozich way of generating an amplicon library would be to do 384 separate PCRs with combinations of primers such that there are 16 forward primers and 24 reverse primers (or vice versa). The PCRs are performed, cleaned, and pooled using a step with a normalization plate that involves washing the products. Often the PCR pool is cleaned either with something like AMPure or gel purification. All of this would seem to get rid of any extra index sequences that are not incorporated into the PCR product. From there, I’m not sure how one gets an index sequence to “jump”.
It makes a little more sense to me for genomic libraries, where adapters are ligated onto the fragments, followed by a step with a low number of PCR cycles. But with PCR (doing it the Kozich way) the indices are in the same oligo as the adapter and 16S primer.
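As a small illustration of the dual-index layout described above, the following Python sketch enumerates every forward/reverse index combination. The index names (F01 to F16, R01 to R24) are placeholders invented for the example, not the actual primer names from the Kozich protocol.

```python
from itertools import product

# Hypothetical index names: 16 forward and 24 reverse indexed primers.
forward = [f"F{i:02d}" for i in range(1, 17)]
reverse = [f"R{i:02d}" for i in range(1, 25)]


def index_pairs(fwd, rev):
    """Return every (forward, reverse) index combination, one per PCR."""
    return list(product(fwd, rev))


pairs = index_pairs(forward, reverse)
print(len(pairs))  # 16 * 24 = 384 separate PCRs
```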
I feel like I must be missing something.
Thanks!
Pat
Hi Pat,
Apparently it is a problem with the patterned flow cells used in the NextSeq and other non-MiSeq instruments, and with the two-color instead of the four-color chemistry. We learned the hard way (although it was great for the paper, since it was a cross-lab experiment, and the one using the NextSeq was terrible).
During the analyses, I went searching for what could have happened and came across several articles like this:
So you saw it as a problem with amplicon data and not just with genomic sequence data? Do you have a link for your paper?
Thanks!
Pat
Yes, metabarcoding data. Paper below
https://onlinelibrary.wiley.com/doi/full/10.1111/1755-0998.70090
The sequencing center PROMISED that there was no cross-contamination during the procedures and that their blanks were clean. But it happened in FOUR different amplicons and across all samples, so ordinary contamination (which would have had to happen every single time) was less plausible.
Happy to share the intermediate files if you would like (next week - I am traveling now and the final file in long-term storage is 250GB after gz).