Sequence Length

Hello, I have some follow-up questions on this topic. I also have V4 .fastq data from a 2x250 MiSeq run, and it seems I too need to remove the primers (my reads were also assembling to ~290 bp). I have 28 .fastq files for 14 samples and am following the MiSeq SOP. My oligos.txt file currently looks like this:

primer GTGYCAGCMGCCGCGGTAA GGACTACNVGGGTWTCTAAT
# BARCODE none none JG01
# BARCODE none none JG02
…etc.

make.contigs(file=stability.files, oligos=oligos.txt, processors=4)

Now reads assemble to the expected 253 bp length instead of 292, but I still have a few questions:

  1. Do I need to account for degeneracy (if that’s the correct term) in these primers by listing all the possible permutations in my oligos file? (e.g., this post)

  2. make.contigs() throws a warning: "[WARNING]: your oligos file does not contain any group names. mothur will not create a groupfile." How do I work around this if the barcodes are already removed and each sample is split into forward and reverse fastq files? Without a groupfile, do all the sequences end up in the same group, even if stability.files has the sample IDs?

  3. Could leaving the primers in previously have been the reason my batch script was crashing during cluster.split()? Perhaps the primers caused an excessive number of unique sequences (per the blog post here).
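
For question 1, here is a minimal sketch of what "listing all the possible permutations" would involve, assuming the standard IUPAC nucleotide ambiguity codes. It is a standalone Python script, not part of mothur, and `expand_primer` is just a name I made up for illustration. Whether mothur actually needs the expanded list is exactly what I am unsure about.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence matched by a degenerate primer.

    Hypothetical helper for illustration only; not a mothur function.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# The two primers from my oligos.txt.
forward = "GTGYCAGCMGCCGCGGTAA"
reverse = "GGACTACNVGGGTWTCTAAT"

fwd_variants = expand_primer(forward)
rev_variants = expand_primer(reverse)

print(len(fwd_variants), len(rev_variants))  # 4 24
print(len(forward) + len(reverse))           # 39
```

With my two primers this gives 4 forward and 24 reverse variants. The combined primer length is 39 nt, which matches the drop from 292 to 253 bp that I see after trimming.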

Appreciate any suggestions.