Single end read alignments

I have a data set of single-end reads based on the 16S v4 region. They’re single end because I only have access to an iSeq instrument with 2x150 bp reads, and the 16S v4 amplicon approaches 300 bp including primer sequences.

I’ve managed to process the data using the MiSeq SOP with some adjustments for a single fastq file (namely, using the fastq.info and make.group commands). This works great. My question is about the post-alignment screening portion of the protocol. So far I’ve prepared a pipeline that screens the F primer fragment of the amplicon for alignment using start=11895 and end=21278, but I imagine this excludes the R primer fragment from analysis. Does it make sense to analyze each fastq file twice, once for the F fragment (start=11895, end=21278) and again for the R fragment (start=21340, end=25434)? Or am I not thinking about this correctly?
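To make the two-pass idea concrete, here is a minimal Python sketch of the screening rule as I understand it: a read is kept when its alignment starts at or before `start` and ends at or after `end`. The window coordinates are the ones quoted above and are specific to my reference alignment; the names `Window`, `passes_screen`, and `assign_fragment` are hypothetical helpers for illustration, not mothur functions.

```python
from typing import NamedTuple, Optional


class Window(NamedTuple):
    start: int
    end: int


# Coordinates quoted in the post; they depend on the reference alignment used.
F_WINDOW = Window(start=11895, end=21278)
R_WINDOW = Window(start=21340, end=25434)


def passes_screen(aln_start: int, aln_end: int, window: Window) -> bool:
    """Assumed rule: keep a read that starts at or before window.start
    and ends at or after window.end."""
    return aln_start <= window.start and aln_end >= window.end


def assign_fragment(aln_start: int, aln_end: int) -> Optional[str]:
    """Label a read 'F' or 'R' by the window it spans, or None if it spans neither.
    A read long enough to span both windows is labeled 'F' because F is checked first."""
    if passes_screen(aln_start, aln_end, F_WINDOW):
        return "F"
    if passes_screen(aln_start, aln_end, R_WINDOW):
        return "R"
    return None
```

Under this rule a read that only covers the R window fails the F screen, which is why a single pass with start=11895 and end=21278 would drop it.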

Hi,

If you only have access to 2x150 data, then I’d suggest tossing the second read. It’s going to have a higher error rate than the first. With the first read, I’d encourage you to use the phylotype-based approach we describe in the MiSeq SOP.

Pat

Hi Pat,

Thank you for your reply! I am only analyzing read1 in this case. My question is based on my understanding of adapter-index ligation, such that read1 will contain a mix of sequences primed off either the F or R 16S primer. If that is correct, then I believe I can perform the mothur alignment for the F primer and R primer fragments separately and then append the lists to obtain the full data set provided by read1.

I will consider phylotype analysis based on your suggestion, as well.

Hi,

Sorry, I’m really not sure how they built the libraries. If half of the R1 reads are forward and half are reverse, then I would strongly encourage you to only work with R1 and pick one direction of reads to use. The first read is better than the second, and for both reads the proximal end of the read is better than the distal end. If you are comparing different ends from different reads, that seems like it could be quite a headache. You could certainly use align.seqs, summary.seqs, and screen.seqs to figure out what should go with what.
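As a rough illustration of sorting out what goes with what, the Python sketch below tallies alignment start and end coordinates from a per-read summary table. It assumes a tab-delimited table with `seqname`, `start`, and `end` columns; the function name `tally_alignment_coordinates` is hypothetical and the rows in `example` are made up for illustration, not real output.

```python
import csv
import io
from collections import Counter


def tally_alignment_coordinates(summary_text: str) -> Counter:
    """Count reads per (start, end) pair in a tab-delimited summary table.
    Assumes a header row that includes 'start' and 'end' columns."""
    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    counts: Counter = Counter()
    for row in reader:
        counts[(int(row["start"]), int(row["end"]))] += 1
    return counts


# Made-up rows for illustration only.
example = (
    "seqname\tstart\tend\tnbases\n"
    "read1\t11895\t21278\t150\n"
    "read2\t11895\t21278\t150\n"
    "read3\t21340\t25434\t150\n"
)

counts = tally_alignment_coordinates(example)
for (start, end), n in counts.most_common():
    print(start, end, n)
```

If the reads really are a mix of two orientations, the tally should show two dominant coordinate pairs, and the more common or more useful one can then be chosen for screening.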

Pat