Single end read alignments

dgamero · November 22, 2021, 7:32pm

I have a data set of single end reads based on the 16S v4 region. They’re single end because I only have access to an iSeq instrument with 2x150 bp reads, and the 16S v4 region is nearly 300 bp including primer sequences.

I’ve managed to process the data using the MiSeq protocol with some adjustments for a single fastq file (namely, using the fastq.info and make.group commands). This works great. My question is about the post-alignment screening portion of the protocol. So far I’ve prepared a pipeline that screens the F primer fragment of the amplicon for alignment using start=11895 and end=21278, but I imagine this excludes the R primer fragment from analysis. Does it make sense to analyze each fastq file twice, once for the F fragment (start=11895, end=21278) and again for the R fragment (start=21340, end=25434)? Or am I not thinking about this correctly?
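
As a rough illustration of the screening criterion described above, here is a minimal Python sketch. It assumes that screening with start and end keeps a read only when its alignment begins at or before start and ends at or after end. The two windows are the ones quoted in the post; the function name and the example read coordinates are hypothetical.

```python
# Alignment-column windows quoted in the post.
F_WINDOW = (11895, 21278)  # fragment primed off the F primer
R_WINDOW = (21340, 25434)  # fragment primed off the R primer


def passes_screen(aln_start, aln_end, window):
    """Return True if an aligned read spans the whole window.

    Assumption: a read is kept when it starts at or before the window
    start and ends at or after the window end.
    """
    win_start, win_end = window
    return aln_start <= win_start and aln_end >= win_end


# Hypothetical read covering exactly the F fragment.
print(passes_screen(11895, 21278, F_WINDOW))  # True
print(passes_screen(11895, 21278, R_WINDOW))  # False
```

Under this assumption a read covering only the F fragment never passes the R window, which is why a single screen with the F coordinates would drop reads primed off the R primer.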

pschloss · November 30, 2021, 5:52pm

Hi,

If you only have access to 2x150 data, then I’d suggest tossing the second read. It’s going to have a higher error rate than the first. With the first read, I’d encourage you to use the phylotype-based approach we describe in the MiSeq SOP.

Pat

dgamero · November 30, 2021, 7:24pm

Hi Pat,

Thank you for your reply! I am only analyzing read1 in this case. My question is based on my understanding of adapter-index ligation, such that read1 will contain a mix of sequences primed off either the F or R 16S primer. If that is correct, then I believe I can perform the mothur alignment for the F primer and R primer fragments separately and then append the lists to obtain the full data set provided by read1.

I will consider phylotype analysis based on your suggestion, as well.

pschloss · December 2, 2021, 7:52pm

Hi,

Sorry, I’m really not sure how they built the libraries. If half of the R1 reads are forward and half are reverse, then I would strongly encourage you to only work with R1 and pick one direction of reads to use. The first read is better than the second, and for both reads the proximal end of the read is better than the distal end. If you are comparing different ends from different reads, that seems like it could be quite a headache. You could certainly use align.seqs, summary.seqs, and screen.seqs to figure out what should go with what.

Pat
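
One way to "figure out what should go with what" is to sort reads by where their alignments start and end. The Python sketch below is only an illustration, not part of mothur. It assumes a tab-separated summary table with seqname, start, and end columns (check the header of your own summary file), and it reuses the two coordinate windows mentioned earlier in the thread. The function names and the example rows are hypothetical.

```python
import csv
import io

# Alignment-column windows quoted earlier in the thread.
F_WINDOW = (11895, 21278)
R_WINDOW = (21340, 25434)


def spans(start, end, window):
    """True if an alignment covers the whole window (assumed criterion)."""
    return start <= window[0] and end >= window[1]


def split_by_direction(summary_text):
    """Group read names into forward, reverse, or unassigned."""
    groups = {"forward": [], "reverse": [], "unassigned": []}
    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    for row in reader:
        start, end = int(row["start"]), int(row["end"])
        if spans(start, end, F_WINDOW):
            groups["forward"].append(row["seqname"])
        elif spans(start, end, R_WINDOW):
            groups["reverse"].append(row["seqname"])
        else:
            groups["unassigned"].append(row["seqname"])
    return groups


# Hypothetical summary rows, for illustration only.
example = (
    "seqname\tstart\tend\n"
    "read_a\t11895\t21278\n"
    "read_b\t21340\t25434\n"
    "read_c\t13000\t21278\n"
)
print(split_by_direction(example))
```

The resulting name lists could then be used to keep a single direction, as suggested above, rather than mixing the two ends in one analysis.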
