How to work with just forward reads?

Kumari_Richa · December 21, 2016, 4:24pm

Hi,

I have paired-end reads from an Illumina MiSeq. I want to analyze my data using just the forward reads. Is this possible using mothur? If yes:

How can I start while bypassing make.contigs? How should the first input file (known as stability.files in the MiSeq SOP) be created?
How should I create the “primer.oligos” file to trim the attached forward primer? Adapters and barcodes had already been removed from my dataset, so when I analyzed both forward and reverse reads I only needed to remove the primers. In that case, I used a “primer.oligos” file created in the following way:

primer GTGCCAGCMGCCGCGGTAA CCGYCAATTYMTTTRAGTTT

Looking forward to your suggestions.
Thanks a lot
Richa

pschloss · December 22, 2016, 1:34pm

Hi Richa,

You can do this and would probably want to follow the quality score approach in the 454 SOP (https://mothur.org/wiki/454_SOP). That tutorial lays out examples for your oligos file. Give that a shot and let us know if you get stuck.

Pat
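
As an illustration of the oligos file mentioned in the reply above: for trim.seqs on single reads, a primer is given on a line that starts with `forward`, rather than the paired `primer` line used with make.contigs. The sketch below is a minimal example, not taken from the SOP; the helper name `forward_oligos` is hypothetical, and the primer is the forward primer from the original question.

```python
# Minimal sketch: build the text of a forward-primer-only oligos file.
# The function name is hypothetical; only the "forward <primer>" line
# format is meant to mirror what mothur's trim.seqs reads.

IUPAC_BASES = set("ACGTURYSWKMBDHVN")


def forward_oligos(primer: str) -> str:
    """Return oligos-file text containing a single forward primer line."""
    primer = primer.strip().upper()
    unexpected = set(primer) - IUPAC_BASES
    if not primer or unexpected:
        raise ValueError(f"not a valid IUPAC primer: {primer!r}")
    return f"forward {primer}\n"


if __name__ == "__main__":
    # Forward primer quoted in the first post of this thread.
    print(forward_oligos("GTGCCAGCMGCCGCGGTAA"), end="")
```

Saving that one line as primer.oligos and passing it to trim.seqs through the `oligos` parameter should be enough when barcodes and adapters have already been removed, as described in the question.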

Kumari_Richa · January 2, 2017, 12:09pm

Hello Dr. Schloss,
Thank you very much for your reply. I still have some doubts about how to start, because my data are .fastq files rather than .sff files. I could not understand how to create the first input file with just the forward sequences, or which command should be the first one used to get the data into mothur. Kindly suggest how I should proceed.

Looking forward for your suggestion,
Thanks
Richa

pschloss · January 4, 2017, 1:38pm

If you run fastq.info you’ll get a fasta and qual score file, which is what you’re looking for.

Pat
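
To make the fasta and qual outputs concrete, here is a rough Python sketch of the conversion that fastq.info performs conceptually. It is not mothur's implementation. It assumes four-line FASTQ records, Phred+33 quality encoding, and that only the first whitespace-delimited token of the header is kept as the read name; the function name is hypothetical.

```python
from typing import Iterable, List, Tuple

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


def fastq_to_fasta_qual(lines: Iterable[str]) -> Tuple[str, str]:
    """Split four-line FASTQ records into FASTA text and qual text."""
    rows = [line.rstrip("\r\n") for line in lines]
    rows = [row for row in rows if row]  # ignore blank lines
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input is truncated or not four lines per read")

    fasta: List[str] = []
    qual: List[str] = []
    for i in range(0, len(rows), 4):
        header, seq, plus, quality = rows[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        name = header[1:].split()[0]  # simplification: drop the description
        scores = [ord(char) - PHRED_OFFSET for char in quality]
        fasta.append(f">{name}\n{seq}\n")
        qual.append(f">{name}\n{' '.join(map(str, scores))}\n")
    return "".join(fasta), "".join(qual)


if __name__ == "__main__":
    record = ["@read1 1:N:0", "ACGT", "+", "II5!"]
    fasta_text, qual_text = fastq_to_fasta_qual(record)
    print(fasta_text, end="")
    print(qual_text, end="")
```

The qual text holds one space-separated integer score per base, which is the form that the `qfile` parameter of trim.seqs expects.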

Kumari_Richa · January 27, 2017, 1:22pm

Hello Dr. Schloss,

Thanks for your reply.
I am trying to understand the 454 SOP at https://mothur.org/wiki/454_SOP as you suggested, but there are two things I cannot work out:

After running fastq.info on the forward fastq files, I will have a separate fasta and qual file for each sample. I then want to proceed with the screening steps, e.g. removing the attached forward primer, setting minimum and maximum lengths, etc. How should I make the input file that lists each sample so that they can be processed together?
Also, which command should I use to create the reverse complement of the forward sequences for my analysis, and at which step should I use it? I do not have “flow data”, and the SOP uses the screening steps below (which also created the reverse complement of each sequence), so I do not understand how to proceed:

mothur > trim.flows(flow=GQY1XT001.flow, oligos=GQY1XT001.oligos, pdiffs=2, bdiffs=1, processors=2)
mothur > shhh.flows(file=GQY1XT001.flow.files, processors=2)

I look forward to further suggestions. Thank you very much for your help.

Richa

Kendra · January 27, 2017, 3:17pm

Use trim.seqs, not trim.flows (flows are 454 “raw” data). You’ll also need to skip shhh.flows.

Kumari_Richa · January 27, 2017, 3:36pm

Hi,
Thanks for the reply. This is exactly what I am trying to understand. Since I have to use “trim.seqs”:

How should I make the input file that lists each .fasta file, so that they can be processed using trim.seqs and the commands that follow?
Previously, I followed the MiSeq SOP, in which the make.contigs step needs an input file (named stability.files in the MiSeq SOP) that lists the forward and reverse fastq files. Its main job is to make contigs, but it also creates a single file with the sequences from all samples, which can then be processed together. Now I am trying to analyze my data using only the forward sequences. So, how can I start?
Since I will be using “trim.seqs”, which command can make the reverse complement of the forward sequences? And after which step should I use this command?

I apologize for repeating the same questions, but I am still learning. Sincere thanks for all the suggestions. Looking forward to your reply.

Richa
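
On the reverse-complement question: mothur has a reverse.seqs command, and trim.seqs accepts a `flip` parameter that reverse complements the trimmed reads. Whether forward reads need flipping at all depends on their orientation relative to the reference alignment, which this thread does not settle. The Python sketch below only illustrates what reverse complementing does to a sequence containing IUPAC ambiguity codes; the function name is hypothetical and this is not mothur's code.

```python
# Minimal sketch of reverse complementing with IUPAC ambiguity codes.
# Each code maps to the code for the complementary set of bases,
# e.g. R (A/G) <-> Y (C/T), K (G/T) <-> M (A/C); S, W and N are unchanged.
_COMPLEMENT = str.maketrans(
    "ACGTURYSWKMBDHVNacgturyswkmbdhvn",
    "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn",
)


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Reverse primer quoted in the first post of this thread.
    print(reverse_complement("CCGYCAATTYMTTTRAGTTT"))
```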

samche42 · January 24, 2024, 1:10pm

Hi!
I know this is old but I’ve recently had to do this so I thought I’d share my approach:

I have hundreds of files, so I wrote a Python script that generates the first few steps. For every *.fastq file in the current directory, it writes the following commands:

fastq.info(fastq=sequence_file.fastq)
trim.seqs(fasta=sequence_file.fasta, qfile=sequence_file.qual, qwindowaverage=20, minlength=250, processors=16) #Change these to suit you!
Merge the *.trim.fasta files with merge.files
Merge the count table files with merge.files

That then outputs what I call batchfile1. I run that and I end up with ABC.trim.contigs.fasta and ABC.trim.count_table. I then run batchfile2 which runs through all my normal steps from the MiSeq approach.

I’ve hosted both the python script to generate batchfile1 and the subsequent batchfile2 on Git here. Hope this helps some people!
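
For readers who want a starting point, the batch-generation step described in this post might look roughly like the sketch below. It is not the hosted script: the function name, the merged output file name, and the defaults are assumptions, and it covers only fastq.info, trim.seqs, and the merge of the *.trim.fasta files. The count-table merge is left out because the post does not show how those files are produced.

```python
from pathlib import Path
from typing import Iterable, List


def build_batch(fastq_names: Iterable[str],
                merged_fasta: str = "merged.trim.fasta",  # hypothetical name
                qwindowaverage: int = 20,
                minlength: int = 250,
                processors: int = 16) -> str:
    """Return mothur batch-file text for a set of single-read fastq files."""
    commands: List[str] = []
    trimmed: List[str] = []
    for name in sorted(fastq_names):
        if not name.endswith(".fastq"):
            raise ValueError(f"expected a .fastq file, got {name!r}")
        stem = name[:-len(".fastq")]
        # fastq.info writes <stem>.fasta and <stem>.qual
        commands.append(f"fastq.info(fastq={stem}.fastq)")
        # trim.seqs writes <stem>.trim.fasta (plus scrap and qual outputs)
        commands.append(
            f"trim.seqs(fasta={stem}.fasta, qfile={stem}.qual, "
            f"qwindowaverage={qwindowaverage}, minlength={minlength}, "
            f"processors={processors})"
        )
        trimmed.append(f"{stem}.trim.fasta")
    if trimmed:
        # mothur separates multiple input files with hyphens
        joined = "-".join(trimmed)
        commands.append(f"merge.files(input={joined}, output={merged_fasta})")
    return "".join(command + "\n" for command in commands)


if __name__ == "__main__":
    names = [path.name for path in Path(".").glob("*.fastq")]
    print(build_batch(names), end="")
```

Redirecting the printed text to a file gives a batch file that can be passed to mothur on the command line. Because mothur uses hyphens to separate file names in a list, fastq file names that themselves contain hyphens would likely need renaming first.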
