Working with single-end reads

Hello Mothur members,


I am writing here because I am still learning mothur, and because my run produced very poor-quality reverse reads. I targeted the V4-V5 region, and only after I saw the FastQC report did I realise my mistake. I am a PhD student and do not have enough time left to ask the sequencing facility to repeat the run, so I decided to find a solution with what I have. My forward reads are of good quality and can be used for the analyses.

While going through the threads in the mothur forum, I came across this one: “Illumina single-read with index in 2nd sequencing run”.
My understanding is that to make contigs I would have to generate the reverse complement of the forward sequences. But is it necessary to begin with the make.contigs step? Can I start directly with trim.seqs?

Looking forward to your reply,
Richa

You shouldn’t have to get the reverse complement of your forward sequence and should be able to proceed with the trim.seqs step.

Thank you Dr. Schloss.

How do you perform make.contigs with just the forward sequences? Can you please elaborate?

Thanks,
Rishikesh

Is there a way to use make.contigs on only the forward sequences with something like maxee=2 to reduce the error rate in the forward reads? I am just curious about how to “save” these bad runs.

This is a repost of an older thread. I was doing something similar when I started with the Ion Torrent data analysis SOP, which does not exist anymore (oh boy, that does take me back…)

But from this thread: Processing MiSeq single (unpaired) reads

" A basic workflow is

  1. Run fastq.info over each fastq file to split them into fasta and qual files.
  2. Quality filter them with trim.seqs.
  3. Merge the QC-ed fasta files together with merge.files to get your full fasta file.
  4. Create a groups file using make.group.
  5. Run unique.seqs to dereplicate the fasta file.
  6. Run count.seqs over the resulting names file, and your groups file, to get the count table.

From there, you should be able to go back to the MiSeq SOP at the alignment step using the *.unique.fasta file and the count table."
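For a concrete picture, the six quoted steps could be written out as a mothur batch file. Below is only a sketch in Python that builds such a batch file as text. The function name, the sample names (`sampleA`, `sampleB`), the merged filename `merged.trim.fasta`, and the trimming thresholds are all placeholders, not something taken from the quoted thread. It uses mothur's `current` keyword in the last step so that output filenames do not have to be guessed, and it assumes a mothur version in which unique.seqs writes a names file; defaults may differ between releases, so check the documentation for your version.

```python
def single_read_batch(samples, merged="merged.trim.fasta"):
    """Return mothur batch text for the quoted six-step workflow.

    samples: fastq basenames without the extension (hypothetical names).
    Names must not contain '-', because mothur uses it as a list separator.
    """
    lines = []
    for s in samples:
        # Steps 1-2: split each fastq into fasta/qual, then quality-trim it.
        lines.append(f"fastq.info(fastq={s}.fastq)")
        lines.append(
            f"trim.seqs(fasta={s}.fasta, qfile={s}.qual, "
            "qwindowaverage=20, minlength=250)"  # placeholder thresholds
        )
    trimmed = "-".join(f"{s}.trim.fasta" for s in samples)
    groups = "-".join(samples)
    # Step 3: concatenate the trimmed fasta files into one file.
    lines.append(f"merge.files(input={trimmed}, output={merged})")
    # Step 4: one group label per trimmed file, listed in the same order.
    lines.append(f"make.group(fasta={trimmed}, groups={groups})")
    # Steps 5-6: dereplicate, then build the count table from the names
    # and groups files that mothur currently remembers.
    lines.append(f"unique.seqs(fasta={merged})")
    lines.append("count.seqs(name=current, group=current)")
    return "\n".join(lines) + "\n"


print(single_read_batch(["sampleA", "sampleB"]))
```

The printed text can be saved to a file and passed to mothur as a batch file, run from the directory that holds the fastq files.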

Thank you so much for the valuable reply. I will go through this and let you know whether I am able to do it or not.
My doubt is: should trim.seqs be run on each fasta file?

Thank you
Rishikesh Dash

Hello! I think this is the way to go.

I’ve recently had to do this so I thought I’d share my approach (not to say that this is flawless by any means but hopefully helpful!):

I wrote a Python script (because I have so many sample files) to generate the first few steps. In short, for every *.fastq file in the current directory, it creates the following steps:

  1. fastq.info(fastq=sequence_file.fastq)
  2. trim.seqs(fasta=sequence_file.fasta, qfile=sequence_file.qual, qwindowaverage=20, minlength=250, processors=16) #Change these to suit you!
  3. Merge the *.trim.fasta files with merge.files
  4. Merge the count table files with merge.files

This creates batchfile1 which can be run and generates ABC.trim.contigs.fasta and ABC.trim.count_table. I then run batchfile2 which runs through all my normal steps for MiSeq data (again, please check that the steps are right for you and the questions that you want to ask!)
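As an illustration of the idea only (this is not the hosted script itself), a generator of that kind might look roughly like the sketch below. The function name `write_batch`, the output filename `batchfile1.batch`, and the merged filename `ABC.trim.fasta` are made up for the example. The trim.seqs settings are the ones listed above and should be changed to suit your data. The count table merge is left out because how those files are produced is not described here.

```python
from pathlib import Path


def write_batch(directory=".", batch_name="batchfile1.batch",
                merged="ABC.trim.fasta", processors=16):
    """Write a mothur batch file covering every *.fastq in `directory`.

    All names and defaults here are hypothetical; adjust them to your data.
    """
    # Sort so the batch file is the same on every run.
    stems = sorted(p.stem for p in Path(directory).glob("*.fastq"))
    if not stems:
        raise FileNotFoundError("no *.fastq files found")
    lines = []
    for stem in stems:
        # Split the fastq into fasta/qual, then quality-trim the result.
        lines.append(f"fastq.info(fastq={stem}.fastq)")
        lines.append(
            f"trim.seqs(fasta={stem}.fasta, qfile={stem}.qual, "
            f"qwindowaverage=20, minlength=250, processors={processors})"
        )
    # mothur separates list items with '-', so file names containing
    # a hyphen would need renaming first.
    trimmed = "-".join(f"{stem}.trim.fasta" for stem in stems)
    lines.append(f"merge.files(input={trimmed}, output={merged})")
    out = Path(directory) / batch_name
    out.write_text("\n".join(lines) + "\n")
    return out
```

Calling `write_batch()` in the folder with the fastq files writes the batch file there; since the commands use bare filenames, mothur would need to be started from that same folder.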

I’ve hosted both the python script (to generate batchfile1) and batchfile2 on Git here.

Hope this helps some people!