make.contigs with mixed forward and reverse reads

Hi all,

This is my first time with mothur, and I’m using someone else’s data, so it’s not in an ideal format.

Is there a way to make contigs from files that have both forward and reverse reads in them?
I downloaded my data as one big fastq file, then split it by sample into individual fastq files. However, the original file didn’t distinguish between forward and reverse sequences, so I made the stability.files list each file twice, like so:

sample1 sample1.fastq, sample1.fastq
sample2 sample2.fastq, sample2.fastq
etc.
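
One way to check whether a per-sample fastq file really contains forward/reverse pairs is to count how often each read ID occurs. This is only a rough sketch, not part of mothur: the function names are made up for illustration, and it assumes plain 4-line fastq records whose mates share the same ID (text before the first space, with an optional trailing /1 or /2). If almost every ID shows up only once, the file is probably single-end.

```python
from collections import Counter
import io


def read_ids(handle):
    """Yield the read ID from each 4-line FASTQ record (assumed layout)."""
    for i, line in enumerate(handle):
        if i % 4 != 0:
            continue  # only the first line of each record is a header
        header = line.strip()
        fields = header[1:].split()
        if not header.startswith("@") or not fields:
            raise ValueError(f"unexpected header on line {i + 1}: {header!r}")
        read_id = fields[0]
        # Assumption: mates may be marked with a trailing /1 or /2.
        if read_id.endswith(("/1", "/2")):
            read_id = read_id[:-2]
        yield read_id


def pairing_summary(handle):
    """Count IDs seen exactly twice (likely pairs), once, or otherwise."""
    counts = Counter(read_ids(handle))
    paired = sum(1 for n in counts.values() if n == 2)
    single = sum(1 for n in counts.values() if n == 1)
    return {
        "paired_ids": paired,
        "single_ids": single,
        "other_ids": len(counts) - paired - single,
    }


# Tiny in-memory example: r1 has two mates, r2 has only one read.
demo = io.StringIO(
    "@r1/1\nACGT\n+\nIIII\n"
    "@r1/2\nTTTT\n+\nIIII\n"
    "@r2 1:N:0\nGGGG\n+\nIIII\n"
)
summary = pairing_summary(demo)
print(summary)
```

For a real file you would pass `open("sample1.fastq")` instead of the in-memory demo.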

The contigs are really bad, though: the mean number of ambiguous bases is about 5. The region is V4V5 and each read is 150 bp, so I expect contigs of about 250 bp with a 50 bp overlap. The 75th percentile length is 295, though, which means many of my contigs are far too long. Would increasing the gap penalty help with this?
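
For reference, the length arithmetic can be written out as a quick sketch. The helper names are hypothetical, and it assumes a contig is simply the two reads joined with the overlapping bases counted once. With the numbers from this post, a 295 bp contig built from two 150 bp reads implies an overlap of only 5 bp.

```python
def expected_contig_length(fwd_len, rev_len, overlap):
    """Length of a contig when the overlap is counted only once."""
    if overlap < 0 or overlap > min(fwd_len, rev_len):
        raise ValueError("overlap must be between 0 and the shorter read length")
    return fwd_len + rev_len - overlap


def implied_overlap(fwd_len, rev_len, contig_len):
    """Overlap implied by an observed contig length (same assumption)."""
    return fwd_len + rev_len - contig_len


# Numbers taken from the post: 150 bp reads, about 50 bp expected overlap.
expected = expected_contig_length(150, 150, 50)
# Observed 75th percentile contig length of 295 bp.
observed_overlap = implied_overlap(150, 150, 295)
print(expected, observed_overlap)
```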

I couldn’t use maxambig=0 in screen.seqs, since that eliminated almost all of my data.

After the align step I’m left with about 250,000 sequences, down from 4,000,000 initially. It’s about 1,500 sequences/sample, which is really poor coverage, isn’t it?

All of this makes me think that there is something wrong with how I ran make.contigs. Does anyone know whether using the same file for forward and reverse works when both forward and reverse reads are mixed into one file?

I also know that covering both V4 and V5 leads to worse contig assembly, since there’s less overlap. Are these data normal? I had done the example MiSeq SOP and the quality of the contigs was much, much better, which is what makes me believe I’m doing something wrong.

We also deal with fastq files that mix forward and reverse reads. Here is Dr. Schloss’s suggestion, and it works perfectly.

http://mothur.ltcmp.net/t/how-to-separate-forward-and-backward-sequences-that-were-mixed-in-one-fastq-file/2766/3

Basically, I run make.contigs using the stability.files (as in the SOP), then run trim.seqs with the .oligos file. The rest is the same as the SOP.

Thanks for the reply!

As it turns out, I’m a bit of a nitwit - I didn’t notice in the paper that they didn’t do paired-end reads! I was trying to make contigs from things that just don’t go together, I guess.

Did you by any chance get the sequencing results from Mr. DNA? In that case, you can adopt our method to do the data mining.