MiSeq data has way too many reads

biomickwatson · November 18, 2016, 1:44pm

Hi

I have some MiSeq data, paired-end, and am following the MiSeq SOP.

I clearly have too many sequences: over 2 million unique sequences after contig joining and basic QC (screen.seqs), and over 600k after chimera removal.

So I have been looking at trim.seqs to try to remove low-quality sequences, which brings me to my question…

Should I run trim.seqs before or after make.contigs? Can I even run it on fastq directly?

What other steps should I be taking, that are not in the SOP, that will allow me to reduce the size of the dataset?

Cheers
Mick

Kendra · November 18, 2016, 4:14pm

16S V4? 2x250 run? You can raise diffs to 3 in pre.cluster (diffs=3) and use taxlevel=4 or 5 in cluster.split.
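
For reference, those two settings could be written roughly as below. This is only a sketch: the file names are placeholders, not files from this thread, and all other parameters are left at their defaults. The MiSeq SOP itself uses diffs=2 and taxlevel=4.

```
# Placeholder file names (assumed); substitute the outputs of your own run.
# Merge sequences that are within 3 differences of a more abundant sequence.
pre.cluster(fasta=example.fasta, count=example.count_table, diffs=3)
# Split the data by taxonomy before clustering; level 4 is order, level 5 is family.
cluster.split(fasta=example.final.fasta, count=example.final.count_table, taxonomy=example.final.taxonomy, taxlevel=4)
```

A larger diffs value merges more near-identical sequences, so fewer unique sequences reach the later steps. A deeper taxlevel splits the data into smaller groups for clustering.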

pschloss · November 19, 2016, 8:27pm

I haven’t seen anything to suggest that trimming before make.contigs would improve things. I’d check on what happened with the sequencing run (sequence a mock?) and/or use diffs=3 in pre.cluster.

Pat
