Mock community error rates with seq.error

FM_Kerckhof · March 11, 2015, 3:09pm

Hi,

Currently I am processing a mock community that was sequenced in triplicate. It consists of 19 different species from 6 phyla. Some are quite closely related, some are more divergent. They were pooled “evenly” based upon 16S copy number as determined with qPCR and are all fully sequenced strains. To verify purity, Sanger sequencing was performed on the genomic extracts.
Now I’m in a bit of a pickle on how to get some error rates here. The seq.error documentation page is rather sparse (http://www.mothur.org/wiki/Seq.error) and I would like to know how I should proceed.

What I have tried so far:

1. Make a fasta file with the partial 16S of each (type) strain from either RDP or EMBL repositories
2. Align this fasta with the SILVA v119 reference alignment
3. Use pcr.seqs to trim to primer region of my actual mock data
4. Run filter.seqs
5. Run mock data through MiSeq SOP until point of error comparison

And here starts the pickle. I suppose I have to set aligned=T? Does this mean that both the fasta and the reference have to be aligned? I tried it and got a whole bunch of errors complaining that the sequences are not the same length. This is quite weird, as I used pcr.seqs to trim to the correct primer region. Should I include an extra screen.seqs after step 3? Is it okay to use the reference sequences from the databases, or would it be better to use the Sanger reads from the strains that we put in (which are, theoretically, the same)?
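To narrow down a "not the same length" complaint, it can help to count sequence lengths (gaps included) in both files before comparing them. This is only a minimal sketch in plain Python, not part of mothur: the helper names read_fasta and alignment_lengths and the toy sequences are made up for illustration.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the name.
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def alignment_lengths(seqs):
    """Count how many sequences have each length, gap characters included."""
    return Counter(len(s) for s in seqs.values())


# Hypothetical toy data: an aligned reference and reads of a different length.
reference_fasta = ">ref1\nAC-GT-A\n>ref2\nACCGT-A\n"
mock_fasta = ">read1\nACGTA\n>read2\nACCGA\n"

ref_lengths = alignment_lengths(read_fasta(reference_fasta))
mock_lengths = alignment_lengths(read_fasta(mock_fasta))

# Both files share an alignment space only if every sequence has one common length.
same_space = len(ref_lengths) == 1 and set(ref_lengths) == set(mock_lengths)
print(ref_lengths, mock_lengths, same_space)
```

If more than one length shows up, or the two files disagree, the files are probably not in the same alignment space.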

Comments & suggestions are warmly welcome

Kind regards,

FM

pschloss · March 13, 2015, 12:50pm

If you’ve filtered your data without including the mock community data and then filtered the mock without the real data, then you will likely have very different alignments. When you filtered your data, mothur should have generated a filter file ending in *.filter. Use that on your data as we did here:

http://www.mothur.org/wiki/454_SOP#Error_analysis
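As a toy illustration of why independent filtering gives different alignments, the sketch below removes all-gap columns from two small alignments, first separately and then with one shared mask. It is plain Python, not mothur. The function names, the sequences, and the use of a string of 1s and 0s as the mask are assumptions made for the example and may not match the exact layout of a *.filter file.

```python
def column_mask(aligned_seqs, gap_chars="-."):
    """Return a 1/0 string: 0 marks a column that is a gap in every sequence."""
    length = len(aligned_seqs[0])
    return "".join(
        "0" if all(s[i] in gap_chars for s in aligned_seqs) else "1"
        for i in range(length)
    )


def apply_mask(seq, mask):
    """Keep only the columns where the mask has a 1."""
    return "".join(base for base, keep in zip(seq, mask) if keep == "1")


# Hypothetical toy alignments, both six columns wide.
data = ["AC-G-T", "ACTG-T"]
mock = ["A--G-T", "AC-G-T"]

data_mask = column_mask(data)
mock_mask = column_mask(mock)

# Filtering each set with its own mask removes different columns.
data_own = [apply_mask(s, data_mask) for s in data]
mock_own = [apply_mask(s, mock_mask) for s in mock]

# Reusing the mask from the data keeps the mock in the same column space.
mock_shared = [apply_mask(s, data_mask) for s in mock]

print(len(data_own[0]), len(mock_own[0]), len(mock_shared[0]))
```

With separate masks the two sets end up with different widths. With the shared mask they stay column-for-column comparable.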

Alternatively, you could just use aligned=F and it will strip out all of the gaps from both datasets and the error messages should go away.
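The idea of stripping gaps can be pictured with a short sketch. This is plain Python for illustration only and says nothing about how seq.error handles it internally. The function name strip_gaps and the toy sequences are made up, and treating both "-" and "." as gap characters is an assumption.

```python
def strip_gaps(seq, gap_chars="-."):
    """Remove alignment gap characters, leaving only the bases."""
    return "".join(ch for ch in seq if ch not in gap_chars)


# Hypothetical toy sequences from two differently padded alignments.
reference = "..AC-GT-A.."
read = "ACG--TA"

reference_bases = strip_gaps(reference)
read_bases = strip_gaps(read)

# The aligned strings differ in length, but the ungapped bases can be compared.
print(len(reference), len(read), reference_bases, read_bases)
```

Once the gaps are gone, a mismatch in alignment width between the two files no longer matters for the comparison.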
