Hi,
I am planning to run a 16S V4 analysis on amplicon sequencing data using mothur. When I quality-filter the reads with a simple set of criteria, I am left with a very low number of good-quality reads per sample.
Conditions used to quality-filter the reads:
Minimum length of read: 200
Minimum Q-value of each base position: 15
Minimum Mean Q-value of each read: 20
Maximum Ns allowed per read: 4
QC software used: prinseq/0.20.4
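To make the criteria concrete, here is a minimal Python sketch of how I understand them. It is not prinseq's implementation: the function names are hypothetical, and it assumes Phred+33 quality encoding. It reports which criterion a read fails, which helps show how strict the per-base Q15 requirement is.

```python
def phred33(quality_string):
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33)."""
    return [ord(c) - 33 for c in quality_string]


def failed_criteria(seq, qual, min_len=200, min_base_q=15,
                    min_mean_q=20, max_n=4):
    """Return the names of the filter criteria that a single read fails."""
    scores = phred33(qual)
    failures = []
    if len(seq) < min_len:
        failures.append("length")
    # One base below the threshold is enough to reject the whole read.
    if scores and min(scores) < min_base_q:
        failures.append("min_base_q")
    if scores and sum(scores) / len(scores) < min_mean_q:
        failures.append("mean_q")
    if seq.upper().count("N") > max_n:
        failures.append("n_count")
    return failures


# Hypothetical 250 bp read: Q40 everywhere except one Q2 base at the 3' end.
read_seq = "A" * 250
read_qual = "I" * 249 + "#"
print(failed_criteria(read_seq, read_qual))
```

In this made-up example the read has a mean quality close to 40 but is still rejected, because a single low-quality base violates the per-base minimum.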
Sample          Num_R1_before  Num_R2_before  Num_R1_after  Num_R2_after
test_sample_01  600381         600381         16            16
test_sample_02  493191         493191         10            10
test_sample_03  435412         435412         8             8
test_sample_04  460862         460862         20            20
test_sample_05  567018         567018         4             4
test_sample_06  407389         407389         3             3
test_sample_07  549802         549802         6             6
test_sample_08  403641         403641         2             2
test_sample_09  292051         292051         3             3
test_sample_10  444006         444006         10            10
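For scale, a short Python sketch that turns the counts above into the percentage of reads retained per sample. The numbers are copied from the table; the helper name is only illustrative.

```python
def retained_percent(before, after):
    """Percentage of reads that survive filtering."""
    return 100.0 * after / before


# (reads before filtering, reads after filtering); R1 and R2 counts are equal.
counts = {
    "test_sample_01": (600381, 16),
    "test_sample_02": (493191, 10),
    "test_sample_03": (435412, 8),
    "test_sample_04": (460862, 20),
    "test_sample_05": (567018, 4),
    "test_sample_06": (407389, 3),
    "test_sample_07": (549802, 6),
    "test_sample_08": (403641, 2),
    "test_sample_09": (292051, 3),
    "test_sample_10": (444006, 10),
}

for sample, (before, after) in counts.items():
    print(f"{sample}\t{retained_percent(before, after):.4f}%")
```

Every sample retains less than 0.01% of its reads.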
My questions:
- Is it common to see such low-quality sequences from amplicon sequencing libraries on the Illumina MiSeq platform?
- Based on the QC outcomes above, can I reject these samples as being of very low quality and not suitable for metagenomic analysis using the mothur pipeline?
- Are there any specific suggestions I should follow when analysing such amplicon sequencing data within the mothur framework?
Thank you.