16S and 18S Sequence Mix for analysis!?

eXray · May 28, 2015, 5:14pm

Hi @ all!

First of all, I’m a real noob to NGS amplicon analyses and not a bioinformatician :roll:

Well, let’s get to my problem. We amplified the 18S V9 region from wastewater samples using 18S primers (Euk1391f and EukBr, from Amaral-Zettler 2009) and a 2-step barcoding approach. We then sequenced them (as a rapid test) on a HiSeq with High Output 75 bp SE run settings to get about 2 million reads per sample. The sequencing worked quite well: the quality is OK and there are few overrepresented sequences.

However, a rather quick screening with the VAMPS software showed that almost half of the reads are bacterial contaminants and another quarter are unknown sequences…
If I understand correctly, an OTU cluster analysis in mothur needs aligned reads, which doesn’t make much sense with a mix of 16S and 18S sequences, right!? If I align all reads in mothur against the SILVA seed database, I do get an alignment, but after applying filter.seqs to remove the “dots” from the alignment (trump=., vertical=F), the resulting alignment length is 0. Is this because poorly aligned reads cause every position with a gap to be removed? So I’m stuck at this point in the analysis and need some advice on how to proceed to finally get the OTU cluster pie chart…
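
As a rough illustration of the filter.seqs behaviour asked about here: as far as the mothur documentation describes it, trump=. drops every alignment column in which any sequence has a '.'. The following is a minimal Python sketch of that rule on a toy alignment, not mothur's implementation; the function name trump_filter and the sequences are made up for illustration.

```python
def trump_filter(aligned, trump="."):
    """Drop every column in which at least one sequence has the trump character.

    `aligned` is a list of equal-length aligned sequences (toy data here).
    """
    length = len(aligned[0])
    # Keep a column only if no sequence has the trump character there.
    keep = [i for i in range(length) if all(seq[i] != trump for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]


# Reads covering the same region: '.' pads the ends, '-' is an internal gap.
same_region = ["..ACG-T..", "..ACGGT..", "...CGGT.."]

# Reads placed in different regions of the reference alignment
# (a stand-in for a 16S/18S mix): every column has a '.' in some read.
different_regions = ["ACGT.....", ".....ACGT"]

print(trump_filter(same_region))        # shared columns survive
print(trump_filter(different_regions))  # nothing survives
```

If 16S and 18S reads land in different parts of the reference alignment, every column contains a '.' in at least one read and nothing is left, which would be consistent with the alignment length of 0 reported here.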

Thanks for your help!

dwaite · May 28, 2015, 7:56pm

My advice would be to separate the 18S from the 16S sequences and proceed from there, either discarding the 16S data entirely or analysing it as a separate data set. I’ve had a surprising amount of success extracting 16S rRNA sequences from metagenomic data just by using the classify.seqs command. You could run your full data through classify.seqs using the SILVA database (which contains 16S and 18S sequences) and split your data based on the classification output. Alternatively, I read a paper recently (Jervis-Bardy et al. (2015) Microbiome 3:19) that used KRAKEN to screen their data for non-specific amplicons prior to 16S analysis.
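
The splitting step could be sketched roughly as below. This is a minimal Python sketch, not a tested pipeline. It assumes the classification output is a two-column, tab-separated taxonomy file (read name, then a semicolon-separated taxonomy string with confidence values in parentheses) and that the first rank carries the domain label, as in SILVA-style taxonomies. The function name split_by_domain, the read names and the taxonomy strings are hypothetical.

```python
from collections import defaultdict


def split_by_domain(taxonomy_lines):
    """Group read names by the first taxonomic rank of each taxonomy line.

    Assumed line format (an assumption, check your own output file):
        <read name>\t<Rank1(conf);Rank2(conf);...>
    """
    groups = defaultdict(list)
    for line in taxonomy_lines:
        line = line.strip()
        if not line:
            continue
        name, taxonomy = line.split("\t", 1)
        # Take the first rank and strip any "(confidence)" suffix.
        domain = taxonomy.split(";", 1)[0].split("(", 1)[0].strip()
        groups[domain or "unknown"].append(name)
    return dict(groups)


# Hypothetical example lines standing in for a real taxonomy file.
example = [
    "read1\tEukaryota(100);SAR(98);Alveolata(95);",
    "read2\tBacteria(100);Proteobacteria(99);",
    "read3\tEukaryota(97);Opisthokonta(90);",
    "read4\tunknown(100);",
]

groups = split_by_domain(example)
for domain, names in groups.items():
    print(domain, len(names))

# One read name per line, e.g. to save as a name list for the 18S subset.
eukaryote_names = "\n".join(groups.get("Eukaryota", []))
```

For a real file, the lines would come from something like open(path) instead of the example list. A list of names written one per line is the kind of input that mothur's get.seqs takes through its accnos option, so the eukaryote list could be used to pull the 18S reads out of the original fasta file.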

eXray · May 29, 2015, 8:09am

The Jervis-Bardy paper is quite interesting, but I will first follow your classify.seqs recommendation to extract the 18S reads only and try the OTU clustering on those reads, as they are of utmost interest. Thanks dwaite! But I’m still open to further advice.

pschloss · June 1, 2015, 1:20pm

I would analyze 16S and 18S rRNA gene sequences separately, as they really are two sets of questions: what’s going on with bacteria (and perhaps archaea), and what’s going on with eukaryotes.

I’m also not sure how well 2x75nt reads will fare in the analysis. I think it’s unlikely that these reads will overlap, which will only complicate things.
