Hi all!
First of all, I’m a real noob at NGS amplicon analysis and not a bioinformatician :roll:
Well, on to my problem. We amplified the 18S V9 region from wastewater samples using 18S primers (Euk1391f and EukBr, from Amaral-Zettler 2009) and a 2-step barcoding approach. As a quick test, we sequenced them on a HiSeq in High Output mode with a 75 bp single-end (SE) run to get about 2 million reads per sample. The sequencing worked quite well: the quality is OK and there are not many overrepresented sequences.
However, in a rather quick screening with the VAMPS software I found that almost half of the reads are bacterial contaminants and another quarter are unknown sequences…
If I understand correctly, I need aligned reads to do an OTU cluster analysis in mothur, which doesn’t make much sense with a mix of 16S and 18S sequences, right? When I align all reads in mothur against the SILVA seed database I do get an alignment, but after running filter.seqs to remove the “dots” from the alignment (trump=., vertical=F), the resulting alignment length is 0. Is this because the poorly aligned reads cause every position that contains a gap to be removed? So I’m stuck at this point in the analysis and need some advice on how to proceed to finally get the OTU cluster pie chart…
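To make the question concrete, here is a toy Python sketch of how I understand trump=. to behave: a column is dropped as soon as any single sequence has a "." in it. This is not mothur's code. The function name `trump_filter` and the tiny alignments are made up for illustration, and it assumes "." marks the positions outside the region a read covers.

```python
def trump_filter(alignment, trump="."):
    """Keep only the columns in which no sequence has the trump character.

    Toy model of my understanding of filter.seqs(trump=.); hypothetical
    helper, not mothur's implementation.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # A column survives only if every sequence has a non-trump character there.
    keep = [i for i in range(length)
            if all(seq[i] != trump for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Three made-up reads that cover roughly the same region of the alignment.
overlapping = [
    "..ACGTAC..",
    "...CGTACG.",
    "..ACGTA...",
]

# The same reads plus one made-up read that landed in a different region.
mixed = overlapping + ["TTG......."]

print(len(trump_filter(overlapping)[0]))  # 4 columns are shared by all reads
print(len(trump_filter(mixed)[0]))        # 0 columns are left
```

If this picture is right, a single read that aligns outside the region shared by the others is enough to leave no columns at all, which would match the alignment length of 0 that I see.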
Thanks for your help!