I am, surprise surprise, relatively new to 16S-based analysis beyond Sanger. My goal is to perform a PICRUSt analysis, and I seem to lose a lot of OTUs along the way.
So far I have followed the MiSeq SOP successfully, which resulted in a .biom file with around 3000 OTUs, of which only 700 occurred more than 6 times across my samples.
Now I am trying to map these OTUs to Greengenes using:
classify.seqs(fasta=current, count=current, taxonomy=gg_13_5_99.gg.tax, reference=gg_13_5_99.fasta)
classify.otu(taxonomy=current, count=current, list=current, label=0.03)
make.biom(shared=current, label=0.03, reftaxonomy=gg_13_5_99.gg.tax, constaxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.pick.pick.an.unique_list.0.03.cons.taxonomy, picrust=97_otu_map.txt)
The resulting .biom file has only 180 OTUs, of which 150 occur in more than 6 samples.
If I leave out the picrust option (picrust=97_otu_map.txt), I end up with numbers similar to before, but the OTU IDs do not match Greengenes.
From searching the forum I gathered that duplicate OTUs are removed for PICRUSt, but I doubt that this would account for ~2000 of them. Were these simply discarded because they are not mapped to Greengenes in 97_otu_map, because they are duplicates, or both?
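
To tell the two cases apart, a rough check outside mothur might help. The Python sketch below is only an illustration, not what make.biom does internally. It assumes each line of 97_otu_map.txt is tab-separated, with a 97% cluster ID in the first column followed by the member sequence IDs. It also assumes I can build a table of my OTU labels against the Greengenes reference ID each one resolves to; the `otu_to_ref` dictionary and the toy IDs are made up for the example. It counts how many OTUs have no entry in the map and how many collapse onto a cluster ID that another OTU already uses.

```python
from collections import Counter


def parse_otu_map(lines):
    """Map each member sequence ID to its cluster ID.

    Assumed layout (not verified against the real file): tab-separated,
    cluster ID first, member sequence IDs after it.
    """
    member_to_cluster = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        cluster_id = fields[0]
        for member in fields[1:]:
            member_to_cluster[member] = cluster_id
    return member_to_cluster


def summarize(otu_to_ref, member_to_cluster):
    """Count OTUs that are unmapped versus merged into a shared cluster ID."""
    unmapped = [otu for otu, ref in otu_to_ref.items()
                if ref not in member_to_cluster]
    targets = Counter(member_to_cluster[ref] for ref in otu_to_ref.values()
                      if ref in member_to_cluster)
    return {
        "input_otus": len(otu_to_ref),
        "unmapped": len(unmapped),
        "distinct_after_mapping": len(targets),
        # every extra OTU landing on an already-used cluster ID is a duplicate
        "merged_as_duplicates": sum(n - 1 for n in targets.values()),
    }


# Toy data only; the real input would be read from 97_otu_map.txt.
map_lines = ["100\t11\t12\t13\n", "200\t21\n"]
# Hypothetical table: my OTU label -> Greengenes reference ID it resolved to.
otu_to_ref = {"Otu001": "11", "Otu002": "12", "Otu003": "21", "Otu004": "99"}

report = summarize(otu_to_ref, parse_otu_map(map_lines))
print(report)
```

If the "unmapped" count dominates, most of the loss would come from OTUs with no counterpart in 97_otu_map; if "merged_as_duplicates" dominates, it would come from several OTUs sharing one Greengenes ID.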