I’d like to create a separate heatmap for each phylum I found. At an 80 % cutoff, a large share (32 %) of my OTUs are unclassified at the phylum level. These are nevertheless scattered throughout the tree (within phyla), or even make up large parts of it themselves.
(How) could I extract this phylogenetic data, including the unclassified sequences/OTUs?
If you run phylotype and pick label=6, then you’ll have the phylum level; with label=1, you’ll have the genus level. If something doesn’t classify, it is still included in the analysis. For example, if you want label=1 but something doesn’t classify to the genus level, it becomes an unclassified genus from that family and is counted as a phylotype. Hope this helps.
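To make the binning idea above concrete, here is a minimal Python sketch. It is not mothur's implementation; the function name `phylotype_bins`, the `depth` parameter, the `_unclassified` naming, and the toy lineages are all assumptions made for illustration. It groups sequences by their lineage cut at a chosen rank, and anything that stops classifying earlier is binned as an unclassified member of its last classified parent.

```python
from collections import defaultdict


def phylotype_bins(taxonomy, depth):
    """Group sequence names by lineage truncated to `depth` ranks.

    `taxonomy` maps a sequence name to a ';'-separated lineage.
    Ranks from the first 'unclassified' entry onward are treated as missing
    and replaced by '<last classified rank>_unclassified' (hypothetical naming).
    """
    bins = defaultdict(list)
    for name, lineage in taxonomy.items():
        ranks = []
        for rank in lineage.strip(";").split(";"):
            if rank == "unclassified":
                break  # everything below this rank is unknown
            ranks.append(rank)
        kept = ranks[:depth]
        if len(kept) < depth:
            parent = kept[-1] if kept else "root"
            kept += [parent + "_unclassified"] * (depth - len(kept))
        bins[";".join(kept)].append(name)
    return dict(bins)


# Toy data, invented for this example (domain;phylum;class).
toy_taxonomy = {
    "seq1": "Bacteria;Planctomycetes;Planctomycetacia",
    "seq2": "Bacteria;Planctomycetes;unclassified",
    "seq3": "Bacteria;unclassified;unclassified",
    "seq4": "Bacteria;unclassified;unclassified",
}

phylum_bins = phylotype_bins(toy_taxonomy, depth=2)
for lineage, members in sorted(phylum_bins.items()):
    print(lineage, members)
```

In this sketch, every sequence that fails to classify at the phylum rank ends up in the same `Bacteria;Bacteria_unclassified` bin, whatever its position in a tree, because the grouping only looks at the taxonomy string.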
But unclassifieds at the phylum level will not be distinguishable from each other, right? That’s what I’m looking for. They are scattered throughout the tree, clustered within different phyla, and some branches even consist solely of unclassifieds, i.e. potential new taxa. I’d like to extract these phyla including these unclassified sequences.
E.g. the blue branch is Planctomycetes with some unclassifieds; the greenish one is a branch of only unclassifieds.
(As an aside, how could I check whether these are really biologically relevant and not bad sequences? Would they cluster together so obviously if they really were bad sequences?)