classification

Hi mothur forum,

Is there a way to further group the sequences in an OTU
that is defined by taxonomy (Bacteria, unclassified)?
I have the representative sequence but would like to know
more about this big group of sequences that Silva v128
barely recognises.

Sigmund

Try blasting the rep seq against Ref_seq.

Thanks, I did that (the closest hit was 89%) and could do it
for all of them, but it would be easier if the
ca. 6000 sequences could somehow be split into
smaller groups so that I could blast a representative
from each subcluster. The sequences are from different
regions of the 16S gene, so I don’t think they will
align well. Clustering by phylotype returned that
one big OTU classified as Bacteria, and I suspect
it includes a lot of different sequences. Any further
comments on how to cluster unaligned, unclassified
16S sequences would be helpful.
Sigmund

What are your samples? For things like soil, there likely won’t be many closer relatives in ref_seq. You just have to embrace the unknown.

If you really want to try, you can cluster the unknowns into higher-level OTUs (say 5% rather than 3%) and blast those reps.

Thanks again for the comment.
The samples are from coral. That OTU is just the unknown Bacteria
at label 1. I could try, for example, label 3 with the phylotype
command and see what happens.

Label 3 will be unknown as well.

I am searching for a method that groups 16S sequences
without using an alignment or the taxonomy.

Why don’t you want to align them?

They do not all overlap in the same region because they come from
different primer sets. I would like to compare sequences from
different studies, downloaded from NCBI.

Ah, then you are stuck with taxonomy. You could use the approach that QIIME uses, where they clustered a database, then match seqs to it and report the database sequence. But if your sequences aren’t in a database, that approach is out.
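
For anyone wondering what "match seqs to a clustered database and report the database sequence" could look like, here is a rough Python sketch. It is not what QIIME or mothur actually run: the shared k-mer fraction is a simple stand-in for a real similarity search, and the reference IDs, sequences, k = 8 and the 0.9 cutoff are all made up for illustration. The point it shows is that two reads from non-overlapping regions of the same full-length reference end up with the same label, and a read with no match in the database stays unassigned.

```python
from collections import defaultdict

def kmers(seq, k=8):
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(references, k=8):
    """Map each k-mer to the set of reference IDs that contain it."""
    index = defaultdict(set)
    for ref_id, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(ref_id)
    return index

def assign(query, index, k=8, min_fraction=0.9):
    """Return the reference ID sharing the most k-mers with the query,
    or None if the best reference covers less than min_fraction of them."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return None
    hits = defaultdict(int)
    for kmer in query_kmers:
        for ref_id in index.get(kmer, ()):
            hits[ref_id] += 1
    if not hits:
        return None
    # Sort IDs first so that ties are broken the same way every time.
    best_id = max(sorted(hits), key=hits.get)
    if hits[best_id] / len(query_kmers) >= min_fraction:
        return best_id
    return None

# Hypothetical stand-ins for pre-clustered, full-length reference sequences.
REFS = {
    "ref_A": "ACGTTGCAAGGCTTAACCGGATCGATTACGCTAGGCTAACGTTAGC",
    "ref_B": "TTGACCGATAGGCTCCAATGGCGTATCCGGAATTCGCATGACCTAG",
}
index = build_index(REFS)

# Reads from different studies: two non-overlapping regions of ref_A,
# one region of ref_B, and one read that is not in the database at all.
reads = {
    "study1_read": REFS["ref_A"][0:20],
    "study2_read": REFS["ref_A"][25:45],
    "study3_read": REFS["ref_B"][5:30],
    "novel_read": "GGGGGGGGCCCCCCCCAAAAAAAATTTT",
}
for name, seq in reads.items():
    print(name, assign(seq, index))
```

The last read is the situation described above: if the sequences are not in the database, this kind of matching has nothing to report for them.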