Hi, I generated a representative sequence for each OTU and used all of these rep sequences to build a BLAST database.
When I randomly picked several rep sequences and BLASTed them against this database, the other rep sequences returned as hits were extremely similar to the query sequence.
For example, my rep sequences come from the 0.03 cutoff, but when I BLAST one particular rep sequence, it finds several other rep sequences with 99% identity to it.
I know this problem is caused by the clustering, but the average neighbor method is already the best available method to use.
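To illustrate what I think is happening, here is a toy sketch in Python with made-up distances (not my data; the sequence names and the `average_linkage` helper are hypothetical). With average neighbor clustering, two OTUs stay separate as long as the average distance between their members is above the cutoff, so a single pair across the two OTUs can still be much closer than 0.03:

```python
from itertools import product

# Made-up pairwise distances between four sequences (illustration only).
# a1 and a2 belong to OTU A; b1 and b2 belong to OTU B.
dist = {
    ("a1", "b1"): 0.01,
    ("a1", "b2"): 0.04,
    ("a2", "b1"): 0.04,
    ("a2", "b2"): 0.05,
}

def average_linkage(cluster_a, cluster_b, dist):
    """Mean of all pairwise distances between members of two clusters."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(dist[p] for p in pairs) / len(pairs)

otu_a = ["a1", "a2"]
otu_b = ["b1", "b2"]

avg = average_linkage(otu_a, otu_b, dist)
closest = min(dist.values())

cutoff = 0.03
print(f"average distance between OTUs: {avg:.3f}")
print(f"closest pair across OTUs:      {closest:.3f}")
print(f"OTUs merged at cutoff {cutoff}? {avg <= cutoff}")
```

In this toy case the two OTUs are not merged at 0.03, yet if a1 and b1 happened to be picked as the rep sequences, they would be 99% identical to each other.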
I learned from the manual that you can classify the OTUs and combine those classified as the same genus, for example. But the reference database is limited, so I can't get all of my sequences classified at the genus level for the comparison. That's actually why we chose the OTU-based analysis.
Is there any way to bypass this issue, or any strategy to deal with it?
My ultimate goal is to design OTU-specific primers, but with such high similarity between rep sequences, I cannot design any specific primers.
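To show what I mean by "no room for specific primers", here is a minimal Python sketch of the check I have in mind. It assumes the two rep sequences are already aligned to the same length; the sequences and the `mismatch_positions` helper are made up for illustration and are not from my data set:

```python
def mismatch_positions(seq_a, seq_b):
    """Return 0-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]

# Hypothetical aligned fragments of two rep sequences (illustration only).
target = "ACGTACGTAC"
other = "ACGTTCGTAC"

diffs = mismatch_positions(target, other)
identity = 100.0 * (len(target) - len(diffs)) / len(target)

print(f"differing positions: {diffs}")
print(f"identity: {identity:.1f}%")
```

With 99% identity over a full-length rep sequence, a list like `diffs` contains only a handful of positions, and a primer can only discriminate between the two OTUs if it covers at least one of them.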