sub.sample before OTU clustering?

Hi all,

I am working on MiSeq data following the MiSeq SOP. I ended up with too many unique sequences to use the traditional dist.seqs and cluster approach. I am using cluster.split right now, but I don’t think it will work: it has been running for days and still has not finished. My last resort is the phylotype-based approach, which I do not want to use until I have tried everything else and failed. In the MiSeq SOP, there is a sub.sample step after OTU clustering and right before the alpha diversity analysis. So my question is: can I run sub.sample before calculating the distance matrix and picking OTUs? Is it logical to do this? I know this will reduce the total number of sequences, but will it also reduce the number of unique sequences that go into the distance matrix and OTU picking? Thanks in advance.


I guess you could try this, but it really isn’t desirable. When you run cluster.split you might try using taxlevel=5 (or 6) and use classic=T. This might speed things up a little.
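
On the question of whether subsampling reads would also cut the number of unique sequences: a toy simulation suggests it generally would, mostly by dropping rare sequences. The sketch below is plain Python, not mothur's sub.sample implementation; the read labels, abundances, and the helper names subsample_reads and count_unique are made up for illustration.

```python
import random

def subsample_reads(reads, size, seed=0):
    """Draw `size` reads at random without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(reads, size)

def count_unique(reads):
    """Number of distinct sequences among the reads."""
    return len(set(reads))

# Hypothetical data set: two abundant sequences plus 200 singletons.
reads = ["seqA"] * 500 + ["seqB"] * 300 + [f"rare{i}" for i in range(200)]

sub = subsample_reads(reads, 100)

print("reads before:", len(reads), "unique before:", count_unique(reads))
print("reads after: ", len(sub), "unique after: ", count_unique(sub))
```

Because each singleton has only a small chance of being drawn, most of them disappear from the subsample, so the distance matrix would be smaller. The sequences lost this way are mostly the rare ones.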


Thanks a lot, Pat! I think I’m going to try setting taxlevel=5 with cutoff=0.05, or taxlevel=6 with cutoff=0.03, and see how they work.

Deng Pan