Cluster/dist.seqs on filtered/unfiltered seqs = same OTU count?

I have a very simple question: after using the filter.seqs command with the vertical=true and trump=. options, my sequences end up shortened. Does this affect the distance matrix calculation (and the clustering into OTUs based on it)? They used to be ~400 bp and are now around ~250 bp, i.e. a 10 bp difference between two sequences used to be a 2.5% difference and would now be 4%, unless the identical columns removed by the filter are somehow still counted by dist.seqs, too.

Are they? And if not, what should one preferably do? When I filter my sequences and leave out the trump=. step, pre.cluster takes forever.

Best regards,


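I suspect the 10 nt difference over 400 nt is now about 6 nt over 250 nt, since some of those differences sat in the columns that were trimmed away, so the percent difference should stay roughly the same. If you are using 454 data to do this, then you may want to alter the settings in screen.seqs to make sure you get longer fragments.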
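
To put rough numbers on that estimate, here is a small Python sketch of the arithmetic. It is not how dist.seqs computes distances; it only assumes the mismatches are spread evenly along the alignment, which may not hold for real data. The function names are made up for illustration.

```python
def percent_difference(mismatches, length):
    """Percent of compared positions that differ."""
    return 100.0 * mismatches / length


def scaled_mismatches(mismatches, old_len, new_len):
    """Expected mismatches left after trimming.

    Assumption: mismatches are spread evenly along the alignment,
    so trimming removes a proportional share of them.
    """
    return mismatches * new_len / old_len


old_len, new_len, mismatches = 400, 250, 10

# Before filtering: 10 differences over 400 columns.
before = percent_difference(mismatches, old_len)

# Worst case from the question: all 10 differences survive trimming.
worst_case = percent_difference(mismatches, new_len)

# Estimate from the reply: differences shrink with the alignment.
expected = scaled_mismatches(mismatches, old_len, new_len)
after = percent_difference(expected, new_len)

print(f"before filtering:        {before:.2f}%")
print(f"if all 10 nt survived:   {worst_case:.2f}%")
print(f"expected mismatches:     {expected:.2f} nt")
print(f"expected after trimming: {after:.2f}%")
```

Under that even-spread assumption the percent difference is unchanged by the trimming. The 4% figure only applies if every one of the 10 differences happens to fall inside the 250 columns that were kept.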