Hi, I just wanted to confirm with someone that my understanding of the pre.cluster step is correct:
Performing the pre.cluster step doesn’t actually remove or change any sequences; it just “preclusters” each one with a unique sequence, so that subsequent processing treats it as 100% identical to the unique sequence it was merged into. If you go back and look at the DNA sequences in the fasta file, though, the sequences are still different.
The reason I ask is that when I looked at the DNA sequences of some reads from the same OTU (~100 bp, called at 97%), I saw some pairs with 4-5 mismatches, which is more than the roughly 3 I would expect at that cutoff. I think this is most likely due to how the pre.cluster step was carried out, but I wanted to confirm that my understanding of the step is correct.
FYI, I set diffs=1 for pre.cluster.
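To make the mismatch count concrete, here is a minimal Python sketch of the kind of pairwise comparison described above. It assumes the two sequences are already aligned to the same length and counts every differing position (including gaps) as one mismatch, which is not necessarily how mothur calculates distances. The function names and the example sequences are hypothetical, made up for illustration only.

```python
def count_mismatches(seq_a: str, seq_b: str) -> int:
    """Count positions at which two equal-length, aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are identical."""
    length = len(seq_a)
    return 100.0 * (length - count_mismatches(seq_a, seq_b)) / length


if __name__ == "__main__":
    # Hypothetical 100 bp reference sequence (not real data).
    ref = "ACGT" * 25
    # Introduce 4 substitutions at arbitrary positions.
    other = list(ref)
    for pos in (3, 20, 41, 77):
        other[pos] = "C" if other[pos] == "A" else "A"
    other = "".join(other)

    print(count_mismatches(ref, other))   # 4
    print(percent_identity(ref, other))   # 96.0
```

With 4 mismatches over 100 bp the pair is 96% identical, which is the situation I am seeing between some members of the same 97% OTU.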
Thank you so much!