I’m trying to re-analyse some open-access data. The authors used V1-V4 primers (I know, this isn’t the best approach at all!) and the DNA was tagmented before sequencing. This means that the sequence lengths and the start and end positions are highly variable after alignment. I wanted to use cluster with the vsearch algorithm on the fasta and count files, but I was wondering: might that artificially inflate the number of OTUs for tagmented DNA?
If so, are there any good OTU-based alternative approaches to this analysis I could use, or will only something phylotype-based be suitable?
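
To make the concern concrete, here is a rough Python sketch of the overlap problem. The alignment coordinates are made up for illustration (they are not from the dataset), and `overlap_length`, `fraction_comparable` and the `min_overlap` cutoff are hypothetical names and values, not part of any tool. It only counts how many read pairs share enough alignment columns to be compared at all; pairs with little or no overlap are the ones I suspect could end up split into separate OTUs.

```python
from itertools import combinations


def overlap_length(a, b):
    """Number of alignment columns shared by two inclusive (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def fraction_comparable(reads, min_overlap):
    """Fraction of read pairs that share at least `min_overlap` alignment columns."""
    pairs = list(combinations(reads, 2))
    if not pairs:
        return 0.0
    comparable = sum(1 for a, b in pairs if overlap_length(a, b) >= min_overlap)
    return comparable / len(pairs)


# Made-up (start, end) alignment positions for four tagmented reads.
reads = [(1, 250), (100, 350), (300, 550), (500, 750)]

# Share of pairs with at least 100 shared columns (an arbitrary, assumed cutoff).
print(fraction_comparable(reads, min_overlap=100))
```

With real data, the start and end positions would come from whatever per-sequence summary the alignment step reports, rather than from a hand-written list.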