loss of diversity due to homogenization of sequences

I ran unique.seqs before aligning, screening and filtering a concatenated fasta file in order to build a phylogeny. RAxML, which I used, flagged a bunch of sequences as identical to other sequences in the file. I didn’t anticipate this, since I had already run unique.seqs. It is worth mentioning that I use clone library sequences, where each sequence corresponds to an individual GenBank entry… I guess that after the screening and filtering, some of the remaining overlapping sequences had been truncated at the same positions, making them identical.

…I’m just thinking about the loss of true diversity due to this homogenization… But I guess it’s the price to pay if you want a somewhat accurate phylogeny.

You should run unique.seqs() after every filter.seqs() that can truncate ends. I think several steps of the SOP assume you have uniques only.
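
To see why the duplicates reappear, here is a toy Python sketch. It is not mothur's code; the clone names, sequences, and the trimming window are made up for illustration. Four aligned sequences that are all distinct at full length collapse into two once the ends are trimmed, which is the situation RAxML was complaining about. Grouping names by sequence afterwards keeps a record of which clones were merged, so the clone identities themselves are not thrown away.

```python
from collections import defaultdict


def truncate(aligned, start, end):
    """Keep only alignment columns [start, end) for every sequence."""
    return {name: seq[start:end] for name, seq in aligned.items()}


def dereplicate(seqs):
    """Group sequence names by identical sequence string."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq].append(name)
    return dict(groups)


# Hypothetical aligned clones: they differ only at the first or last
# column, except cloneD, which differs in the middle.
aligned = {
    "cloneA": "ACGTACGTAA",
    "cloneB": "ACGTACGTAC",
    "cloneC": "TCGTACGTAA",
    "cloneD": "ACGTTCGTAA",
}

before = dereplicate(aligned)                 # all four are unique
after = dereplicate(truncate(aligned, 1, 9))  # ends trimmed away

print(len(before), "unique sequences before trimming")
print(len(after), "unique sequences after trimming")
for seq, names in after.items():
    print(seq, names)
```

Only cloneD stays separate, because its difference lies inside the retained window; the other three differ only in columns that the trimming removed.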