Representative seqs in name and count file

I was wondering how the ‘representative’ seqs in the name or count file are picked?

I ask because I ran into the following issue.
Let’s say we have group_A, which contains seq_a1 and seq_a2 (as shown in the group file), but neither sequence is one of the ‘representative’ seqs. Then in the subsequent count file, the sequence counts under group_A are all 0, because only the ‘representative’ seqs are included in the count file. That misled me into thinking there were no seqs in group_A.

Maybe I missed something?

Thanks!

If you’re using the abundance approach, the representative is the most abundant sequence in the OTU. If you’re using the distance approach, it is the sequence closest to the centroid of the OTU. In our experience, the distance-based approach is not practical with today’s datasets, but the abundance-based approach comes pretty close.
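
To make the abundance approach concrete, here is a rough Python sketch of the idea. It is not mothur's actual code: the function name, the OTU labels, and the abundance values are all made up for illustration, and the alphabetical tie-break is an assumption of this sketch.

```python
def pick_representative(otu_members, abundance):
    """Return the most abundant sequence name among otu_members.

    otu_members: list of sequence names assigned to one OTU (hypothetical data).
    abundance:   dict mapping sequence name -> total count; missing names count as 0.
    Ties go to the alphabetically first name (an assumption made here so the
    result is deterministic).
    """
    # sorted() fixes the iteration order; max() keeps the first maximum it sees.
    return max(sorted(otu_members), key=lambda name: abundance.get(name, 0))


# Made-up example data: two OTUs and the total count of each unique sequence.
abundance = {"seq_a1": 3, "seq_a2": 7, "seq_b1": 7}
otus = {"Otu1": ["seq_a1", "seq_a2"], "Otu2": ["seq_b1"]}

reps = {otu: pick_representative(members, abundance) for otu, members in otus.items()}
print(reps)
```

With these made-up numbers, seq_a2 is picked for Otu1 because it has the larger count. The distance approach would need pairwise distances between all sequences in each OTU, which is part of why it gets expensive on large datasets.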


