Reduction in number of sequences

Hi community!!! I am working with 16S sequence data in MOTHUR.

  • Initially I had 526345 sequences. After all the “screen.seqs” and “remove.seqs” steps, when I was heading into the “classify.seqs” step, I had 406528 sequences. The rest were removed during the quality screening and chimera removal steps.

  • Initially, # of unique seqs: 172219 and total # of seqs: 526345

  • After preclustering, # of unique seqs: 94052 and total # of seqs: 526345

  • After chimera removal, # of unique seqs: 50148 and total # of seqs: 406528

Is such a reduction in sequence numbers normal, or am I doing something wrong?

Losing 20% of your sequences through quality checking is completely normal.
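
If it helps to see the arithmetic, here is a minimal Python sketch that works out the losses from the counts quoted in the question. The function name `percent_lost` is just something made up for this illustration; it is not part of mothur.

```python
def percent_lost(before: int, after: int) -> float:
    """Return the percentage of sequences lost going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Counts quoted in the question above.
total_start = 526345
total_after_chimera = 406528
unique_start = 172219
unique_after_precluster = 94052

# Share of all reads removed by quality screening plus chimera removal.
loss_total = percent_lost(total_start, total_after_chimera)

# Share of unique sequences merged away by preclustering
# (the total number of reads is unchanged at that step).
loss_unique = percent_lost(unique_start, unique_after_precluster)

print(f"total reads lost: {loss_total:.1f}%")            # 22.8%
print(f"unique sequences merged: {loss_unique:.1f}%")    # 45.4%
```

The same function can be applied to any pair of before and after counts from a run.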


  • On a separate set of sequences I started with 671453 sequences. Before chimera removal, I had 605417 sequences. After chimera removal, the number of sequences dropped from 605417 to 450840. That implies I have lost about 33% of the sequences in total, and around 26% of the 605417 sequences were lost during chimera removal. Is that normal as well?

  • Also, during the remove.lineage step, 260 sequences were removed from the fasta file and 1573 sequences from the count file. Could you please comment on that as well?

It is normal.

The fasta file only has unique sequences (so 260 unique sequences were flagged). The count file gives the total number of reads, so those 260 unique sequences account for 1573 reads.
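
To illustrate the difference between the two numbers, here is a rough Python sketch that counts unique sequences and total reads from a count table. It assumes the plain tab-delimited layout with a `Representative_Sequence` header column followed by a `total` column; the sequence names and counts below are invented toy data, and `summarize_count_table` is a hypothetical helper, not a mothur command.

```python
import io


def summarize_count_table(handle):
    """Return (unique sequences, total reads) from a tab-delimited count table.

    Assumes column 1 is the sequence name and column 2 is its total read count.
    """
    unique = 0
    total = 0
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if fields[0] == "Representative_Sequence":
            continue  # skip the header row
        unique += 1              # one row per unique sequence (as in the fasta)
        total += int(fields[1])  # reads represented by that unique sequence
    return unique, total


# Toy data: 3 unique sequences standing in for 14 reads.
example = io.StringIO(
    "Representative_Sequence\ttotal\n"
    "seqA\t10\n"
    "seqB\t3\n"
    "seqC\t1\n"
)
unique, total = summarize_count_table(example)
print(unique, total)  # 3 14
```

In the toy table, removing all three rows would drop 3 sequences from the fasta but 14 reads from the count file, which is the same kind of gap as 260 versus 1573.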