filter.seqs removing sequences from the fasta file?


I am processing a downsampled dataset to find the best parameters for my analysis.
After running screen.seqs I ended up with:

# of unique seqs: 8794

total # of seqs: 10129

Then, I ran
mothur > filter.seqs(fasta=454downsample.shhh.trim.unique.good.align, vertical=T, trump=., processors=2)

and the output was:
Length of filtered alignment: 1219
Number of columns removed: 48781
Length of the original alignment: 50000
Number of sequences used to construct filter: 7372

Can someone explain why filter.seqs started with 8749 unique seqs but ended up with a fasta file of 7372 seqs? I thought filter.seqs would only remove columns of shared gaps and missing data, not sequences!


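I suspect the counts you posted came from running summary.seqs on 454downsample.shhh.trim.unique.align and not 454downsample.shhh.trim.unique.good.align. filter.seqs only removes columns, not sequences, so the 7372 it reports should simply be the number of sequences in the .good.align file you gave it.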
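
One way to check this outside mothur is to count the header lines in both alignment files and see which one matches each number. Below is a minimal Python sketch, assuming both files from this thread sit in the working directory; the helper name count_fasta_records is made up for illustration and is not part of mothur.

```python
import os


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    # File names are the ones mentioned in this thread; adjust to your own run.
    for name in ("454downsample.shhh.trim.unique.align",
                 "454downsample.shhh.trim.unique.good.align"):
        if os.path.exists(name):
            print(name, count_fasta_records(name))
        else:
            print(name, "not found")
```

If the .good.align count matches the number reported by filter.seqs, then no sequences were dropped at the filtering step, and the larger number belongs to the file from before screen.seqs.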