Hi,
I am processing a downsampled dataset to find the best parameters for my analysis.
After running screen.seqs I ended up with:
# of unique seqs: 8794
total # of seqs: 10129
Then, I ran
mothur > filter.seqs(fasta=454downsample.shhh.trim.unique.good.align, vertical=T, trump=., processors=2)
and the output was:
Length of filtered alignment: 1219
Number of columns removed: 48781
Length of the original alignment: 50000
Number of sequences used to construct filter: 7372
Can someone explain why filter.seqs started with 8794 unique seqs but ended up with a fasta file containing 7372 seqs? I thought that filter.seqs would only remove columns of common gaps and missing data, not sequences!
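To check whether sequences were really dropped, or whether 7372 is only the number used to build the filter, the records in the input and output files could be counted directly. Below is a minimal Python sketch, assuming both files are plain FASTA with one ">" header line per sequence. The function name count_fasta_records and the script name count_seqs.py are made up for illustration and are not part of mothur.

```python
import sys


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


if __name__ == "__main__":
    # Pass the input alignment and the filtered output as arguments;
    # the paths are whatever your own mothur run produced.
    for fasta_path in sys.argv[1:]:
        print(fasta_path, count_fasta_records(fasta_path))
```

It could be run as `python count_seqs.py 454downsample.shhh.trim.unique.good.align` followed by the path of the filtered fasta that filter.seqs reported. If both counts match, no sequences were removed by the filtering step itself.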
Thanks!