mothur > align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.bacteria.pcr.fasta)
produced the following message:
[WARNING]: Some of your sequences generated alignments that eliminated too many bases, a list is provided in …flip.accnos. If you set the flip parameter to true mothur will try aligning the reverse compliment as well.
The flip.accnos file has 32794 entries.
I re-ran align.seqs on the same files with flip=t added.
I got the same result, and the new flip.accnos has the same 32794 entries in it. I went ahead and ran
mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
And got
             Start    End      Nbases   Ambigs  Polymer  NumSeqs
Min          0        0        0        0       1        1
2.5%-tile    2        17012    42       0       4        100276
25%-tile     2        17012    425      0       4        1002754
Median       2        17012    425      0       4        2005507
75%-tile     2        17012    425      0       4        3008260
97.5%-tile   16047    17012    425      0       6        3910737
Max          17012    17012    431      0       142      4011012
Mean         555.096  16926.3  405.843  0       4.23103
unique seq   1315681
#seqs        4011012
I don’t know if this looks ok or not. Do I just ignore the 32794 sequences that were unalignable? Reverse complementing didn’t solve the problem. Is this normal? (As an aside, in interpreting the above table, in the max row, how can sequences start at 17012, end at 17012, yet have 431 Nbases?)
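For scale, here is a rough sketch of what share of the unique sequences the flagged entries represent. This is my own arithmetic in Python, not mothur output. It assumes each entry in flip.accnos names one unique sequence, and the function name `flagged_fraction` is just made up for illustration.

```python
def flagged_fraction(flagged: int, total_unique: int) -> float:
    """Return the fraction of unique sequences listed in the accnos file."""
    if total_unique <= 0:
        raise ValueError("total_unique must be positive")
    return flagged / total_unique


if __name__ == "__main__":
    # Numbers taken from the post: 32794 flip.accnos entries,
    # 1315681 unique sequences reported by summary.seqs.
    share = flagged_fraction(32794, 1315681)
    print(f"{share:.2%} of unique sequences were flagged")  # prints 2.49% ...
```

So the flagged set is roughly 2.5% of the unique sequences, if the one-entry-per-unique-sequence assumption holds.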