The period character should not appear mid-sequence, right?

I have used mothur to align 16S fragments to a 16S database and construct a consensus sequence. In a few cases, the consensus sequence contains one or more “.” characters mid-sequence. This does not jibe with my understanding of what the period character denotes. Can anyone explain?


If the dominant base call in that position is a ‘.’ (i.e. missing data), this would make sense when your sequences don’t fully overlap with each other. Perhaps we need to re-evaluate how we represent the consensus sequence data.
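To illustrate the idea, here is a minimal Python sketch of a per-column majority vote. It is not mothur's actual consensus.seqs code, and the function name and toy reads are made up for the example. It only shows how a ‘.’ can land mid-sequence when most reads do not cover a region:

```python
from collections import Counter


def majority_consensus(aligned):
    """Return the most frequent character in each alignment column.

    Hypothetical helper for illustration only; mothur's real
    consensus.seqs logic may differ.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned):
        # Pick the single most common character in this column,
        # counting '.' (missing data) like any other character.
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


# Toy alignment (made-up data): two reads cover only the left end,
# two cover only the right end, and one spans the whole region.
reads = [
    "ACGT........",
    "ACGT........",
    "........ACGT",
    "........ACGT",
    "ACGTTTGGACGT",
]

print(majority_consensus(reads))
```

In the middle columns only one of the five reads has a base, so ‘.’ wins the vote there even though both flanks are well covered.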


I see now how a period is different from a gap (hyphen) or an N. I don’t know of another convention for this. I think NCBI uses a run of 100 Ns to denote a somewhat similar idea (a gap of unknown length in a scaffold). I’ve edited the consensus.seqs page to (hopefully) make this clearer.