align.seqs shortens some pyrosequences

igeorge · April 22, 2013, 1:10pm

Hi Pat,
I am using mothur to analyze microbial diversity in mangrove sediment samples. I noticed that the align.seqs command considerably shortens some sequences (a few thousand out of 106,000).
For example, in my all.trim.unique.fasta file, one sequence is AGTCCGGCTACCCATCAGAGCCTTGGTGAGCCGTTACCTCACCAACAAGCTAATAGGACATAGGCCGCTCCCCGGGCAGAGGGTTGCCCGACCGTTTACACTTCGGAAGATGCCATCCGAGGTGACCATCCGGTATTACCTGCCGTTTCCAGCAGCTATTCCGGTCCCGAGGGTACGTTGCCTATGTATTACTCACCCTTTCGCCGCTCTCCAGCACCCCGAAGGATGCCTTCGCGCTCGACTTGCATGCCTAAACCACGCCGCCAGCGTTCACT
(275 bp in total)
After alignment against the SILVA database (downloaded from the mothur website):

align.seqs(candidate=all.trim.unique.fasta, template=silva.bacteria.fasta, flip=t, processors=2)

the same sequence looks like this in the all.trim.unique.align file (12 bp in total):
…CCAGCGTTCACT…

What is happening here? Should I use another reference database?
Thanks for your help,
Isabelle

pschloss · April 23, 2013, 12:30pm

I suspect these are mostly garbage sequences. If you take that sequence and BLAST it against the GenBank nt database (excluding uncultured sequences), you’ll get a bunch of awful alignments. The top match is only 76% identical to some members of the Planctomycetales, and only at the 5’ end of your sequence. While it’s entirely possible that you have a new domain of life, given your alignment, it is unlikely. Also, when I BLASTed the last 200 bp of your sequence, there were no significant matches. I’d remove these bad aligners using screen.seqs and move on without worrying.
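To see how many sequences are affected before screening, one option is to count the non-gap bases left in each aligned sequence. The following is only a rough sketch, not part of mothur: the function names and the 50-base cutoff are hypothetical, and it assumes the .align file is ordinary FASTA with "." and "-" as gap characters. mothur's own summary.seqs command reports similar per-sequence information.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: "." and "-" are the only gap/padding characters in the alignment.
GAP_CHARS = set(".-")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def aligned_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name to its number of non-gap characters."""
    return {
        name: sum(1 for base in seq if base not in GAP_CHARS)
        for name, seq in read_fasta(lines)
    }


def short_aligners(lines: Iterable[str], min_bases: int = 50) -> List[str]:
    """Names of sequences with fewer than min_bases aligned bases.

    The default of 50 is an arbitrary illustration; pick a cutoff
    that suits your read lengths.
    """
    return [n for n, count in aligned_lengths(lines).items() if count < min_bases]


if __name__ == "__main__":
    # File name taken from the post above; adjust to your own output.
    with open("all.trim.unique.align") as handle:
        for seq_name in short_aligners(handle):
            print(seq_name)
```

Sequences flagged this way are the candidates that screen.seqs would then remove from the alignment.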

Pat
