Hello,
I used mothur to cluster my data and to extract representative sequences for distance=0.05.
After a BLAST search on the NCBI website, I found that some representative sequences have the same best hit.
I aligned my representative sequences and calculated pairwise distances between them.
I found that many pairs of representative sequences have distances far below 0.05. For example:
GCFD4AW02IZARV|45|83 GCFD4AW02G509M|63|20 0.006061
GCFD4AW02IZARV|45|83 GCFD4AW02F322M|40|24 0.006061
GCFD4AW02IZARV|45|83 GCFD4AW02GBM51|52|49 0.006061
GCFD4AW02IZARV|45|83 GCFD4AW02HED3N|58|52 0.006024
GCFD4AW02IZARV|45|83 GCFD4AW02I1DPU|42|59 0.006024
GCFD4AW02IZARV|45|83 GCFD4AW02IYGDP|28|80 0.006061
How is it possible that these sequences are in different OTUs?
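The check described above (listing representative pairs whose distance falls below the clustering cutoff) can be sketched in Python roughly as follows. This is only an illustration, not the script that produced the lines above. It assumes a whitespace-separated, three-column distance listing (name1 name2 distance), and the function name pairs_below_cutoff is hypothetical.

```python
def pairs_below_cutoff(lines, cutoff=0.05):
    """Return (name_a, name_b, distance) tuples whose distance is below cutoff.

    Assumes each line has exactly three whitespace-separated fields:
    two sequence names followed by a pairwise distance.
    """
    hits = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name_a, name_b = fields[0], fields[1]
        dist = float(fields[2])
        if dist < cutoff:
            hits.append((name_a, name_b, dist))
    return hits


# Two of the pairs quoted above, used as sample input.
example = [
    "GCFD4AW02IZARV|45|83 GCFD4AW02G509M|63|20 0.006061",
    "GCFD4AW02IZARV|45|83 GCFD4AW02HED3N|58|52 0.006024",
]

for name_a, name_b, dist in pairs_below_cutoff(example):
    print(name_a, name_b, dist)
```

To run it on a real file, the lines could be read with open() and passed to the function in place of the sample list.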
Here is the summary of my data after running the align.seqs, screen.seqs and filter.seqs commands:
mothur > summary.seqs(fasta=1810Ubac.good.filter.fasta)
              Start  End  NBases  Ambigs  Polymer
Minimum:      1      412  145     0       3
2.5%-tile:    82     412  165     0       3
25%-tile:     82     412  165     0       4
Median:       82     412  190     0       4
75%-tile:     82     412  191     0       5
97.5%-tile:   82     414  193     0       6
Maximum:      82     492  220     1       9
# of Seqs:    6817
Then I ran the following commands:
dist.seqs(fasta=1810Ubac.good.filter.fasta, calc=onegap, countends=F, cutoff=0.10, output=lt, processors=3)
read.dist(phylip=1810Ubac.good.filter.phylip.dist, cutoff=0.10)
cluster()
read.otu(sabund=1810Ubac.good.filter.phylip.fn.sabund)
get.oturep(phylip=1810Ubac.good.filter.phylip.dist, fasta=1810Ubac.fas, sorted=size, list=1810Ubac.good.filter.phylip.fn.list, label=0.05)
I am using mothur v.1.11.0.
Thank you.