We’re trying to process 18S sequences (V1-V2 regions of the 18S rRNA gene, Primer Euk82F and Euk516F) with Mothur, version 1.48.0.
After alignment on the Silva.nr_v138_2.align database, we get too short sequences with 95% title containing 450 bp (the maxlenght given in screen.seq) before alignment and 25% tile containing 10 bp.
mothur > summary.seqs(fasta=18s.trim.contigs.fasta, count=18s.contigs.count_table)
Using 40 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 282 282 0 3 1
2.5%-tile: 1 399 399 1 4 86005
25%-tile: 1 445 445 6 4 860046
Median: 1 446 446 8 5 1720091
75%-tile: 1 450 450 11 6 2580136
97.5%-tile: 1 454 454 20 7 3354177
Maximum: 1 567 567 78 264 3440181
Mean: 1 444 444 8 5
# of unique seqs: 3440181
total # of seqs: 3440181
It took 112 secs to summarize 3440181 sequences.
Output File Names:
18s.trim.contigs.summary
mothur > screen.seqs(fasta=18s.trim.contigs.fasta, count=18s.contigs.count_table,summary=18s.trim.contigs.summary, maxambig=0, maxlength=450, maxhomop=8)
mothur > summary.seqs(count=current)
Using 18s.contigs.good.count_table as input file for the count parameter.
Using 18s.trim.contigs.good.fasta as input file for the fasta parameter.
Using 40 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 1 284 284 0 3 1
2.5%-tile: 1 293 293 0 3 930
25%-tile: 1 307 307 0 3 9297
Median: 1 309 309 0 4 18594
75%-tile: 1 312 312 0 4 27891
97.5%-tile: 1 422 422 0 7 36258
Maximum: 1 450 450 0 8 37187
Mean: 1 314 314 0 3
# of unique seqs: 37187
total # of seqs: 37187
It took 1 secs to summarize 37187 sequences.
mothur > align.seqs(fasta=18s.trim.contigs.good.unique.fasta, reference=silva.nr_v132.align)
mothur > summary.seqs(fasta=18s.trim.contigs.good.unique.align, count=18s.trim.contigs.good.count_table)
Using 40 processors.
Start End NBases Ambigs Polymer NumSeqs
Minimum: 0 0 0 0 1 1
2.5%-tile: 1046 1056 4 0 1 930
25%-tile: 1046 1083 8 0 2 9297
Median: 43053 43116 11 0 2 18594
75%-tile: 43096 43116 16 0 3 27891
97.5%-tile: 43112 43116 422 0 6 36258
Maximum: 43116 43116 450 0 8 37187
Mean: 26606 27490 45 0 2
# of unique seqs: 29090
total # of seqs: 37187
It took 6 secs to summarize 37187 sequences.
In conclusion, we obtain sequences that are too short after alignment for the 18s
Thanks - I wonder if there’s a problem with your sequencing. I tried blasting your sequence against nt (blastn) and nr (blastx) and there were no significant hits.