Cluster sequence into OTUs

pschloss · January 3, 2015, 2:55am

I’m having a hard time believing the output you posted. Can you please align your sequences to silva.bacteria.fasta using flip=T and then post the output of summary.seqs? At a minimum, the sequences should start at either 0 or 1044, not 2.

funpipi · January 5, 2015, 9:02am

As shown below, I ran “dist.seqs” to get the pairwise distances between my sequences, but I got this error: [ERROR]: your sequences are not the same length, aborting.
From the output of “summary.seqs”, I don’t understand what the problem is. Could anyone help me?

mothur > summary.seqs(fasta=ECCK1_284.final.Prevotella.f.fasta,name=ECCK1_284.final.Prevotella.f.names)

Using 8 processors.

             Start  End     NBases   Ambigs  Polymer  NumSeqs
Minimum:     1      528     150      0       3        1
2.5%-tile:   1      713     184      0       4        1701
25%-tile:    1      1000    248      0       4        17010
Median:      1      1126    291      0       5        34020
75%-tile:    1      1259    402      0       5        51029
97.5%-tile:  1      1523    425      0       5        66338
Maximum:     1      1737    490      0       8        68038
Mean:        1      1134.1  309.825  0       4.61774

# of unique seqs: 39219

total # of seqs: 68038

Output File Names:
ECCK1_284.final.Prevotella.f.summary

mothur > dist.seqs(fasta=ECCK1_284.final.Prevotella.f.fasta, cutoff=0.10, processors=2)
Using 2 processors. [ERROR]: your sequences are not the same length, aborting.
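
dist.seqs compares sequences position by position, so it expects an aligned fasta file in which every row has the same number of characters (bases plus gap characters). The Start and End columns of summary.seqs report where the bases fall, not the full row length, so a direct check of row lengths can help narrow this down. A minimal sketch in Python, not part of mothur; the function names and the toy data are hypothetical:

```python
from collections import Counter

def read_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def length_counts(records):
    """Count how many rows have each total length (gap characters included)."""
    return Counter(len(seq) for _, seq in records)

# Toy data: seqA has 7 characters, seqB has 5, so this file is not a valid alignment.
toy = """>seqA
AC-GT..
>seqB
ACGGT
"""

counts = length_counts(read_fasta(toy))
is_aligned = len(counts) == 1  # a valid alignment has exactly one row length
print(counts, is_aligned)
```

To use it on a real file, read the file contents into a string and pass that to read_fasta. If more than one length comes back, the file most likely has not been through align.seqs, or the rows were altered afterwards.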

AlexPham · January 6, 2015, 6:37am

Dear Pat,
Before uploading the output of the summary.seqs command that I ran after align.seqs, I have a small question. What is the difference between the following commands for aligning sequences against the silva.bacteria.fasta file? Which one should I use for my Sanger sequences?

1. pcr.seqs(fasta=silva.bacteria.fasta, start=11894, end=25319, keepdots=F)
   align.seqs(fasta=…fasta, reference=silva.bacteria.pcr.fasta)

2. align.seqs(fasta=…fasta, reference=…fasta)

3. align.seqs(candidate=…fasta, template=…fasta)

If I use the third one, here is the output of summary.seqs after aligning my sequences to silva.bacteria.fasta with flip=T:
             Start    End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1044     37682    1084     0       4        1
2.5%-tile:   1044     37684    1098     0       5        2
25%-tile:    1044     40159    1122     0       5        14
Median:      1044     40339    1143     0       6        27
75%-tile:    6202     43116    1158     0       7        40
97.5%-tile:  6434     43116    1188     0       8        51
Maximum:     6458     43116    1197     0       10       52
Mean:        3421.52  41216.5  1141.94  0       6.21154

# of Seqs: 52

I also wonder what the role of the trump=. parameter is in the filter.seqs command. I get different results from the following two commands:
filter.seqs(fasta=…fasta, trump=.)
filter.seqs(fasta=…fasta)
Thank you very much.

pschloss · January 9, 2015, 3:45pm

#2 and #3 are the same. #1 would make a new reference alignment for the region that you are interested in.
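
The idea behind #1, cutting every reference sequence down to the same window of alignment columns, can be illustrated with a toy sketch in Python. This is not mothur's implementation: the function name, the data, and the 0-based half-open slice convention are all assumptions made for the example, and it does not claim to reproduce how pcr.seqs interprets start and end.

```python
def slice_alignment(seqs, start, end):
    """Keep only alignment columns start..end-1 (0-based, half-open) of every row."""
    return {name: seq[start:end] for name, seq in seqs.items()}

# Toy reference alignment: every row has 13 columns.
reference = {
    "ref1": "..ACGTAC-GT..",
    "ref2": ".GACG-ACTGTA.",
}

region = slice_alignment(reference, 3, 9)
for name, seq in region.items():
    print(name, seq)
```

Because every row is cut at the same columns, the result is still a valid alignment, only narrower, which is why it can then be used as the reference for align.seqs.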

I also wonder what the role of the trump=. parameter is in the filter.seqs command. I get different results from the following two commands:
filter.seqs(fasta=…fasta, trump=.)
filter.seqs(fasta=…fasta)

Please see filter.seqs
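
As a rough illustration of why the two commands give different results: as I understand the filter.seqs documentation, the default run removes columns that contain only gap characters in every sequence, while trump=. additionally removes any column in which at least one sequence has a '.'. A toy sketch of that rule in Python, with a hypothetical function name and made-up data, not mothur's own code:

```python
GAPS = {"-", "."}

def filter_columns(seqs, trump=None):
    """Drop all-gap columns; if trump is set, also drop columns where any row has it."""
    kept = []
    for col in zip(*seqs):  # iterate over alignment columns
        if all(c in GAPS for c in col):
            continue  # column holds only gap characters
        if trump is not None and trump in col:
            continue  # at least one row has the trump character here
        kept.append(col)
    if not kept:
        return ["" for _ in seqs]
    return ["".join(row) for row in zip(*kept)]

# Toy alignment: three rows of 7 columns each.
aln = ["..ACG-T",
       "-GACG-T",
       "-GAC--."]

print(filter_columns(aln))             # all-gap columns removed
print(filter_columns(aln, trump="."))  # columns containing '.' removed as well
```

In the toy data the trump run keeps only the columns where every sequence has been read, so sequences that start late or end early shorten the filtered alignment for everyone.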
