rabund file seems to report too many sequences

karinlag · April 27, 2011, 12:40pm

I have run through the following process on my sequences:

set.dir(output=.)
summary.seqs(fasta=@.%)
unique.seqs(fasta=@.%)
summary.seqs(fasta=@.unique.%)
pre.cluster(fasta=@.unique.%, name=@.names, diffs=2)
summary.seqs(fasta=@.unique.precluster.%)
pairwise.seqs(fasta=@.unique.precluster.%,calc=eachgap, countends=F)
read.dist(column=@.unique.precluster.dist, name=@.unique.precluster.names)
cluster(method=average)
rarefaction.single()
summary.single(calc=sobs-coverage-chao-ace-npshannon)

where @ is the full filename of the sequence file.

The cluster command produces, naturally enough, a rabund file. I started looking at this file to figure out how many of my OTUs were singleton OTUs, i.e. OTUs that were formed on the basis of one single sequence. In the process I discovered that it seems like the rabund file reports too many sequences. As an example:

The file control.fsa contains 48452 fasta sequences, when I count the number of > in the file. This is also the number I get when I get it into mothur and do the first summary.seqs on it.
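For anyone wanting to repeat the count outside mothur, here is a minimal Python sketch of it. It assumes each record starts with a header line beginning with `>`; the function name and the sample data are made up for illustration.

```python
import io


def count_fasta_records(handle):
    """Count FASTA records by counting lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


# Tiny in-memory example; for a real file, pass open("your_file.fasta").
sample = io.StringIO(">seq1\nACGT\n>seq2\nAC\nGT\n")
print(count_fasta_records(sample))
```

Counting only lines that start with `>` avoids miscounting if a `>` character ever appears elsewhere in a header.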

Now I run through the commands described above for this file. I then open the resulting rabund file and, as a check, sum the numbers from field 3 onwards. If I have understood the documentation correctly, this sum should equal the total number of sequences in the data set. However, I get varying numbers: on the first line, the one that begins with unique, I get 49806. If I do the same for any other line, I get yet another number.
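The check described here can be sketched in Python as follows. This assumes the rabund layout as I understand it from the mothur documentation: a label, then the number of OTUs, then one abundance value per OTU, all whitespace-separated. The function name and the example line are hypothetical, not taken from my data.

```python
def summarize_rabund_line(line):
    """Summarize one rabund line: label, OTU count, then per-OTU abundances."""
    fields = line.split()
    label = fields[0]
    declared_otus = int(fields[1])
    # Assumption: every remaining field is the abundance of one OTU.
    abundances = [int(x) for x in fields[2:]]
    return {
        "label": label,
        "declared_otus": declared_otus,
        "listed_otus": len(abundances),
        "total_sequences": sum(abundances),
        "singletons": sum(1 for a in abundances if a == 1),
    }


# Made-up line: 4 OTUs with abundances 3, 1, 1 and 2.
example = "unique\t4\t3\t1\t1\t2"
print(summarize_rabund_line(example))
```

If the layout assumption holds, `total_sequences` should be the same on every line and match the input sequence count, and `singletons` gives the number of OTUs built from a single sequence.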

Have I misinterpreted what the rabund file reports?

pschloss · April 28, 2011, 10:03am

Would you mind sending us control.fas to take a look at?

Pat
