Issue with classify.otu

gkuffel · March 16, 2017, 4:58pm

Hi again,

To give some context, we are studying the human bladder microbiome, specifically targeting the V4 region of the 16S rRNA gene. When I run the following command:

classify.otu(list=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.an.unique_list.list, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, label=0.03)

I get the following message:

Your file does not include the label 0.03. I will use 0.01.
0.01 102834

Can someone please explain this to me? Does this mean there are no sequences that are at least 3% different? Is this an error of some kind? Any insight would be appreciated. Thank you again.

Kendra · March 16, 2017, 8:13pm

Without seeing your previous commands, it’s not possible to answer that question.

gkuffel · March 16, 2017, 8:30pm

I am using the same commands listed in the MiSeq SOP. I have copied the contents of my bash script below:

make.contigs(file=stability.files, processors=8)
#this will read each fastq file in the stability.files and make the contigs. This can take some time, depending on how many files are in the stability file.

summary.seqs(fasta=stability.trim.contigs.fasta)

summary.seqs(fasta=stability.trim.contigs.good.fasta)

screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, minlength=275, maxlength=300, processors=8)

unique.seqs(fasta=stability.trim.contigs.good.fasta)

count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)

summary.seqs(count=stability.trim.contigs.good.count_table)

pcr.seqs(fasta=silva.bacteria.fasta, start=11894, end=25319, keepdots=F, processors=8)

system(mv silva.bacteria.pcr.fasta silva.v4.fasta)

summary.seqs(fasta=silva.v4.fasta)

align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.v4.fasta)

summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)

screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1968, end=11550, maxhomop=8)

summary.seqs(fasta=current, count=current)

filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)

unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count=stability.trim.contigs.good.good.count_table)

pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)

chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)
#/usr/local/bin/uchime file does not exist. Checking path…
#[ERROR]: file does not exist. mothur requires the uchime executable.
#[ERROR]: did not complete chimera.uchime.

remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.accnos)

summary.seqs(fasta=current, count=current)

classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)

remove.lineage(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)

#the original commands have files with pick.pick.pick…I think the third pick is from the error rate assessment, so I took it out.

cluster.split(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, splitmethod=classify, taxlevel=4, cutoff=0.03, processors=12)

Kendra · March 16, 2017, 9:09pm

I wonder if there’s something funky with cluster.split and mcc? Try dist.seqs and cluster instead (see Classify OTU Labels for an example).

gkuffel · March 16, 2017, 9:14pm

Something else to note is that I have used this exact same script on other datasets and experienced no issues. The resulting files are all at the 0.03 cutoff.

gkuffel · March 17, 2017, 3:35pm

Could this be happening because I am using an older version of mothur? Are there any implications of using this data as is? I will not have time to run it through again before my deadline.

pschloss · March 20, 2017, 12:23pm

It sounds like you are using an old version of mothur, in which case you are going to be getting the average neighbor algorithm by default. I see that you set 0.03 as the cutoff. See this for a brief explanation of why the cutoff changes when you cluster with average neighbor:

https://mothur.org/wiki/Frequently_asked_questions#Why_does_the_cutoff_change_when_I_cluster_with_average_neighbor.3F

I’d encourage you to rerun the dist.seqs and cluster commands with the newest version of mothur, using cutoff=0.03, so that the opticlust algorithm is used by default. This will run much faster than the previous methods and give you better results.

Pat
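
The rerun suggested above might look roughly like the following sketch. It assumes the file names produced by the script earlier in this thread, and a mothur version recent enough that cluster defaults to opticlust. The name of the list file written by cluster varies between versions, so the sketch uses list=current rather than guessing at it.

```
# Assumption: these input file names are taken from the script posted above; adjust them to match your own run.
dist.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta, cutoff=0.03)

# dist.seqs writes a column-formatted distance file with the .dist extension.
# In recent mothur versions cluster uses opticlust by default.
cluster(column=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.dist, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, cutoff=0.03)

# list=current picks up the list file that cluster just produced.
classify.otu(list=current, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy, label=0.03)
```

If classify.otu still reports that the 0.03 label is missing, the label line printed by cluster shows which cutoff was actually written to the list file.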
