make.biom contaxonomy error at higher tax-levels

Hi,

I wonder if these are expected errors given my input or due to something else…?

This works.

mothur > make.biom(shared=ss.only.hosts.shared, contaxonomy=../final.phylip.fn.0.030.cons.taxonomy, label=0.030, groups=Ao-Ad-Cr-Cc-Pf-Da-Wa)
0.030
Output File Names: 
ss.only.hosts.0.030.biom

The contaxonomy only goes to Otu11721, so this gives an error…

mothur > make.biom(shared=ss.only.hosts.shared, contaxonomy=../final.phylip.fn.0.050.cons.taxonomy, label=0.050, groups=Ao-Ad-Cr-Cc-Pf-Da-Wa)
0.050
[ERROR]: can't find taxonomy information for Otu11722.

The contaxonomy file starts at Otu0001, so this too gives an error…

mothur > make.biom(shared=ss.only.hosts.shared, contaxonomy=../final.phylip.fn.0.200.cons.taxonomy, label=0.200, groups=Ao-Ad-Cr-Cc-Pf-Da-Wa)
Your file does not include the label 0.20. I will use 0.200.
0.200
[ERROR]: can't find taxonomy information for Otu00001.

I find it a bit strange that it can provide a taxonomy at a lower distance, but not at higher. Does it have to do with the ordering and re-ordering of OTUs between the runs, since e.g. OtuXXXX is not the same in all contaxonomy files?

And, these are the distances I provided in the make.shared cmd.

Thanks,

classify.otu creates OTU labels based on the number of bins at the given distance and pads the labels with zeros. What you want to do is make the shared file's labels match.

make.shared(list=final.phylip.fn.list, group=final.groups, label=0.050)
make.biom(shared=current, contaxonomy=…/final.phylip.fn.0.050.cons.taxonomy, label=0.050, groups=Ao-Ad-Cr-Cc-Pf-Da-Wa)

Hey,

Thanks for your reply! So what you are saying is that I can’t run make.shared with several labels, e.g. label=0.03-0.05-0.20, but need to run them one at a time, i.e. create a separate .shared file for each distance?

To generate .contaxonomy and .shared files I’ve tried to be consistent with the labels;

mothur > classify.otu(list=cluster/final.phylip.fn.list, name=final.names, taxonomy=final.taxonomy,label=0.030-0.050-0.200)
reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
0.030 17821
0.050 11721
0.200 2951

Output File Names: 
cluster/final.phylip.fn.0.030.cons.taxonomy
cluster/final.phylip.fn.0.050.cons.taxonomy
cluster/final.phylip.fn.0.200.cons.taxonomy

make.shared(list=ss.only.hosts.list, group=ss.only.hosts.groups, label=0.030-0.050-0.20, groups=Ao-Ad-Cr-Cc-Pf-Da-Wa)
0.030
0.050
0.200

Output File Names: 
ss.only.hosts.shared
ss.only.hosts.Ad.rabund
ss.only.hosts.Ao.rabund
ss.only.hosts.Cc.rabund
ss.only.hosts.Cr.rabund
ss.only.hosts.Da.rabund
ss.only.hosts.Pf.rabund
ss.only.hosts.Wa.rabund
ss.only.hosts.merge.groups

Thanks,
J

Because classify.otu creates the labels separately for each distance, while make.shared creates them for all distances at once, you will need to run make.shared separately for each distance to make them match. Here’s an example:

At distance unique the list file has 100 OTUs.
At distance 0.03 the list file has 10 OTUs.

make.shared will create labels like:
OTU001, OTU002 …

classify.otu at unique will create:
OTU001, OTU002 …

classify.otu at 0.03 will create:
OTU01, OTU02 …

Even though OTU001 and OTU01 are referring to the same OTU in the 0.03 distance, the labels are different. Running make.shared with the label option will force the labels to match those created by classify.otu. Make sense?
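To make the padding behaviour concrete, here is a minimal Python sketch. It assumes the pad width equals the number of digits in the OTU count, which is what the examples in this thread suggest; it is not taken from mothur's source code, and the function name `otu_labels` is hypothetical.

```python
def otu_labels(num_otus, prefix="Otu"):
    """Return zero-padded labels for num_otus bins.

    Assumption: the pad width equals the number of digits in num_otus.
    This matches the examples in this thread but is not taken from
    mothur's source code.
    """
    width = len(str(num_otus))
    return [f"{prefix}{i:0{width}d}" for i in range(1, num_otus + 1)]


# 100 OTUs at "unique" -> three digits; 10 OTUs at 0.03 -> two digits.
labels_unique = otu_labels(100, prefix="OTU")
labels_003 = otu_labels(10, prefix="OTU")

print(labels_unique[0], labels_003[0])    # OTU001 OTU01
print(labels_unique[0] == labels_003[0])  # False: the strings differ
```

Under the same assumption, 11721 bins would give five-digit labels (Otu00001) and 2951 bins four-digit labels (Otu0001), which is the pattern seen in the cons.taxonomy files in this thread.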

Makes sense, thanks Sarah!

So, one problem which I don’t seem to be able to solve remains.

The above works for distances 0.030 and 0.200: when the .shared files for these are created separately, the number of leading 0’s in the OTU labels matches the cons.taxonomy files. However, for distance 0.050 they don’t match up for some reason(?)

The problem is this:

classify.otu(list=cluster/final.phylip.fn.list, name=final.names, taxonomy=final.taxonomy,label=0.030-0.050-0.200)

reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
0.030 17821
0.050 11721
0.200 2951

Output File Names: 
cluster/final.phylip.fn.0.030.cons.taxonomy
cluster/final.phylip.fn.0.050.cons.taxonomy
cluster/final.phylip.fn.0.200.cons.taxonomy

mothur > system(head cluster/final.phylip.fn.0.030.cons.taxonomy)

OTU Size Taxonomy
Otu00001 72658 Bacteria(100);"Proteobacteria"(100);Alphaproteobacteria(100);unclassified(100);unclassified(100);unclassified(100);
Otu00002 47828 Bacteria(100);"Proteobacteria"(100);Gammaproteobacteria(100);unclassified(100);unclassified(100);unclassified(100);

mothur > system(head -n 3 cluster/final.phylip.fn.0.050.cons.taxonomy)

OTU Size Taxonomy
Otu00001 73628 Bacteria(100);"Proteobacteria"(100);Alphaproteobacteria(100);unclassified(100);unclassified(100);unclassified(100);
Otu00002 53404 Bacteria(100);"Proteobacteria"(100);Gammaproteobacteria(100);unclassified(100);unclassified(100);unclassified(100);

mothur > system(head -n 3 cluster/final.phylip.fn.0.200.cons.taxonomy)

OTU Size Taxonomy
Otu0001 74953 Bacteria(100);"Proteobacteria"(100);Alphaproteobacteria(100);unclassified(100);unclassified(100);unclassified(100);
Otu0002 73324 Bacteria(100);"Proteobacteria"(100);Gammaproteobacteria(100);unclassified(100);unclassified(100);unclassified(100);

ss.hosts.years.0.030.shared
label Group numOtus Otu00001 Otu00002

ss.hosts.years.0.050.shared
label Group numOtus Otu0001 Otu0002

ss.hosts.years.0.200.shared
label Group numOtus Otu0001 Otu0002

Thus the error:

mothur > make.biom(shared=ss.hosts.years.0.050.shared, contaxonomy=../final.phylip.fn.0.050.cons.taxonomy,groups=Ad_09-Ad_10-Ad_11-Ad_12-Ao_09-Ao_10-Ao_11-Ao_12-Cr_09-Cr_10-Cr_11-Cr_12-Cc_09-Cc_10-Cc_11-Cc_12-Da_09-Da_10-Da_11-Da_12-Pf_09-Pf_10-Pf_11-Pf_12-Wa_09-Wa_10-Wa_11-Wa_12, label=0.050)
0.050
[ERROR]: can't find taxonomy information for Otu0001.
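A quick way to see this kind of mismatch before running make.biom is to compare the two sets of labels directly. The Python sketch below does that on inline sample data copied (and truncated) from the 0.050 outputs above. The function names are hypothetical, and it assumes whitespace-separated columns with the OTU labels starting in the fourth column of the .shared header and in the first column of the cons.taxonomy file.

```python
def shared_otu_labels(header_line):
    """OTU labels from a .shared header (columns after label, Group, numOtus)."""
    return header_line.split()[3:]


def taxonomy_otu_labels(lines):
    """OTU labels from a .cons.taxonomy file (first column, header skipped)."""
    return [line.split()[0] for line in lines[1:] if line.strip()]


def missing_from_taxonomy(shared_labels, tax_labels):
    """Labels present in the shared file but absent from the taxonomy file."""
    known = set(tax_labels)
    return [label for label in shared_labels if label not in known]


# Inline sample data copied from the 0.050 outputs above; the taxonomy
# strings are truncated for brevity.
shared_header = "label Group numOtus Otu0001 Otu0002"
taxonomy_lines = [
    "OTU Size Taxonomy",
    "Otu00001 73628 Bacteria(100);",
    "Otu00002 53404 Bacteria(100);",
]

missing = missing_from_taxonomy(
    shared_otu_labels(shared_header), taxonomy_otu_labels(taxonomy_lines)
)
print(missing)  # ['Otu0001', 'Otu0002']
```

To run it on real files, read the first line of the .shared file and all lines of the cons.taxonomy file in place of the inline strings. An empty result means every label in the shared file has a taxonomy entry.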

Thanks,
J

Did you select groups when you ran make.shared? When an OTU contains no sequences from the selected groups, it is eliminated. This could change the number of OTUs and their labels.

Thanks for replying!
Yes I did…

make.shared(list=ss.hosts.years.list, group=ss.hosts.years.groups, groups=Ad_09-Ad_10-Ad_11-Ad_12-Ao_09-Ao_10-Ao_11-Ao_12-Cr_09-Cr_10-Cr_11-Cr_12-Cc_09-Cc_10-Cc_11-Cc_12-Da_09-Da_10-Da_11-Da_12-Pf_09-Pf_10-Pf_11-Pf_12-Wa_09-Wa_10-Wa_11-Wa_12, label=0.050)
0.050

Output File Names: 
ss.hosts.years.shared

I’ll re-run some stuff and see if the issue remains.

Thanks,
J

Hi again,

I’ve tried to re-run things slightly differently, but the above error still remains for the distance 0.050…

Any tips on what I could do would be highly appreciated.

Thanks,
J

Can you try using get.groups on the input files before make.shared and make.biom? You might also try the label without the final 0, i.e. 0.05.

Hey,

I just finished running…

get.groups(group=ss.hosts.months.years.groups, list=final.phylip.fn.pick.pick.list, groups=all)
Selected 355600 sequences from your group file.

Output File names: 
ss.hosts.months.years.pick.groups
final.phylip.fn.pick.pick.pick.list

get.groups(shared=ss.hosts.months.years.0.050.shared, group=ss.hosts.months.years.groups, groups=all)
Selected 355600 sequences from your group file.

Output File names: 
ss.hosts.months.years.pick.groups
ss.hosts.months.years.0.050.0.050.pick.shared

Is the point to see whether the number of OTUs differs between samples at the different taxonomy levels?

I ran (don’t know if it’s this I should be looking at?)…

uniq -f 1 ss.hosts.months.years.pick.groups | wc -l
252
uniq ss.hosts.months.years.0.050.0.050.pick.shared | wc -l
253

…so the shared file at 0.05 contains all the groups/samples, as do the rest of the files(?). The total number of samples is 252 (253 is due to the header).
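For checking that both files contain the same samples, a set comparison may be more robust than `uniq`, which only collapses adjacent duplicate lines. The Python sketch below uses tiny made-up data. The function names, sequence names and counts are hypothetical, and it assumes the group name is the second whitespace-separated column in both the group file and the .shared file.

```python
def groups_in_group_file(lines):
    """Distinct group names from a group file (sequence name, then group)."""
    return {line.split()[1] for line in lines if line.strip()}


def groups_in_shared_file(lines):
    """Distinct group names from a .shared file (second column, header skipped)."""
    return {line.split()[1] for line in lines[1:] if line.strip()}


# Tiny made-up example; sequence names and counts are hypothetical.
group_lines = ["seq1\tAd_09", "seq2\tAd_09", "seq3\tAo_10"]
shared_lines = [
    "label\tGroup\tnumOtus\tOtu0001\tOtu0002",
    "0.050\tAd_09\t2\t2\t0",
    "0.050\tAo_10\t2\t0\t1",
]

in_groups = groups_in_group_file(group_lines)
in_shared = groups_in_shared_file(shared_lines)
print(len(in_groups), len(in_shared))  # 2 2
print(in_groups == in_shared)          # True
```

If the two sets differ, `in_groups - in_shared` lists the samples that are missing from the shared file.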

Thanks,
J

Sorry for the trouble you are having. I will make mothur “smarter” about OTU labels in the next release. Basically, we need to force mothur’s commands to create the same labels. To do this, we need to select the groups you want to include before the classify.otu and make.biom commands. This is a good idea anyway: if you only want to include certain samples in the shared and biom files, the consensus taxonomies should be found using only those samples. Try this:

get.groups(list=yourListFile, name=yourNameFile, group=yourGroupFile, taxonomy=yourTaxFile)
classify.otu(list=current, name=current, taxonomy=yourTaxonomyFile, label=0.050)
make.shared(list=current, group=current, label=0.050)

No worries at all. Thanks for the solution, works fine!
That would be handy indeed…

Thanks again,

J