Dear mothur enthusiasts,
A colleague advised me to try running PICRUSt on my dataset to get some functional inference from 16S sequences. He told me to “classify the OTU on GreenGenes” and prepare a biom file.
I have tried to follow the commands listed in an earlier forum post (Mothur file to biom), continuing from the fasta and count_table from the “classical” analysis (mothur 1.48 under Windows 10 Pro):
classify.seqs(fasta=16Sn_final.fasta, count=16Sn_final.count_table, reference=gg_13_5_99.fasta, taxonomy=gg_13_5_99.gg.tax, cutoff=80)
It took 5 secs to create the summary file for 47073 sequences.
[…] [WARNING]: M02352_43_000000000-J98TB_1_2116_20022_7314 could not be classified. You can use the remove.lineage command with taxon=unknown; to remove such sequences. […]
Output File Names:
16Sn_final.gg.wang.taxonomy
16Sn_final.gg.wang.tax.summary
remove.lineage(fasta=16Sn_final.fasta, count=16Sn_final.count_table, taxonomy=16Sn_final.gg.wang.taxonomy, taxon=unknown)
Removed 3 sequences from 16Sn_final.fasta.
Removed 4 sequences from 16Sn_final.count_table.
Output File Names:
16Sn_final.pick.fasta
16Sn_final.pick.count_table
/******************************************/
Output File Names:
16Sn_final.gg.wang.pick.taxonomy
16Sn_final.gg.wang.accnos
16Sn_final.pick.count_table
16Sn_final.pick.fasta
dist.seqs(fasta=16Sn_final.pick.fasta, cutoff=0.03)
Output File Names:
16Sn_final.pick.dist
cluster(column=16Sn_final.pick.dist, count=16Sn_final.pick.count_table)
Output File Names:
16Sn_final.pick.opti_mcc.list
16Sn_final.pick.opti_mcc.steps
16Sn_final.pick.opti_mcc.sensspec
make.shared(list=16Sn_final.pick.opti_mcc.list, count=16Sn_final.pick.count_table)
Output File Names:
16Sn_final.pick.opti_mcc.shared
classify.otu(list=16Sn_final.pick.opti_mcc.list, count=16Sn_final.pick.count_table, taxonomy=16Sn_final.gg.wang.pick.taxonomy)
Output File Names:
16Sn_final.pick.opti_mcc.0.03.cons.taxonomy
16Sn_final.pick.opti_mcc.0.03.cons.tax.summary
make.biom(shared=16Sn_final.pick.opti_mcc.shared, constaxonomy=16Sn_final.pick.opti_mcc.0.03.cons.taxonomy, reftaxonomy=gg_13_5_99.gg.tax, picrust=97_otu_map.txt, label=0.03)
Output File Names:
16Sn_final.pick.opti_mcc.0.03.biom
16Sn_final.pick.opti_mcc.0.03.biom_shared
The forum post mentioned above lists further steps to be performed on the biom file, but I have no idea how to run them:
biom validate-table -i crc.0.03.biom…
So I uploaded both files from make.biom(…) to the Huttenhower lab Galaxy instance (the files appear in green), but the run fails at the first step, Normalize By Copy Number:
Dataset generation errors
Dataset 4: Normalize By Copy Number on data 3
Tool execution generated the following error message:
Traceback (most recent call last):
  File "/galaxy-central/tools/picrust/scripts/normalize_by_copy_number.py", line 146, in <module>
    main()
  File "/galaxy-central/tools/picrust/scripts/normalize_by_copy_number.py", line 72, in main
    otu_table = parse_biom_table(open(opts.input_otu_fp,'U'))
  File "/galaxy_venv/local/lib/python2.7/site-packages/biom/parse.py", line 323, in parse_biom_table
    t = parse_biom_table_str(table_str, constructor=constructor)
  File "/galaxy_venv/local/lib/python2.7/site-packages/biom/parse.py", line 619, in parse_biom_table_str
    raise BiomParseException, 'Unknown table type'
biom.exception.BiomParseException: Unknown table type
Is this because I have not performed the additional steps with biom validate…? If they are necessary, how can I do them? I visited the biom site on GitHub but could not find an installer.
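For anyone wanting to look at the file before uploading it: since the traceback complains about the table type, one possible check is to read the descriptive fields at the top of the biom file. The sketch below assumes the file written by make.biom is in the JSON-based BIOM 1.0 format, which stores a top-level "type" field; if the file is HDF5-based instead, json.load will simply fail. The function name biom_header and the choice of fields to report are illustrative and not part of mothur, PICRUSt, or the biom package.

```python
import json

# Top-level keys that the BIOM 1.0 (JSON) format uses to describe a table.
# Which ones to report is an illustrative choice.
HEADER_KEYS = ("format", "type", "matrix_type", "matrix_element_type", "shape")


def biom_header(path):
    """Return the descriptive top-level fields of a JSON (BIOM 1.0) file.

    Keys missing from the file are reported as None.
    """
    with open(path, encoding="utf-8") as handle:
        table = json.load(handle)  # raises ValueError if the file is not JSON
    return {key: table.get(key) for key in HEADER_KEYS}


def print_biom_header(path):
    """Print one 'key: value' line per descriptive field."""
    for key, value in biom_header(path).items():
        print(f"{key}: {value!r}")
```

Calling print_biom_header("16Sn_final.pick.opti_mcc.0.03.biom") from the folder containing the file would show what the "type" field holds, which can then be compared with the table types the BIOM format documentation lists. This only inspects the file; it does not replace the biom validate-table step mentioned in the earlier post.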
Thank you for any help!
Yours,
Maxime