mothur

Picrust analysis using mothur output file

Hello,
I was wondering whether there is any way to run a PICRUSt analysis using mothur output files.
PICRUSt uses only the greengenes database, but in my case I used the SILVA database to assign taxonomy. Can we run PICRUSt using the SILVA database?

Thank you so much in advance.

Hi Pratima,

You’ll have to generate a new set of classifications using the greengenes taxonomy. Once you have that, you’ll want to follow the process I’ve outlined here. Alternatively, Picrust2 doesn’t require a taxonomy file and can be run using the files generated by running pre.cluster.

Pat

Hi Pat,

I had been looking forward to piping mothur output into PICRUSt. Personally, I don’t feel comfortable using the SILVA reference for OTU clustering and classification and greengenes for PICRUSt. I used to do this kind of function prediction with Tax4Fun (an R-based package), but I am really curious how the results differ from each other. Good to know that Picrust2 takes the pre.cluster output files as input.

cheers,
Fang

Thank you so much Pat for your response.
I am planning to rerun the analysis using the greengenes database. I have a doubt about the step where we use the pcr.seqs command with silva.bacteria.fasta
“mothur > pcr.seqs(fasta=silva.bacteria.fasta, start=11894, end=25319, keepdots=F, processors=8)” according to the MiSeq SOP.
What should we do if we are using the greengenes database? Are we supposed to use the same pcr.seqs command with silva.bacteria.fasta?

Thank you.
Pratima

You would need to find your own coordinates. We provide a tutorial on how to do this at http://blog.mothur.org/2016/07/07/Customization-for-your-region/. However, that is really meant for customizing the alignment reference and I can’t think of a reason anyone should use the greengenes reference alignment. For 16S rRNA genes, you should always use the SILVA reference; however, for classification you can certainly use greengenes, but I’m not sure how important it is to customize the database to a specific region.

I found my own coordinates by aligning my sequences to silva.bacteria.fasta. I customized my sequences since we were amplifying the V4-V5 region. By classification using greengenes, do you mean using the greengenes database as the reference when running the command below?
mothur > classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.vsearch.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)

Thank you so much for your cooperation.

You don’t need the coordinates to run classify.seqs. Also, if you use the latest version of picrust, you don’t need classified data at all.
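
As a rough illustration of the taxonomy-free route, here is a minimal Python sketch of how the pre.cluster-stage fasta and count_table might be tidied before handing them to Picrust2. Several things here are assumptions rather than anything stated in this thread: that Picrust2 wants unaligned sequences plus a tab-delimited abundance table keyed by the same sequence names, that the count_table is in mothur's full (uncompressed) format with a `total` column, and the function names `degap_fasta` and `count_table_to_tsv`, which are made up for this example. mothur's own degap.seqs command can do the gap removal instead.

```python
def degap_fasta(aligned_text):
    """Strip alignment gap characters ('-' and '.') from FASTA sequence lines.

    Header lines are reduced to the sequence name (text before any whitespace).
    """
    out = []
    for line in aligned_text.splitlines():
        if line.startswith(">"):
            out.append(line.split()[0])
        else:
            seq = line.replace("-", "").replace(".", "").strip()
            if seq:  # skip lines that were only gaps or blank
                out.append(seq)
    return "\n".join(out) + "\n"


def count_table_to_tsv(count_text):
    """Drop the 'total' column from a full-format mothur count_table.

    Assumes a header row such as: Representative_Sequence, total, then groups.
    """
    rows = [
        line.rstrip("\t").split("\t")
        for line in count_text.splitlines()
        if line.strip()
    ]
    header = rows[0]
    keep = [i for i, name in enumerate(header) if name != "total"]
    return "\n".join("\t".join(row[i] for i in keep) for row in rows) + "\n"
```

The two returned strings can be written to files and checked by hand before use. Whether Picrust2 accepts them directly depends on the installed version, so it is worth confirming against its own documentation.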