Hello, are the mothur-formatted versions of silva.bacteria.fasta (used for alignment) and the RDP reference (used for classifying) up to date, or are users expected to update these whenever an update is available? If they are not up to date, can anyone please provide instructions on how to do that? Is there a recommended software program, and what set of options should be used (type of fasta, etc.)?
I have a functional version of ARB running on my Mac (Mavericks).
many thanks
O.
The RDP database is up to date.
The SILVA database is up to date. SILVA does not release their SEED database, so we have to recreate it. When we’ve done this in the past, we’ve found that it doesn’t change much (if at all) from release to release.
Pat
Thank you, Pat. I visited the RDP site and saw that the latest release is 11. The RDP training set says v9, so I thought the numbering might be related. This is great.
I have a few little questions regarding the topic that I couldn’t figure out on the wiki:
- What is the dis-/advantage of using SILVA for taxonomy vs. the RDP (bacteria only)?
- Do you recommend exploring multiple databases?
- How can I get the name of the database to show up in the file name, i.e. xxx.RDP.xx? Do I rename the file manually?
Many thanks
O.
The website release is #11; the training set is v9.
- What is the dis-/advantage of using SILVA for taxonomy vs. the RDP (bacteria only)?
SILVA has more taxa in it, but RDP follows Bergey’s outline. RDP is also based on the typical Linnaean system. I would probably go for greengenes over SILVA for this reason.
- Do you recommend exploring multiple databases?
I guess it depends on your question. I think the “names” of bacteria are highly overrated, so I don’t spend a lot of time worrying about it. If something comes back as Bacillus, what does that tell you?
- How can I get the name of the database to show up in the file name, i.e. xxx.RDP.xx? Do I rename the file manually?
I think it comes from the reference filename. So when you run classify.seqs with trainset_v9 that tag will be inserted. But, yeah, you can always just do it yourself.
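For the do-it-yourself route, here is a minimal Python sketch of inserting a label into an output file name at a fixed spot. It is not a mothur feature: tagged_name and rename_with_tag are hypothetical helpers, and the ".wang." marker is an assumption based on the classify.seqs output names seen in this thread (xxx.wang.taxonomy, xxx.wang.tax.summary).

```python
from pathlib import Path


def tagged_name(filename: str, tag: str, marker: str = ".wang.") -> str:
    """Return filename with ".<tag>" inserted just before the first `marker`.

    The default marker is an assumption taken from classify.seqs output
    names such as "xxx.wang.taxonomy"; it is not a mothur option.
    """
    head, sep, tail = filename.partition(marker)
    if not sep:
        raise ValueError(f"marker {marker!r} not found in {filename!r}")
    return f"{head}.{tag}{sep}{tail}"


def rename_with_tag(path: str, tag: str) -> Path:
    """Rename a file on disk so its name carries the tag; return the new path."""
    src = Path(path)
    dst = src.with_name(tagged_name(src.name, tag))
    src.rename(dst)
    return dst


if __name__ == "__main__":
    # Hypothetical file name, used only to show where the tag lands.
    print(tagged_name("sample.wang.taxonomy", "RDP"))
```

Because the tag always goes right before the marker, every renamed file carries it in the same position, and the rest of the name mothur generated is left alone.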
Pat
ok, thanks.
Re: renaming files or adding to the name: it can always be done manually, but it would be great if adding to the file name could be done consistently across other commands as well. I.e., is there an option where you can say “add=xxx” and have it show up in the same spot every time, without affecting how mothur names her files?
This would be very handy.
I ran this for the classification, but I’m not sure why the trainset did not show up in my file names:
classify.seqs(fasta=dfm.shhh.trim.unique.good.filter.unique.precluster.pick.fasta, name=dfm.shhh.trim.unique.good.filter.unique.precluster.pick.names, group=dfm.shhh.good.pick.groups, template=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80, processors=16)
#Output File Names:
#dfm.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy
#dfm.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.tax.summary
many thanks
O.
The “pds” in the output name indicates the database you used; it’s the pds version of trainset_9.
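To make the connection between the reference file and the output name concrete, here is a small Python sketch. It assumes, based only on the file names in this thread, that the tag is the dot-separated field just before the final extension of the taxonomy file name (trainset9_032012.pds.tax gives "pds"). mothur's actual naming logic may differ, and reference_tag and has_tag are hypothetical helpers, not mothur functions.

```python
from pathlib import Path


def reference_tag(taxonomy_file: str) -> str:
    """Guess the tag: the field just before the final extension.

    This rule is an assumption inferred from the thread's example,
    not a documented mothur behavior.
    """
    fields = Path(taxonomy_file).name.split(".")
    if len(fields) < 3:
        raise ValueError(f"no tag field found in {taxonomy_file!r}")
    return fields[-2]


def has_tag(output_file: str, tag: str) -> bool:
    """True if `tag` is one of the dot-separated fields of the output name."""
    return tag in Path(output_file).name.split(".")


if __name__ == "__main__":
    tag = reference_tag("trainset9_032012.pds.tax")
    output = ("dfm.shhh.trim.unique.good.filter.unique."
              "precluster.pick.pds.wang.taxonomy")
    print(tag, has_tag(output, tag))
```

With the file names from the post above, the guessed tag is "pds", which is the field that appears just before "wang" in both output names.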