Silva database update

mhadidi · January 17, 2013, 3:36pm

Hello,

I would like to use mothur with the new SILVA database, which can be downloaded from this link

Is there any mothur compatible version of this database?

Another question, for the bacterial sequences here,

where can I find the sequences which are unaligned?

Best Regards,
Hadidi

dwaite · January 17, 2013, 8:01pm

It’s fairly trivial to create the mothur taxonomy/template files using the ARB export function. If you mark your sequences in ARB you can export them as fasta directly, and then create an export template like:

SUFFIX tax
BEGIN
*(name) *(tax_slv);

to make the taxonomy file.

There are a few little things you need to clean up afterwards - removing spaces from the taxonomies, removing any double ;; in the taxonomies, and spaces/gaps in the fasta file. That said, it’s worthwhile cleaning the database up first, since there are often suspected chimeras, incomplete taxonomies, and many redundant sequences in the ARB databases.
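The taxonomy clean-up described above could be sketched in Python roughly as follows. This is only a sketch under some assumptions: the function names are made up for illustration, the exported file is assumed to have one "name taxonomy" pair per line, and the output uses the tab-separated "name<TAB>taxonomy;" layout that mothur taxonomy files use. Spaces inside the taxonomy are deleted by default; pass a different `space_replacement` (for example "_") if you would rather keep word boundaries.

```python
import re


def clean_taxonomy_line(line, space_replacement=""):
    """Clean one exported 'name taxonomy' line; return None for blank lines.

    Hypothetical helper: assumes the sequence name is the first
    whitespace-delimited token and the rest of the line is the taxonomy.
    """
    parts = line.strip().split(None, 1)
    if not parts:
        return None
    name = parts[0]
    taxonomy = parts[1] if len(parts) > 1 else ""
    # Remove (or replace) spaces inside the taxonomy string.
    taxonomy = re.sub(r"\s+", space_replacement, taxonomy)
    # Collapse runs of semicolons (e.g. ';;') into a single ';'.
    taxonomy = re.sub(r";{2,}", ";", taxonomy)
    # Make sure the taxonomy string ends with a semicolon.
    if taxonomy and not taxonomy.endswith(";"):
        taxonomy += ";"
    return f"{name}\t{taxonomy}"


def clean_taxonomy_file(in_path, out_path, space_replacement=""):
    """Apply clean_taxonomy_line to every line of in_path, writing out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            cleaned = clean_taxonomy_line(line, space_replacement)
            if cleaned is not None:
                dst.write(cleaned + "\n")
```

It is worth spot-checking a few lines of the output by eye before passing the file to mothur, since the ARB export may contain cases this sketch does not anticipate.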

I can’t remember exactly where to download the unaligned sequences, but you could just run degap.seqs on silva.bacteria.fasta and you’d have the same result.
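Within mothur itself this is just degap.seqs(fasta=silva.bacteria.fasta). If you want to do the same thing outside mothur, a minimal Python sketch is below. It assumes that gaps in the alignment are written as '.' or '-' characters; the function names are hypothetical and this is not how mothur implements the command.

```python
def degap_sequence(seq, gap_chars=".-"):
    """Return seq with all gap characters removed."""
    return seq.translate(str.maketrans("", "", gap_chars))


def degap_fasta(in_path, out_path, gap_chars=".-"):
    """Copy a fasta file, stripping gap characters from sequence lines only."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # Header lines are copied unchanged.
                dst.write(line + "\n")
            else:
                seq = degap_sequence(line.strip(), gap_chars)
                # Skip lines that consisted entirely of gaps or whitespace.
                if seq:
                    dst.write(seq + "\n")
```

For example, degap_fasta("silva.bacteria.fasta", "silva.bacteria.unaligned.fasta") would write an unaligned copy next to the original (the output name here is arbitrary).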

mhadidi · January 23, 2013, 11:21am

Thanks Dwaite for your help.
Could you please tell me a reference for “That said, it’s worthwhile cleaning the database up first, since there are often suspected chimeras, incomplete taxonomies, and many redundant sequences in the ARB databases.” My supervisor asked me about it.

Thanks,

dwaite · January 23, 2013, 10:56pm

I don’t know of any reference; these are just things we’ve observed while working with ARB. I realise that you’re working with the NR database, so this may not hold true for you.

>Suspected chimeras
From the SILVA 108 SSU_Ref database (the last one I worked with, so I had it on my computer) I marked and exported sequences with a Pintail score of less than 75. There were 73,211 of these out of about 600,000 sequences total. I aligned them in mothur: 15,604 couldn’t be aligned at all, and of those that did align, 8,848 were flagged as chimeras.

This is always a bit tricky, because results can change depending on your chimera detection algorithm and reference database, but there are definitely sequences in the database that I would be suspicious of.

>Incomplete taxonomies
ARB databases have 3 taxonomy fields: SILVA, RDP and Greengenes (tax_slv, tax_rdp, and tax_gg, if I remember correctly). Because the database uses SILVA sequences, they don’t always have an equivalent taxonomy in RDP/Greengenes. Basically, the tax_slv field is always full, but sometimes the other two will be cut short at some level.

>Redundant sequences
You’re using the NR version, so don’t worry about this.

kaiyara · July 5, 2013, 11:04am

It’s in our PLoS ONE pipeline paper, the Haas ChimeraSlayer paper, and the current version is available through BEI.

eliztr · February 12, 2014, 2:42am

How do you create an export template in ARB? I only know how to export the NDS info and a fasta file.

Thanks!

dwaite · February 12, 2014, 9:14pm

If you’re using the Linux version, the templates are stored in /usr/lib/arb/export. Each one is just a simple text file with the extension ‘eft’. The syntax is pretty simple, so you can generally work out how to make new templates from reading the existing ones.
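As a rough illustration, the taxonomy template quoted earlier in this thread could be written out as an ‘eft’ file with a few lines of Python. The file name mothur_tax.eft and the helper name are made up for this example, the template text is copied as posted (ARB may expect further fields, so compare it against the existing templates), and writing into /usr/lib/arb/export will normally need root permissions.

```python
from pathlib import Path

# Template text as posted earlier in this thread; not checked against ARB.
TAX_TEMPLATE = "SUFFIX tax\nBEGIN\n*(name) *(tax_slv);\n"


def write_export_template(directory, filename="mothur_tax.eft"):
    """Write the taxonomy export template into directory and return its path.

    The file name is hypothetical; the only requirement mentioned in the
    thread is the 'eft' extension.
    """
    path = Path(directory) / filename
    path.write_text(TAX_TEMPLATE)
    return path
```

Writing to a scratch directory first, e.g. write_export_template("."), and then copying the file into the ARB export directory keeps the privileged step separate.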

[quote="mhadidi"] Thanks Dwaite for your help. Could you please tell me a reference for "That said, it's worthwhile cleaning the database up first, since there are often suspected chimeras, incomplete taxonomies, and many redundant sequences in the ARB databases." My supervisor asked me about it.

Thanks,
[/quote]
Also, I know this is old now, but I was reading the SILVA documentation recently and realised that they deliberately leave chimeras in the database, and leave it to the researcher to decide whether to use them or not.
