Database in tutorial vs silva 138.1

Hi,
I’m new to mothur.
I’ve finished the MiSeq SOP tutorial on mothur.org.
Out of curiosity, I reran the analysis on the tutorial fastq files using SILVA 138.1 (mothur-compatible).

Compared to the tutorial, it gave me different results. I expected that, of course, but I have a few concerns.

Here are my questions.

(1) In the tutorial, the SILVA alignment database was used for align.seqs and the RDP database was used for classify.seqs. Why use different databases in a single analysis?

(2) So I re-analysed the tutorial files using SILVA 138.1 for both align.seqs and classify.seqs. It ran without problems, but it gave me different results. I expected some differences, but these seem too large. Do they come from using a different database for classify.seqs than the tutorial does? When I analyze my own data, should I use SILVA 138.1 for align.seqs and RDP 18 for classify.seqs, or is it OK to use SILVA 138.1 for both?

Below are the two results. For example, (1) the most abundant phylum is Bacteroidetes in the tutorial version, but Bacteroidota in the re-analysis. (2) TM7 and Tenericutes are found in the tutorial version, but not with SILVA 138.1. (3) Patescibacteria is found with SILVA 138.1, but not in the tutorial version. Which one should I trust?


[Result with the tutorial databases (SILVA for align.seqs and RDP for classify.seqs)]


[Result with SILVA 138.1 for both align.seqs and classify.seqs]

To some degree, the choice of classification database is a matter of personal preference. The two classification databases contain different sequences, and even the same sequence can have different names in the two databases. RDP is closely tied to Bergey’s Manual of official bacterial taxonomy, whereas SILVA relies more on a tree-based approach to derive taxon names. My general recommendation is that, unless there’s another reason, pick the database that gives the fewest unclassified sequences and go from there.
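
One way to put a number on "fewest unclassified sequences" is to count them in the taxonomy files that each classify.seqs run writes. The following is only a minimal Python sketch. It assumes a tab-separated taxonomy file with one sequence per line, ranks separated by semicolons, bootstrap values in parentheses, and unassigned ranks containing the word "unclassified". The sequence names and lineages are made up for illustration, and the helper names `parse_taxonomy_line` and `unclassified_fraction` are hypothetical, not part of mothur.

```python
import re


def parse_taxonomy_line(line):
    """Split one taxonomy line into (sequence name, list of taxon names)."""
    name, taxonomy = line.rstrip("\n").split("\t", 1)
    # Drop trailing bootstrap values such as "(100)" and any surrounding quotes.
    taxa = [
        re.sub(r"\(\d+\)$", "", taxon).strip('"')
        for taxon in taxonomy.split(";")
        if taxon.strip()
    ]
    return name, taxa


def unclassified_fraction(lines, level):
    """Fraction of sequences without a named taxon at `level` (0 = first rank)."""
    total = 0
    unclassified = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, taxa = parse_taxonomy_line(line)
        total += 1
        # A missing rank or a "*_unclassified" label both count as unclassified.
        if level >= len(taxa) or "unclassified" in taxa[level].lower():
            unclassified += 1
    return unclassified / total if total else 0.0


# Made-up example lines standing in for two classify.seqs outputs.
rdp_like = [
    'seq1\tBacteria(100);"Bacteroidetes"(100);\n',
    "seq2\tBacteria(100);Bacteria_unclassified(100);\n",
    "seq3\tBacteria(100);Firmicutes(98);\n",
    "seq4\tBacteria(100);Bacteria_unclassified(100);\n",
]
silva_like = [
    "seq1\tBacteria(100);Bacteroidota(100);\n",
    "seq2\tBacteria(100);Patescibacteria(85);\n",
    "seq3\tBacteria(100);Firmicutes(99);\n",
    "seq4\tBacteria(100);Bacteria_unclassified(100);\n",
]

for label, lines in (("RDP-like", rdp_like), ("SILVA-like", silva_like)):
    print(label, unclassified_fraction(lines, level=1))
```

For real data, the lists would be replaced by open file handles on the two taxonomy files. Note that this counts unique sequences only; weighting by abundance would also need the matching count table.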

Pat

Great! Now I understand!

Thanks, as always!

Jun

To add to this: use controls!!

For our gut microbiota work we now use SILVA 138. The latest iteration of RDP gave awful results on our positive control: we had to lower the bootstrap cutoff to 70 to get the composition about right, which is awful.
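
To make "about right" less subjective, the observed composition of the control can be scored against its expected composition. Below is a rough Python sketch that uses Bray-Curtis dissimilarity for this (my choice of metric for illustration, not something mothur prescribes here). The taxon names and counts are placeholders, and `relative_abundance` and `bray_curtis` are hypothetical helper names.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


def bray_curtis(expected, observed):
    """Bray-Curtis dissimilarity: 0 = identical profiles, 1 = no overlap."""
    taxa = set(expected) | set(observed)
    shared = sum(min(expected.get(t, 0.0), observed.get(t, 0.0)) for t in taxa)
    total = sum(expected.values()) + sum(observed.values())
    return 1.0 - 2.0 * shared / total if total else 0.0


# Placeholder control: two genera expected at equal abundance.
expected = {"GenusA": 0.5, "GenusB": 0.5}

# Placeholder read counts from one classification run of the control.
observed_counts = {"GenusA": 50, "GenusB": 30, "unclassified": 20}

score = bray_curtis(expected, relative_abundance(observed_counts))
print(f"Bray-Curtis dissimilarity to expected composition: {score:.3f}")
```

Running the same comparison for each database and each bootstrap cutoff gives one score per combination, so the combination closest to the expected composition is easy to spot.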

Hello Alexandre,

Does positive control mean a mock community?

Best,
Jun

Yep!

Where can we get these different databases in FASTA format, other than the database files provided with mothur?

Morning. Sorry, I do not get your question.
Mothur-formatted databases are available on the GitHub under references. As for positive controls, it depends on whether you go with a commercial or a homemade solution, but in any case you can check the materials and methods of relevant articles to see what others use.

Best of success