Custom Database Help

russejenn · January 29, 2021, 4:10pm

Hello,

I posted 17 days ago asking for help with creating a custom database and didn’t get a response, so I thought I would break the question down. Could someone give me an example of what the alignment file and the taxonomy file are supposed to look like after running the following code from the README for the SILVA v138 reference files (the steps before the R code)?

#generate alignment file
mv silva.full_v138.good.pcr.pick.fasta silva.nr_v138.align

#generate taxonomy file
grep '>' silva.nr_v138.align | cut -f1,3 | cut -f2 -d'>' > silva.nr_v138.full

There are no examples for any of these files, so I don’t know where I am going wrong, but I don’t think either of these files looks correct after following the steps provided. Thanks.

For context, I am trying to create an LSU SILVA database for 23S alignment.

pschloss · February 2, 2021, 2:30pm

Hi,

After the mv step, you will have a new file called silva.nr_v138.align that has the same contents as what was in silva.full_v138.good.pcr.pick.fasta.

After the grep step, you will have a file that contains two columns - one with the sequence name and a second with the taxonomy string for that sequence.
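
As a rough illustration of what those two steps produce, the sketch below runs the same grep/cut pipeline on a tiny made-up file. It assumes the header fields are separated by tabs (name, a second field, taxonomy), which is what cut -f1,3 relies on. The file names, the second sequence, and its taxonomy are invented for the example.

```shell
# Hypothetical two-sequence alignment; header fields are tab-separated (an assumption).
workdir=$(mktemp -d)
printf '>AB003380.FibSuc60\t100\tBacteria;Fibrobacterota;Fibrobacteria;\n' > "$workdir/demo.align"
printf '..AC-GU..\n' >> "$workdir/demo.align"
printf '>SEQ0002.Demo\t100\tBacteria;Example_phylum;Example_class;\n' >> "$workdir/demo.align"
printf '..AC-GA..\n' >> "$workdir/demo.align"

# Same pipeline as the README step, run on the demo file:
# keep header lines, keep tab fields 1 and 3, then drop the leading '>'.
grep '>' "$workdir/demo.align" | cut -f1,3 | cut -f2 -d'>' > "$workdir/demo.full"

# Each line should now be: sequence name, a tab, the taxonomy string.
cat "$workdir/demo.full"
```

With tab-separated headers, the second field is dropped and only the name and taxonomy remain.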

If you’re getting an error message, could you post it (or part of it)?

Pat

russejenn · February 3, 2021, 4:18pm

Hello!

I am not getting error messages, but I’ve never successfully gone through the MiSeq SOP with the files I made, so I suspect something is wrong.

My .align file looks like this (each sequence is a whole row of dots, and I don’t know if that’s right):

AB003380.FibSuc60 100 Bacteria;Fibrobacterota;Fibrobacteria;Fibrobacterales;Fibrobacteraceae;Fibrobacter;
…
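
A quick way to tell whether a row that looks like all dots still contains bases is to count the non-gap characters on each sequence line. This is only a sketch: the demo file and its short sequence are made up, and it assumes one sequence per line with '.' and '-' as the only gap characters.

```shell
# Hypothetical one-sequence alignment: runs of '.' padding around a few bases.
demo_align=$(mktemp)
printf '>AB003380.FibSuc60\t100\tBacteria;Fibrobacterota;\n' > "$demo_align"
printf '..........AC-GU--AC..........\n' >> "$demo_align"

# For each sequence line, print the total number of columns,
# then the number of characters left after removing '.' and '-'.
summary=$(grep -v '>' "$demo_align" | awk '{ cols = length($0); gsub(/[.-]/, ""); print cols, length($0) }')
echo "$summary"
```

A second number of 0 would mean the line holds nothing but gap characters.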

My .tax file ends up looking like this after the R script. This file definitely seems like it’s being made incorrectly somewhere, because I don’t think it’s supposed to have NAs.
AB003380.FibSuc60 NA
CP017688.FlaCra15 NA
LS483298.Str11364 NA

Thank you!

pschloss · February 4, 2021, 9:54pm

Hi there,

I wonder if your export file from SILVA has a problem. I don’t think that “100” should be there. Also, you know that we provide the output of this pipeline at Silva reference files, right?
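
One way to inspect the export, sketched here on made-up headers, is to count the tab-separated fields in each header line and see what the cut step does with them. The demo file and sequence names are hypothetical. The only behaviour relied on is that cut -f prints a line unchanged when it contains no delimiter.

```shell
# Two hypothetical headers: one tab-separated, one space-separated.
demo_headers=$(mktemp)
printf '>seqA.demo\t100\tBacteria;Example_phylum;\n' > "$demo_headers"
printf '>seqB.demo 100 Bacteria;Example_phylum;\n' >> "$demo_headers"

# Number of tab-separated fields in each header line.
field_counts=$(grep '>' "$demo_headers" | awk -F'\t' '{ print NF }')
echo "$field_counts"

# The README's cut step: a header with no tabs passes through whole,
# so the space-separated line keeps its "100".
cut_output=$(grep '>' "$demo_headers" | cut -f1,3 | cut -f2 -d'>')
echo "$cut_output"
```

If the real headers report a single field, the separators are not tabs and the second field will survive into the taxonomy file.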

Pat
