classify.seqs taxonomy file error

Alex_Thibodeau · October 24, 2013, 2:58pm

Hello!

It is probably a newbie error, but I am having problems running classify.seqs with the greengenes database.

I am using the latest mothur GUI on Windows 7.

For reference, I am using the fasta file from the second greengenes website (I can’t find it on the mothur wiki…) and the taxonomy file available on the mothur wiki.

Here is my output.

mothur > classify.seqs(fasta=C:\Users\Alexandre et Sophie\Documents\Programme\mothur\mothurGUI\thibodeau_pouletjanv2013.trim.fasta, group=C:\Users\Alexandre et Sophie\Documents\Programme\mothur\mothurGUI\thibodeau_pouletjanv2013.groups, reference=C:\Users\Alexandre et Sophie\Documents\Programme\mothur\Greengenes\gg_13_5.fasta, taxonomy=C:\Users\Alexandre et Sophie\Documents\Programme\mothur\Greengenes\Gg_13_5_99.taxonomy, cutoff=51)

Using 1 processors.
Generating search database… DONE.
It took 1443 seconds generate search database.

Reading in the C:\Users\Alexandre et Sophie\Documents\Programme\mothur\Greengenes\Gg_13_5_99.taxonomy taxonomy… [ERROR]: ./._gg_13_5_99.pds.tax 000644 is missing the final ‘;’, ignoring.

It does not seem to be ignoring it. I have run this command multiple times using 3 processors (I have 4) for more than 24 hours without getting any message that the pipeline finished and without seeing any output files.

Everything is fine using RDP and Silva.

How can I correct this problem? I want to use greengenes because I am analysing caecal 16S chicken sequences obtained from an Ion Torrent run.

Thank you very much for your precious time.

westcott · October 25, 2013, 2:34pm

Here’s a link to mothur’s GreenGenes files, http://www.wiki.mothur.org/wiki/Greengenes-formatted_databases. From the output you posted it looks like the files are not being read correctly. Could you try setting the debug flag to see what mothur is reading?

set.dir(debug=t)
classify.seqs(…)

Alex_Thibodeau · October 28, 2013, 12:41pm

Thank you for your answer!

Alright, I will try the debug today!

I am already using the taxonomy file that you are suggesting.

On the page, there seem to be 3 files for download:

the taxonomy file for classify: greengenes reference taxonomy

the alignment file for chimera: greengenes gold alignment

the reference alignment for alignment: greengenes reference alignment

Where is the .fasta file for classify? Sorry, it seems that I cannot find it on the page! It’s probably me, but I really cannot find it!

I’ll be back at the end of the day with the debug results.

westcott · October 28, 2013, 3:27pm

If you download and unzip the Gg_13_5_99.taxonomy folder from the “greengenes reference taxonomy” link, it contains 4 files: gg_13_5_99.fasta, gg_13_5_99.gg.tax, gg_13_5_99.pds.tax and pds.notes. You can run: classify.seqs(fasta=yourSequences, reference=gg_13_5_99.fasta, taxonomy=gg_13_5_99.gg.tax).

Alex_Thibodeau · October 28, 2013, 8:26pm

Hello!

I will run the debug tonight.

I feel really stupid now, but with the taxonomy file (in the current stuff section), when I decompress the Gg_13_5_99.taxonomy.tar file I only get 1 file, which is Gg_13_5_99.taxonomy, and if I open it in Notepad I only see the taxonomy, nothing else. So weird!

I have a screenshot to prove it!

Sorry again, this is so puzzling!

Alex_Thibodeau · October 28, 2013, 8:30pm

When I look at the link using properties, I get the following URL:

http://www.wiki.mothur.org/w/images/9/9d/Gg_13_5_99.taxonomy.tgz

but the file I get is Gg_13_5_99.taxonomy.tar

Is there a difference between the .tgz and .tar file?

westcott · October 29, 2013, 12:17pm

Okay, I think I see what’s going on. Your machine likely decompressed the tgz to a tar, but did not complete the decompression all the way to the GreenGenes Folder. Can you try double clicking on the .tar file to see if your machine will decompress the file for you? If not, you will have to open a command prompt and use tar to decompress the tar file into the folder I was describing above.
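
For anyone without a command-line tar on Windows, the unpacking step described above can be sketched in Python with the standard-library tarfile module. This is only an illustration, not part of mothur: the function names looks_like_archive and unpack_fully are made up for this example, and the paths in the usage note are placeholders.

```python
import tarfile
from pathlib import Path


def looks_like_archive(path):
    """Return True if the file is a tar archive (gzipped or not)."""
    return tarfile.is_tarfile(path)


def unpack_fully(archive, dest):
    """Extract a .tgz or .tar archive into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect compression on its own, so the same call
    # works for the original .tgz and for a half-decompressed .tar.
    with tarfile.open(archive, mode="r:*") as tar:
        names = tar.getnames()
        # Newer Python versions offer a safer "data" extraction filter;
        # use it when it is available.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return names
```

Calling looks_like_archive("Gg_13_5_99.taxonomy") on a file that Notepad shows as plain taxonomy text is a quick way to tell whether it still needs unpacking; if it returns True, unpack_fully("Gg_13_5_99.taxonomy", "greengenes") should write the contained files into a greengenes folder.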

Alex_Thibodeau · October 30, 2013, 1:11pm

Morning!

I was able to get the files! The .taxonomy file that I received (from decompressing the .tar file I got from the wiki) was indeed an archive file that Windows did not detect. By opening it further with WinRAR, I was able to find the files!

They were well hidden.

I will try to run mothur with these files now and see if I still get the bug!

I will come back to you as soon as it is finished!

Alex_Thibodeau · October 30, 2013, 1:44pm

It is classifying!

Many thanks for the support!

Have a nice day!

Kate_Randall1986 · February 13, 2015, 2:07pm

Hello,
I am new to mothur and I am attempting to analyse my samples for the first time.
As far as I can see from my log file, I have followed all the steps correctly. I am now at the stage where I am trying to classify my sequences using the following:

classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)

Unfortunately, after the run an ERROR message appears at the end of the output and no wang file is generated. The ERROR message is as follows:
[ERROR]: HWI-M02748_22_000000000-AAEY5_1_2112_8851_4245 is already in your taxonomy file, names must be unique

I deleted all the trainset files, downloaded the two I need from the site again, and re-ran the classify.seqs command, but I still get the same ERROR message and no wang file to proceed to the next stage.

Any help you could give would be greatly appreciated. If you require more information, please let me know.

Best wishes,
Kate
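
Both error messages quoted in this thread (a name that is "already in your taxonomy file" and a line that "is missing the final ‘;’") can be looked for outside mothur before a long run. The sketch below is a rough pre-check, not mothur's own parser: it assumes each taxonomy line is a sequence name, a tab, and a taxonomy string, and the function name check_taxonomy is invented for this example.

```python
from collections import Counter


def check_taxonomy(lines):
    """Return (duplicate_names, names_missing_final_semicolon)."""
    names = []
    missing = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Assumed layout: name <TAB> taxonomy string ending in ';'
        name, _, taxon = line.partition("\t")
        names.append(name)
        if not taxon.rstrip().endswith(";"):
            missing.append(name)
    counts = Counter(names)
    duplicates = [name for name, count in counts.items() if count > 1]
    return duplicates, missing
```

A typical call would be `with open("trainset9_032012.pds.tax") as fh: dups, missing = check_taxonomy(fh)`. If the duplicates list contains names that look like your own reads rather than reference names, it may be worth checking that the taxonomy= argument really points at the reference file.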

westcott · February 18, 2015, 2:31pm

Can you try redownloading the training set from http://www.wiki.mothur.org/wiki/454_SOP? Also, make sure to remove the temporary files mothur makes. These should be called:

trainset9_032012.pds.8mer
trainset9_032012.pds.trainset9_032012.pds.8mer.numNonZero
trainset9_032012.pds.trainset9_032012.pds.8mer.prob
trainset9_032012.pds.tree.sum
trainset9_032012.pds.tree.train

Are you using the latest version of mothur, 1.34?
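
If it helps, removing the temporary files listed above can be scripted. This is a minimal sketch that uses exactly the file names from this post; the function name remove_stale_files and the directory in the usage note are placeholders, and the names will differ for another reference set.

```python
from pathlib import Path

# File names copied from the list above; adjust for a different training set.
STALE_FILES = [
    "trainset9_032012.pds.8mer",
    "trainset9_032012.pds.trainset9_032012.pds.8mer.numNonZero",
    "trainset9_032012.pds.trainset9_032012.pds.8mer.prob",
    "trainset9_032012.pds.tree.sum",
    "trainset9_032012.pds.tree.train",
]


def remove_stale_files(directory, names=STALE_FILES):
    """Delete the listed files if present and return the names removed."""
    removed = []
    for name in names:
        path = Path(directory) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```

Running remove_stale_files(".") in the folder that holds the training set deletes only the files named in the list and leaves the .fasta and .tax files untouched.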
