I have an issue I would like your thoughts on. I finished the fungal analysis following the mothur SOP. The sequences are from the ITS2 region, and I used UNITE as the reference database.
The problem is that one of the largest OTUs has come out as k_unclassified fungi. I have the representative sequence for this particular OTU, but it has no matches or similarities in UNITE or other databases.
I’m mainly looking for yeasts and am not sure whether the UNITE database has good coverage of yeasts. I know it covers the entire ITS2 region.
Thanks for your reply.
Yes, I tried blasting the rep sequences against the UNITE, ITS NCBI, and NCBI RefSeq databases. No hits or similarities were found! Very frustrating.
That makes me wonder whether you have a chimera, though they shouldn’t be as common in ITS2. Have you looked in the literature at studies of the same type of sample? Does anyone else mention a large unidentified chunk of data?
Hi Faisal, welcome to the forum.
I would typically suggest the same, which is to try different databases.
However, before you do that, I’d suggest retracing your steps through the tutorial to make sure that the sequence you see is not an artifact of something that went wrong, e.g. a chimera or a stretch of sequence with lots of errors.
How long is your sequence?
I should have mentioned that I have 70 different samples from different sources and that the large unclassified OTU came from samples belonging to one specific group. So I do not expect that something went wrong in the steps, as samples from the other groups give clear and expected results.
The sequence is 257 bp
Can you please suggest a good database other than UNITE or NCBI databases?
Here is the representative sequence of the unclassified OTU
I did a quick BLAST on NCBI against nr, and the sequence gives a pretty good hit against a Leuconostoc phage. So don’t get frustrated: these are not the ITS you are looking for! And don’t worry, we’ll figure it out.
Take a look
OK, then it looks like these phages did come from your samples. I googled Leuconostoc, and it seems to be associated with milk, cheese, etc.
Anyway, that being said, I would align the primers you used to amplify the ITS library against the Leuconostoc phage sequences to see whether they can indeed amplify the phages as well. My guess is that they did, but just to confirm.
Then, if what happened is indeed that your ITS primers accidentally amplified phages as well, I would suggest removing these phage sequences and then working with the rest of your data…
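For anyone who wants to do the primer check described above without an alignment tool, here is a minimal Python sketch. It slides a primer along a target sequence, treats IUPAC degenerate codes as wildcards, and reports every position that matches within a chosen number of mismatches. The function names and the toy sequences are made up for illustration; they are not the primers or the phage sequence from this thread.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also covers the degenerate codes
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_sites(primer, target, max_mismatches=2):
    """Return (start, mismatches) for each window of target the primer matches.

    A primer position matches when the target base is one of the bases
    allowed by the primer's IUPAC code at that position.
    """
    primer, target = primer.upper(), target.upper()
    hits = []
    for start in range(len(target) - len(primer) + 1):
        window = target[start:start + len(primer)]
        mismatches = sum(
            base not in IUPAC.get(code, "")
            for code, base in zip(primer, window)
        )
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy example (hypothetical sequences): Y in the primer matches C or T
toy_primer = "GCATCGAY"
toy_target = "TTTGCATCGACGGA"
print(primer_sites(toy_primer, toy_target, max_mismatches=0))
```

The forward primer is searched as written; for the reverse primer, search `reverse_complement(reverse_primer)` against the same strand. If both land on the phage sequence a plausible amplicon length apart, that supports the idea that the primers amplified it.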
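To illustrate the removal step suggested above: mothur has its own remove.seqs command for dropping a list of sequence names, so the plain-Python sketch below only shows the idea. The helper names and the toy records are hypothetical, and it assumes you already have the IDs of the sequences that hit the phage.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_sequences(records, ids_to_remove):
    """Keep records whose ID (first token of the header) is not flagged."""
    kept = []
    for header, seq in records:
        seq_id = (header.split() or [""])[0]
        if seq_id not in ids_to_remove:
            kept.append((header, seq))
    return kept


# Toy data (hypothetical): seq2 stands in for a read that hit the phage
toy_lines = [">seq1 sample=A", "ACGT", "TT", ">seq2", "GGCC"]
records = list(read_fasta(toy_lines))
for header, seq in drop_sequences(records, {"seq2"}):
    print(f">{header}\n{seq}")
```

With a real file you would pass an open file handle to `read_fasta` instead of the toy list. Any count or group files that go with the FASTA would need the same IDs removed so everything stays in sync.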
I just did exactly the same as @sapou. I agree with their assessment and their suggestion to remove those seqs. I haven’t seen this before, but as it happens I’m running the first batch of ITS seqs for my users who study cheese. I’ll certainly let them know what you’ve found.
Hi. I am actually having the same problem! Mine are soil samples and some of them have up to 94% unclassified_Fungi. I will try blasting some of my unknowns to see if there is another organism or virus that I am picking up.
My question is: because mothur does not cluster OTUs for ITS sequences, is the rep OTU sequence accurate for all unknown sequences? Does it lump all unknowns into a single bin when phylotyping? Is there another way to create OTUs with ITS sequences in mothur other than phylotyping? I ran my analyses in parallel using qiime2 and found several OTUs that were classified as unknown using their de novo clustering command, whereas my mothur output gave me a single “unclassified” group.
We have actually decided to remove the samples associated with this unclassified fungi OTU, i.e. the phage. The alignment showed that the ITS primer did amplify the phage sequence. That’s disappointing!