OTU classification

laszlo · August 2, 2016, 2:41am

Hi,

I am analysing an environmental sample, focusing on one particular species. Before DNA extraction, I confirmed by culturing that this species is present in the sample. I sequenced the ITS1 region on the Illumina platform, then used the mothur pipeline with the most recent UNITE database to identify the OTUs in the sample. More than 1000 OTUs were identified, but not the species of interest. I repeated the experiment, assuming that something had gone wrong during either DNA extraction or sequencing, but I got the same result. I then inserted the ITS1 sequences of the species of interest into the fastq files to see whether the problem lies in the data processing. To my surprise, the species still went undetected. If I BLAST this sequence against the UNITE database, it comes up as the first hit, so it should be identified in the sample after data processing.

I would greatly appreciate your comments.

Thanks a lot.

Kind regards.

Laszlo

Kendra · August 2, 2016, 2:31pm

Have you tried qPCR on your original sample to see how abundant it is? The other thing to check is how well the primers hit it: do in silico PCR on your target organism and see if there are many mismatches.

Oh, and since it’s ITS: how long is ITS2 for your target? Both 454 and Illumina sequencing are biased towards shorter sequences (454 was really biased; not sure how biased Illumina is).
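The in silico primer check suggested above can be sketched in a few lines of Python. This is only a minimal illustration, not a replacement for a dedicated in silico PCR tool: the function names and the toy primer and template are made up for the example, and it scans one strand only (a reverse primer would have to be reverse-complemented first).

```python
# IUPAC nucleotide codes mapped to the bases each code can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the template base is not allowed by the primer code."""
    return sum(
        1
        for p, s in zip(primer.upper(), site.upper())
        if s not in IUPAC.get(p, "")
    )


def best_primer_site(primer, template):
    """Slide the primer along the template; return (start, mismatches) of the best window."""
    k = len(primer)
    if len(template) < k:
        raise ValueError("template is shorter than the primer")
    hits = [
        (count_mismatches(primer, template[i:i + k]), i)
        for i in range(len(template) - k + 1)
    ]
    mismatches, start = min(hits)  # fewest mismatches, leftmost on ties
    return start, mismatches


# Toy primer and template, invented for illustration only.
toy_primer = "ACGTRYAC"
toy_template = "TTTACGTGCACGGG"
start, mismatches = best_primer_site(toy_primer, toy_template)
print(f"best site starts at {start} with {mismatches} mismatch(es)")
```

Several mismatches at the best site, especially near the 3' end of the primer, would suggest that the target amplifies poorly.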

laszlo · August 3, 2016, 7:02am

Hi,

Thanks for your reply. I agree that there could be a problem with the DNA extraction, amplification, sequencing, etc., but my concern is more with the data processing. As I mentioned, I inserted the ITS sequences in abundance into “created” fastq files before data processing, but they were still not identified. So even if they are present in the sample, they remain undetected for some reason.
Any idea what could cause this?

Thank you so much.

Regards.

Laszlo

Kendra · August 3, 2016, 1:53pm

Look at the number of bases that differ between the inserted sequence, the representative sequence in the database, and the sequence that the inserted sequence is hitting. If they are all very similar, you may not have enough resolution to differentiate the species that you want from the species that is in the database.
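One way to make this comparison concrete is to count mismatching positions between each pair of sequences. The sketch below assumes the three sequences have already been aligned to the same length; the labels and the short sequences are placeholders, not real ITS data.

```python
from itertools import combinations


def count_differences(a, b):
    """Number of positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def pairwise_differences(seqs):
    """Map each pair of sequence names to its number of differing positions."""
    return {
        (n1, n2): count_differences(seqs[n1], seqs[n2])
        for n1, n2 in combinations(seqs, 2)
    }


# Placeholder aligned sequences, for illustration only.
aligned = {
    "inserted": "ACGTACGTAC",
    "target_ref": "ACGTACGTAC",
    "assigned_ref": "ACGTTCGTAA",
}
for pair, n_diff in pairwise_differences(aligned).items():
    print(pair, n_diff)
```

If the inserted sequence is only one or two bases from both references, the classifier has very little signal to separate the two species.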

laszlo · August 12, 2016, 1:50am

Hi,

I have been using artificial data. The clustering goes well, meaning that all the sequences I am interested in are clustered together into one OTU. However, I am having a problem classifying them. I tried both the distance and the blast methods, but my sequence is still misidentified: a closely related species comes up instead. When I check the distance matrix, the K2P distance between the OTU and the sequence it is assigned to is 0.066, which should be enough to separate them. Moreover, if I BLAST the OTU against any database, it comes up as the species I want. So theoretically the OTU should be identified correctly. Do you have any suggestions on how to refine or adjust the classify.seqs function to get the correct ID?

Thanks a lot.

Laszlo
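For readers who want to double-check a K2P value such as the 0.066 mentioned above, the Kimura two-parameter distance is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. The sketch below is a plain Python illustration of that formula under stated assumptions: sequences are pre-aligned, positions with gaps or ambiguous bases are skipped, and the function name is made up. It is not how mothur itself computes distances.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy aligned pair with a single transition (A -> G) in ten sites.
print(round(k2p_distance("ACGTACGTAC", "GCGTACGTAC"), 4))
```

Comparing the value for the OTU against the assigned reference with the value against the expected reference shows whether the classifier is really picking the closer sequence.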
