Understanding my results!!!!

RPksu · February 7, 2014, 9:26pm

Hi everyone, I am pretty new in this sequencing/mothur world, so I am sorry if my question sounds too basic. What I am trying to understand here is the difference between the make.shared files produced with label=unique and label=0.03. I thought I was getting more OTUs using 0.03 than with unique reads, but it seems to be the opposite, right?

Can I just pick the label that gives me more OTUs? Does it really matter if I use UNIQUE or 0.01 or 0.03? I am quite confused with those numbers!!!

Thanks in advance

pschloss · February 10, 2014, 5:24pm

It all matters. label=0.03 is the same as a 97% similarity cutoff. label=unique means that to be in an OTU everything has to be identical, which I doubt is what you want. You’ll have more OTUs at label=unique than you will at label=0.03.
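To make the difference between the two labels concrete, here is a minimal Python sketch. It is not mothur's clustering method: it assumes equal-length, already-aligned toy sequences and uses simple single-linkage grouping, and the names `distance`, `count_otus`, and `mutate` are made up for illustration. It only shows why a cutoff of 0 (unique) can never give fewer groups than a cutoff of 0.03.

```python
from itertools import combinations


def distance(a, b):
    """Fraction of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example assumes equal-length aligned sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def count_otus(seqs, cutoff):
    """Count single-linkage groups at a distance cutoff (illustration only)."""
    uniq = sorted(set(seqs))          # identical reads always share a group
    parent = list(range(len(uniq)))   # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(uniq)), 2):
        if distance(uniq[i], uniq[j]) <= cutoff:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(uniq))})


def mutate(seq, positions):
    """Hypothetical helper: change the base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


base = "ACGT" * 25                      # 100 bases
reads = [
    base,
    base,                               # exact duplicate of the first read
    mutate(base, [10]),                 # 1% different from base
    mutate(base, [20, 30]),             # 2% different from base
    mutate(base, range(50, 60)),        # 10% different from base
]

print("unique:", count_otus(reads, 0.0))
print("0.03:  ", count_otus(reads, 0.03))
```

In this toy data the near-identical reads merge at 0.03 but stay separate at unique, so picking the label with the most OTUs just means picking the strictest definition, not the best one.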

RPksu · February 11, 2014, 8:14pm

Thanks Dr. Schloss.

RPksu · February 11, 2014, 8:34pm

Another question:

When I trim my reads using keepfirst=300, I always get some weird results and never get a good BLAST match for my fungal community. Then I tried trimming my reads to 250 and 200 instead, and the results were much better. My best result was actually with 200, and my diversity was much higher with that. So why am I getting good results with 250 or 200 but not with 300? The mean length of my reads is 360 bp. Am I missing something? Is that normal?

Here is the summary from my data:
mothur > summary.seqs(fasta=SEED.fasta, processors=2)

            Start  End      NBases   Ambigs    Polymer  NumSeqs
Minimum:    1      247      247      0         3        1
2.5%-tile:  1      317      317      0         4        313646
25%-tile:   1      359      359      0         4        3136454
Median:     1      359      359      0         4        6272908
75%-tile:   1      359      359      0         4        9409361
97.5%-tile: 1      430      430      2         13       12232169
Maximum:    1      501      500      67        250      12545814
Mean:       1      360.094  360.093  0.283635  4.5508

# of Seqs: 12545814

Here is what I used for trimming:

trim.seqs(fasta=SEED.fasta, oligos=SEED.oligos, maxambig=0, maxhomop=8, flip=T, bdiffs=0, pdiffs=2, minlength=250, maxlength=400, keepfirst=250, processors=2)

Thanks!!!

westcott · February 24, 2014, 2:46pm

The keepfirst parameter in trim.seqs removes all bases after the position you set keepfirst to. My best guess would be that the portion of the sequences between position 200 and the end contains more errors. These errors are preventing you from getting a good match with BLAST and good results in your downstream analysis. Removing that section of the sequence improved your results. Do you have quality data for these sequences? If you do, you could run something like Pat recommends in the SOP:

trim.seqs(fasta=SEED.fasta, oligos=SEED.oligos, qfile=SEED.qual, maxambig=0, maxhomop=8, flip=T, bdiffs=0, pdiffs=2, qwindowaverage=35, qwindowsize=50, processors=2)
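For intuition about the two trimming strategies, here is a rough Python sketch. It is not mothur's code. `keep_first` mimics a fixed-length cut like keepfirst, and `trim_by_window_average` is a hypothetical take on a sliding-window quality trim in the spirit of qwindowaverage/qwindowsize. Exactly where mothur cuts the read relative to the failing window is an assumption here, so treat the cut position as illustrative.

```python
def keep_first(seq, n):
    """Fixed-length cut: keep only the first n bases, whatever their quality."""
    return seq[:n]


def trim_by_window_average(seq, quals, window_size=50, min_average=35):
    """Cut the read where the rolling mean quality first drops too low.

    Assumption for this sketch: the read is cut at the start of the first
    window whose mean quality is below min_average. mothur may place the
    cut differently.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality scores must have equal length")
    if not quals:
        return seq, quals
    last_start = max(len(quals) - window_size, 0)
    for start in range(last_start + 1):
        window = quals[start:start + window_size]
        if sum(window) / len(window) < min_average:
            return seq[:start], quals[:start]
    return seq, quals


# Toy read: 200 good bases followed by 100 poor ones (made-up scores).
seq = "A" * 300
quals = [40] * 200 + [10] * 100

fixed = keep_first(seq, 200)
trimmed_seq, trimmed_quals = trim_by_window_average(seq, quals)

print(len(fixed), len(trimmed_seq))
```

The point of the quality-based version is that each read gets cut where its own quality falls off, rather than every read being cut at the same position.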
