OTU based analysis and general Mothur outputs

Hello,

I have a few questions about the general Mothur outputs. A bit of background:
I processed my sff file (order=B) following the SOP tutorial and ran the OTU-based analysis.

  1. I’m a bit confused about the total number of seqs that Mothur uses for the OTU analysis. I had about 8000 unique seqs, but all the OTU outputs are based on 3,800. Why is this? Is it because these are the only OTUs within the 0.03 cutoff?

mothur > summary.seqs(fasta=/home/zm1/Mothur.cen/mothur/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.fasta, name=/home/zm1/Mothur.cen/mothur/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.names)


Using 1 processors.

             Start  End  NBases   Ambigs  Polymer  NumSeqs
Minimum:     1      496  258      0       3        1
2.5%-tile:   1      497  266      0       4        1274
25%-tile:    1      497  266      0       4        12735
Median:      1      497  266      0       5        25469
75%-tile:    1      497  266      0       5        38203
97.5%-tile:  1      497  267      0       5        49663
Maximum:     1      497  282      0       8        50936
Mean:        1      497  266.032  0       4.54714

# of unique seqs: 8466

total # of seqs: 50936

Output File Names:
/home/zm1/Mothur.cen/mothur/012414MCcom519F.shhh.trim.unique.good.filter.unique.precluster.pick.pick.summary

  2. When I build the heatmap to compare across samples, instead of getting 10 samples I can only see 9. Why is this?

  3. About the classification output:

    • The final classification is only to genus level. Can I get species too? How?
    • If I want to get the OTUs at different taxonomic levels, what should I do?
    • In the final table I still get OTUs that are unclassified at the kingdom level, but I thought that Mothur had gotten rid of all seqs that it was not able to classify to kingdom level. I have about 3,00 still?
    • I tried to take one of the final sequences and classify it with RDP, but RDP says that the seq is too short to be classified. Why is this?
  4. I get a significantly different OTU output from Mothur compared with the output from my sequencing company. In general, both should look alike, but they don’t. Any suggestions? Do you think that running the sff with order=A or B changes the output significantly, or is it the alignment? For example, I get sample 1: 6800 seqs from the sequencing company vs 2100 in Mothur; sample 2: 6371 seqs vs 8557 in Mothur.

  5. I tried to use the fasta file created after chimera cleaning by my sequencing company to build the dist matrix and run the OTU analysis in Mothur, but Mothur replied that my seqs had different lengths. Does this mean that I should run the sequence processing from the beginning to be able to do the analysis?

mothur > dist.seqs(fasta=/home/zm1/Mothur.cen/mothur/012414MCcom519F-pr.fasta.tmp2c.fas, cutoff=0.15)

Using 1 processors.
[ERROR]: your sequences are not the same length, aborting.
[WARNING]: your sequence names contained ‘:’. I changed them to ‘_’ to avoid problems in your downstream analysis.

Thank you!!!

  1. I’m a bit confused about the total number of seqs that Mothur uses for the OTU analysis. I had about 8000 unique seqs, but all the OTU outputs are based on 3,800. Why is this? Is it because these are the only OTUs within the 0.03 cutoff?

Is this with subsampling? From the output you sent I’m not seeing where there are 3800 sequences
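
To track down where a number like 3,800 comes from, it can help to recount a names file directly and compare it with what summary.seqs reports. Below is a rough Python sketch, not part of mothur. It assumes the names file has two whitespace-separated columns (a representative sequence name, then a comma-separated list of every sequence it stands for); the three example rows are made up.

```python
def count_names(lines):
    """Return (unique, total) sequence counts from names-file lines.

    Assumes two whitespace-separated columns per line: the representative
    name, then a comma-separated list of the names it represents.
    """
    unique = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _representative, members = line.split(None, 1)
        unique += 1
        total += len(members.split(","))
    return unique, total


# Made-up rows for illustration only.
example = [
    "seq1\tseq1,seq4,seq5",
    "seq2\tseq2",
    "seq3\tseq3,seq6",
]
unique, total = count_names(example)
print(f"unique={unique} total={total}")
```

With a real file you would pass the open file handle instead of the example list, and the two numbers should match the "# of unique seqs" and "total # of seqs" lines from summary.seqs.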

  2. When I build the heatmap to compare across samples, instead of getting 10 samples I can only see 9. Why is this?

What version of mothur are you using and what command are you running? If you run count.groups with your shared file, how many groups are there?
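
If you want to double-check outside mothur as well, here is a rough Python sketch that lists the group names found for each label in a shared file. It assumes the usual tab-separated layout with a header row and the columns label, Group, numOtus followed by one column per OTU; the sample rows are invented for illustration.

```python
from collections import defaultdict


def groups_per_label(lines):
    """Map each label (e.g. 0.03) to the list of group names in the file.

    Assumes tab-separated rows whose first three columns are
    label, Group and numOtus; the header row is skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] == "label":
            continue  # header row or malformed line
        groups[fields[0]].append(fields[1])
    return dict(groups)


# Invented rows for illustration only.
example = [
    "label\tGroup\tnumOtus\tOtu001\tOtu002",
    "0.03\tsampleA\t2\t10\t3",
    "0.03\tsampleB\t2\t0\t7",
]
for label, names in groups_per_label(example).items():
    print(label, len(names), names)
```

If a sample is missing from this list, it was dropped before the shared file was made rather than by the heatmap command.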

  3. About the classification output:
    • The final classification is only to genus level. Can I get species too? How?

You would have to significantly enhance your reference taxonomy to get species-level information.

  • If I want to get the OTUs at different taxonomic levels, what should I do?

Use the phylotype command

  • In the final table I still get OTUs that are unclassified at the kingdom level, but I thought that Mothur had gotten rid of all seqs that it was not able to classify to kingdom level. I have about 3,00 still?

Not knowing what commands you are running or what you’re looking at in the output it’s hard to say.

  • I tried to take one of the final sequences and classify it with RDP, but RDP says that the seq is too short to be classified. Why is this?

Hmmm. How long is the sequence? I’m not sure what the RDP website is doing - we’re not connected.
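
Note that the sequences in the final fasta file are aligned, so the line length is not the number of bases: in the summary.seqs output above the sequences span alignment positions 1 to 497 but contain only about 266 bases. A small Python sketch for getting the ungapped length of one sequence, assuming the gap characters are '-' and '.' (the example sequence is made up):

```python
GAP_CHARS = "-."  # assumed gap/padding characters in the aligned fasta


def ungapped_length(aligned_seq):
    """Count the characters in an aligned sequence that are not gaps."""
    return sum(1 for ch in aligned_seq if ch not in GAP_CHARS)


def degap(aligned_seq):
    """Return the sequence with all gap characters removed."""
    return "".join(ch for ch in aligned_seq if ch not in GAP_CHARS)


# Made-up aligned sequence for illustration only.
example = "..AC-GT--A.."
print(len(example), ungapped_length(example), degap(example))
```

The ungapped length is the number to report when answering the question above.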

  4. I get a significantly different OTU output from Mothur compared with the output from my sequencing company. In general, both should look alike, but they don’t. Any suggestions? Do you think that running the sff with order=A or B changes the output significantly?

They’re likely using a different protocol. Ours is the most robust protocol in the literature at reducing sequencing errors, chimeras, and getting solid OTUs. Every variable will have an effect. Very few people perform robust benchmarking experiments. We have.

  5. I tried to use the fasta file created after chimera cleaning by my sequencing company to build the dist matrix and run the OTU analysis in Mothur, but Mothur replied that my seqs had different lengths. Does this mean that I should run the sequence processing from the beginning to be able to do the analysis?

Your sequences need to be aligned to run dist.seqs. It sounds like you need to get the sff file and barcodes from your sequence provider and start at sffinfo.
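
To see the problem that the dist.seqs error is pointing at, you can tabulate the sequence lengths in a fasta file. This is a minimal Python sketch, not mothur code, and the three example records are invented. More than one distinct length means the file cannot be an alignment.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def length_histogram(lines):
    """Count how many sequences have each length (gaps included)."""
    return Counter(len(seq) for _name, seq in read_fasta(lines))


# Invented records: two share a length, one is shorter.
example = [">seqA", "ACGT--AC", ">seqB", "ACGTTTAC", ">seqC", "ACGTAC"]
hist = length_histogram(example)
print(dict(hist))
print("same length" if len(hist) == 1 else "lengths differ")
```

Equal lengths are necessary but not sufficient: padding sequences to the same length does not make them aligned, so the sequences still need to go through the alignment step.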