taxa levels

Hello again

I am trying to analyze my MiSeq data at the genus or family level, or even higher, for instance to see whether there is a difference between Firmicutes and Bacteroidetes, or whether Lachnospiraceae differs between groups.

I am not quite sure what to use. For now I can take the results from classify.seqs and use the relative abundance data (LEFSE on Galaxy), but it would be better if I could do an OTU-based analysis (LEFSE) as described in the MiSeq SOP.
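For readers unsure what "relative abundance data" means here, a minimal Python sketch follows. It assumes the per-sample taxon counts are already loaded into a dictionary; the sample names and numbers are made up for illustration, and this is not a mothur or Galaxy function.

```python
def relative_abundance(counts_by_sample):
    """Convert raw taxon counts per sample into fractions that sum to 1."""
    rel = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        # Guard against empty samples to avoid dividing by zero.
        rel[sample] = {
            taxon: (n / total if total else 0.0)
            for taxon, n in counts.items()
        }
    return rel


# Hypothetical counts, for illustration only.
example = {
    "sampleA": {"Firmicutes": 60, "Bacteroidetes": 40},
    "sampleB": {"Firmicutes": 25, "Bacteroidetes": 75},
}
rel = relative_abundance(example)
```

Each sample is normalized on its own, so samples with different sequencing depths become comparable as proportions.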

I cannot figure this one out.

Thanks in advance.

Why not build a shared file at the phylum level and do the analysis on that?
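To illustrate what a shared file at the phylum level amounts to, here is a rough Python sketch that sums OTU counts into their phylum. It assumes the counts and the consensus taxonomy strings (in the usual `Bacteria(100);Firmicutes(100);...` form) are already in memory; the function names and example values are hypothetical, and this is not how mothur itself builds the file.

```python
import re
from collections import defaultdict


def taxon_at_level(taxonomy, level):
    """Return the name at 1-based rank `level` (1 = kingdom, 2 = phylum, ...)."""
    # Split on ';', drop the empty trailing piece, strip scores like "(100)".
    ranks = [re.sub(r"\(\d+\)$", "", t) for t in taxonomy.split(";") if t]
    return ranks[level - 1] if level <= len(ranks) else "unclassified"


def collapse_counts(otu_counts, otu_taxonomy, level):
    """Sum per-sample OTU counts into taxa at the requested rank."""
    collapsed = {}
    for sample, counts in otu_counts.items():
        totals = defaultdict(int)
        for otu, n in counts.items():
            totals[taxon_at_level(otu_taxonomy[otu], level)] += n
        collapsed[sample] = dict(totals)
    return collapsed


# Hypothetical data, for illustration only.
otu_taxonomy = {
    "Otu001": "Bacteria(100);Firmicutes(100);Clostridia(100);",
    "Otu002": "Bacteria(100);Bacteroidetes(100);",
    "Otu003": "Bacteria(100);Firmicutes(98);Bacilli(90);",
}
otu_counts = {"sampleA": {"Otu001": 5, "Otu002": 3, "Otu003": 2}}
phylum_counts = collapse_counts(otu_counts, otu_taxonomy, level=2)
```

The total number of reads per sample is unchanged; only the number of columns shrinks, since several OTUs fall into the same phylum.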



Thanks for the idea.

I also thought of getting the OTUs from the subsample.shared file, reclassifying them, and using the summary file as input on the Galaxy server.

I tried to build the shared file at a different level by using, if I am correct, a higher label, but it is not working: I cannot get anything other than 0.03.
I have not yet tried a higher cutoff (0.25) in the earlier cluster.split step.

But having it already implemented in the LEFSE command would save a couple of steps. Then again, it is just computer time, and while the computer runs, the mice can play (or read, or write, or have some coffee, or on occasion sleep!)

I just thought of something else.

1) Instead of using sub.sample on the shared file to create the subsample.shared that is used downstream for LEFSE:

I would create a new folder called subsample and send the output there, so as not to mix things up…

Use sub.sample with the list and count files that you would normally use in the make.shared command.

You can use this new list and new count file to create a shared file that is effectively a subsample. Just rename it subsample.shared to continue with the workflow if need be, for example to run LEFSE at the 0.03 cutoff.

You can use the same new list and count files for classify.otu, then reformat the .summary as needed and use it in LEFSE on the Galaxy web server to run LEFSE at different taxa levels.

Am I crazy?

And I had just forgotten about the command.

So I guess you can just use sub.sample on your list, count and tax files.

Then run the analysis with the subsampled tax file and subsampled count file, take the relative abundances, and use the summary as input for LEFSE on the Galaxy web server.
If you need the downstream subsample.shared file, you can still create it from the subsampled list and subsampled count files.

I will cluster today and should have some answers by next week (shared computer…). I had to restart from scratch because we changed the “story” we are going to tell in the paper and also wanted to have everything done with the same mothur version, so it is taking a long time.


And if I am crazy for doing so, please advise!

Like I said, I’d work with the original shared file at the phylum level… It seems to me like you are doing a lot of gymnastics for nothing.


The only problem is that when I use get.label on the list file, I only see the 0.03 label, so I need a workaround.

Anyway, I tried sub.sample with the taxonomy, count and list files. I then ran count.groups and realized that it had not subsampled each group down to the size I wanted, but had subsampled the total across all groups down to the size I entered. So it gave me around 300 per group instead of the number I actually wanted!

So using the subsample in this way was not a good idea anyway!

But it does work if you use persample=true…
My bad.

But when I run classify.otu, the root in the summary becomes 10 428, while I was hoping to get 14 482 x group.


Or it is me getting confused over the definition of an OTU…

Looks like I still have to debug a few things in my overall understanding after all.

this is definitely becoming a bad idea…! :oops:

Can you run the phylotype command and then make a shared file from the list file that is generated?


OK, I will do that.

I will post back when it is done.


I thought phylotype was “evil”? :twisted:

What’s next?

This is what I did next:
classify.otu, using label=1-2-3-4-5-6
making the shared file using label=1-2-3-4-5-6

It gave me one shared file.

sub.sample, using label=1-2-3-4-5-6

lefse, using tax3.subsample

I used the tx.3.cons.taxonomy for the consensus classification of the LEFSE OTU output.

So LEFSE finds fewer OTUs than when I run tax.1subsample, which is normal (well, I think so; I am getting paranoid after reading so many posts on the forum…)

But in the tax.3.cons.taxonomy, there are still OTUs classified down to the genus level. Is that normal?

You would only use the classifications down to the level you are interested in.
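In practice that means trimming each lineage to the rank of interest and ignoring the rest. A small Python sketch of this, assuming taxonomy strings in the usual `Name(score);Name(score);...` form; the function name and the example lineage are made up for illustration.

```python
import re


def truncate_lineage(taxonomy, depth):
    """Keep only the first `depth` ranks of a ';'-separated lineage."""
    ranks = [r for r in taxonomy.split(";") if r]
    # Strip the trailing confidence score, e.g. "Firmicutes(100)" -> "Firmicutes".
    names = [re.sub(r"\(\d+\)$", "", r) for r in ranks[:depth]]
    return ";".join(names)


# Hypothetical lineage, for illustration only.
lineage = ("Bacteria(100);Firmicutes(100);Clostridia(100);"
           "Clostridiales(100);Lachnospiraceae(100);Blautia(95);")
phylum_only = truncate_lineage(lineage, 2)
family_only = truncate_lineage(lineage, 5)
```

Here `phylum_only` is `Bacteria;Firmicutes`, so the genus-level part of the classification is still in the file but simply not used.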

Hope this helped,

It did, as always.

I really have to find the time to travel and meet you somewhere.

Thanks again for the support.