Number of sequences per OTU


Sorry if this question was answered before: I am looking for a command that gives me the number of sequences in each OTU and its taxonomy (not consensus taxonomy).

i.e. OTU-ID - No. of sequences - assigned taxonomy

I know the basis=sequence and basis=otu options in classify.otu, but those only give me either the number of sequences or the number of OTUs per consensus taxonomy.

Big thanks in advance!

classify.otu ?

Sorry, I asked this question completely wrong. :oops:

I was looking for the No. of sequences within each OTU and its taxonomy per sample. That would be persample=true, right?

I have now started classify.otu with the persample parameter, but it has already been running for a week and is using nearly all of our RAM (128 GB). The samples contain only 100,000 sequences in total, so the dataset is not really large. Is this amount of time and RAM normal? :?

Thanks in advance!

I wouldn’t use the persample option. You might want to try this…

Thank you Pat!

For this, I ran get.oturep with column, name, group, list and label=0.03, but it has been reading the matrix for 2 days now. :confused:

If you or anyone else has any suggestions, I would be very grateful.

If I understand you correctly, all you need to do is open your shared file and copy it into Excel. This shows you the number of sequences per OTU per group. I also run classify.otu on the basis of sequences and add the group option. This gives you the OTUs and the classification you need. I usually merge the two files, i.e. the shared and taxonomy files, into one Excel sheet, and there is so much you can do with that file. Note: you might need to transpose the shared file before you open it.
I hope that helps.
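
If you would rather not do the transpose and merge by hand in Excel, here is a rough Python sketch of the same idea. It assumes the usual tab-delimited layouts: a shared file with the columns label, Group, numOtus followed by one column per OTU, and a taxonomy file with the columns OTU, Size, Taxonomy. The function names and the tiny sample data are made up for illustration, so check them against your own files before relying on it.

```python
import csv
import io


def read_shared(handle, label="0.03"):
    """Return (groups, counts) with counts[otu][group] = number of sequences."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Columns 0-2 are label, Group, numOtus; drop empty cells left by a trailing tab.
    otu_ids = [name for name in header[3:] if name]
    groups = []
    counts = {otu: {} for otu in otu_ids}
    for row in reader:
        if not row or row[0] != label:
            continue  # skip blank lines and other distance labels
        group = row[1]
        groups.append(group)
        for otu, value in zip(otu_ids, row[3:]):
            counts[otu][group] = int(value)
    return groups, counts


def read_taxonomy(handle):
    """Return a dict mapping OTU id to its taxonomy string."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["OTU"]: row["Taxonomy"] for row in reader}


def merge_tables(groups, counts, taxonomy):
    """Build one row per OTU: id, count per group, total, taxonomy."""
    rows = [["OTU", *groups, "Total", "Taxonomy"]]
    for otu, per_group in counts.items():
        values = [per_group.get(group, 0) for group in groups]
        rows.append([otu, *values, sum(values), taxonomy.get(otu, "unclassified")])
    return rows


# Made-up example data standing in for the real shared and taxonomy files.
SHARED = (
    "label\tGroup\tnumOtus\tOtu001\tOtu002\n"
    "0.03\tsampleA\t2\t10\t0\n"
    "0.03\tsampleB\t2\t4\t7\n"
)
TAXONOMY = (
    "OTU\tSize\tTaxonomy\n"
    "Otu001\t14\tBacteria(100);Firmicutes(100);\n"
    "Otu002\t7\tBacteria(100);Proteobacteria(98);\n"
)

groups, counts = read_shared(io.StringIO(SHARED))
taxonomy = read_taxonomy(io.StringIO(TAXONOMY))
table = merge_tables(groups, counts, taxonomy)
for row in table:
    print("\t".join(str(cell) for cell in row))
```

For real data you would pass open file handles instead of the io.StringIO objects. The result has one row per OTU with the sequence count in each group, the total, and the taxonomy, so it is already transposed relative to the shared file.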

In get.oturep, instead of using a distance matrix, use the method=abundance approach.