Sorry if this question was answered before: I am looking for a command that gives me the number of sequences in each OTU and its taxonomy (not consensus taxonomy).
i.e. OTU-ID - No. of sequences - assigned taxonomy
I know about the basis=sequence and basis=otu options in classify.otu, but those only give me either the number of sequences or the number of OTUs per consensus taxonomy.
Big thanks in advance!
Sorry, I asked this question completely wrong.
What I am looking for is the No. of sequences within each OTU and its taxonomy per sample. With persample=true, right?
I have now started classify.otu with the persample parameter, but it has been running for a week and is using nearly all of our RAM (128 GB). The samples contain only 100,000 sequences in total, so the dataset is not really large. Is this much time and RAM usual?
Thanks in advance!
I wouldn’t use the persample option. You might want to try this…
Thank you Pat!
For this, I ran get.oturep with the column, name, group and list parameters and label=0.03, but it has been reading the matrix for 2 days now.
If you or anyone else has any suggestions, I would be very grateful.
If I understand you correctly, all you need to do is open your shared file and copy it into Excel. This shows you the number of sequences per OTU per group. I also run classify.otu on the basis of sequences and add the group option. This gives you the OTUs and the classification you need. I usually merge the two files, i.e. the shared and taxonomy files, into one Excel sheet, and there is a lot you can do with that file. Note: you might need to transpose the shared file before you open it.
I hope that helps.
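If you would rather not do the merge by hand in Excel, the same idea can be sketched in a few lines of Python. This is only a rough sketch under assumptions: that the shared file is tab-delimited with the columns label, Group, numOtus followed by one column per OTU, that it holds a single label (e.g. 0.03), and that the taxonomy file has the columns OTU, Size and Taxonomy. The function names and the toy data are made up for illustration, so check them against your own files.

```python
import csv
import io


def read_shared(handle):
    """Return {otu: {group: count}} from a shared-style table (assumed layout)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Assumed layout: label, Group, numOtus, then one column per OTU.
    otu_names = [name for name in header[3:] if name]
    counts = {otu: {} for otu in otu_names}
    for row in reader:
        if not row:
            continue
        group = row[1]
        for otu, value in zip(otu_names, row[3:]):
            counts[otu][group] = int(value)
    return counts


def read_taxonomy(handle):
    """Return {otu: taxonomy string} from a table with OTU and Taxonomy columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["OTU"]: row["Taxonomy"] for row in reader}


def merge(counts, taxonomy):
    """Build one row per OTU: counts per group, total, then taxonomy."""
    groups = sorted({g for per_group in counts.values() for g in per_group})
    rows = [["OTU"] + groups + ["Total", "Taxonomy"]]
    for otu, per_group in counts.items():
        values = [per_group.get(g, 0) for g in groups]
        rows.append([otu] + values + [sum(values), taxonomy.get(otu, "unknown")])
    return rows


# Toy data standing in for the real files (hypothetical values).
shared_text = (
    "label\tGroup\tnumOtus\tOtu001\tOtu002\n"
    "0.03\tsampleA\t2\t5\t0\n"
    "0.03\tsampleB\t2\t3\t7\n"
)
taxonomy_text = (
    "OTU\tSize\tTaxonomy\n"
    "Otu001\t8\tBacteria(100);Firmicutes(100);\n"
    "Otu002\t7\tBacteria(100);Proteobacteria(98);\n"
)

table = merge(
    read_shared(io.StringIO(shared_text)),
    read_taxonomy(io.StringIO(taxonomy_text)),
)
for row in table:
    print("\t".join(str(cell) for cell in row))
```

For real data you would replace the `io.StringIO(...)` objects with `open("yourfile", newline="")` handles. The result already has OTUs as rows, so no transposing is needed afterwards.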
In get.oturep, instead of using a distance matrix, use the method=abundance approach.