Get.groups(shared=) -- making a shared file that merges sample ids with grouping factors to use as the input for get.groups command

Hi, I’m trying to see if I can use the get.groups command to separate out my samples. I’ve checked out the get.groups wiki page and downloaded the folder for the examples. However, the wiki example of using get.groups with a shared file uses an “abrecovery.fn.shared” file, which I can’t seem to find in the example folder I’ve downloaded. I’m not sure of the layout of this shared file or how it would attach the OTUs to the sample ids as well as their grouping factors.

In my own work I’ve generated a shared file of my samples following subsampling and here is where I would like to separate out my groups of data.

I’ve run through the pipeline up to this point with my sample ids as 1, 2, 3, 4, etc.
My sample ids correspond to different grouping factors, i.e.
1 – A
2 – B
3 – C

I can’t seem to figure out how to take the shared file that associates my samples with their OTUs (i.e. stability.opti_mcc.subsample.shared from the SOP) and create a file that links each sample id and its OTUs to the sample’s grouping factor. I could then use the get.groups command to generate a new shared file with the samples from the grouping factors I don’t want removed.

Many thanks!!


To run get.groups/remove.groups you would need to give it the actual group names as found in the shared file. If you run count.groups(shared=my_shared.shared) that should output the number of sequences in each group, but more importantly give you the names of the groups. From there you can make your own accnos file to grab the samples you want.
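As a rough illustration of that suggestion, here is a minimal Python sketch that pulls the group names out of a shared file and builds the text of an accnos file (one sample name per line) from a sample-to-treatment mapping. The function names, the mapping, and the `keep_A.accnos` file name mentioned afterwards are all hypothetical; the sketch assumes a whitespace-delimited shared file with the sample name in the second column.

```python
def groups_in_shared(shared_text):
    """Return the sample (group) names found in the second column of a shared file."""
    names = []
    for line in shared_text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row that starts with "label".
        if len(fields) < 3 or fields[0] == "label":
            continue
        if fields[1] not in names:
            names.append(fields[1])
    return names


def accnos_for_treatments(sample_to_treatment, wanted):
    """Build accnos text: one sample name per line for samples in the wanted treatments."""
    keep = [s for s, t in sample_to_treatment.items() if t in wanted]
    return "\n".join(keep) + "\n"


# Hypothetical mapping taken from the question (1 - A, 2 - B, 3 - C).
sample_to_treatment = {"1": "A", "2": "B", "3": "C"}

# Text of an accnos file that keeps only the treatment A samples.
accnos_text = accnos_for_treatments(sample_to_treatment, {"A"})
```

Writing `accnos_text` to a file such as `keep_A.accnos` and passing it through the accnos option, for example get.groups(shared=my_shared.shared, accnos=keep_A.accnos), should then select those samples, assuming the names match the ones count.groups reports.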


OK I’m so sorry, I think there’s probably a really easy way to do this and I’m just not getting it, I’m really sorry.

So right now my samples are “grouped” by their identifiers, i.e. 1, 2, 3, 4, etc., as opposed to the treatment they are in (A, B or C). So the way I’m using the remove.groups command right now, I have to put each sample id (1-2-3-4 etc.) into the command, i.e.

[remove.groups(shared=stability.subsample.shared, groups=1-2-3-4-etc)].

I’m just wondering if there is a way to create a shared file with the sample ids associated with the treatments, so that instead of inputting each sample id into the command I can just do something like:
remove.groups(shared=stability.subsample.shared, groups=A-B)
I’m worried I might miss samples if I identify each sample individually.

The wiki example definitely does this in the shared option on the page:

[mothur > remove.groups(shared=abrecovery.fn.shared, groups=B-C)]

but I can’t find that shared file (abrecovery.fn.shared) in the example folder to see how it is set up. My shared files all list the individual sample ids with their OTUs, not grouped into A vs B vs C. Can I just go into a shared file and make an additional column with the group each sample is in??

Sorry Dr. Schloss, so many questions I know!! I really, really, really appreciate the help!! I’ll get this figured out one day… hopefully…



The design file allows you to assign samples to treatments. It’s similar to a group file. You can use the design file with sets to remove all groups associated with a treatment. To use the design option, follow this example:

mothur > remove.groups(shared=final.shared, design=final.design, sets=B)

final.shared might look like:

label Group numOtus Otu01 Otu02 Otu03 Otu04 Otu05 Otu06 Otu07 Otu08 Otu09 Otu10 …
0.10 forest 55 0 5 2 3 1 1 3 3 1 0 …
0.10 pasture 55 7 2 5 1 3 2 0 0 1 2 …
0.10 garden 55 0 5 2 3 1 1 3 3 1 0 …
0.10 swamp 55 7 2 5 1 3 2 0 0 1 2 …

final.design might look like:

group treatment
forest B
pasture B
garden C
swamp C
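To make the link between treatments and samples concrete, the following Python sketch reads the group/treatment table above and lists the samples that belong to a chosen set. It is only an illustration of the lookup, not mothur's own code; the function name `groups_in_sets` is hypothetical, and the sketch assumes a two-column, whitespace-delimited design file with a header row.

```python
def groups_in_sets(design_text, sets):
    """Return the sample names whose treatment is one of the requested sets."""
    groups = []
    for line in design_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "group treatment" header row.
        if len(fields) != 2 or fields[0].lower() == "group":
            continue
        sample, treatment = fields
        if treatment in sets:
            groups.append(sample)
    return groups


# The design table from the example above.
design_text = """group treatment
forest B
pasture B
garden C
swamp C
"""

to_remove = groups_in_sets(design_text, {"B"})
print("-".join(to_remove))  # prints: forest-pasture
```

With this design table, asking for set B selects forest and pasture, so the result should match spelling the samples out by hand with groups=forest-pasture. For the numbered samples in the question, the same two-column layout would pair each sample id (1, 2, 3, ...) with its treatment (A, B or C).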