Not a valid group error

Running mothur 1.27.0 on Ubuntu 11.04.

I’ve run into a problem with unifrac.unweighted not recognizing the groups I want it to compare. My “final.xxxx” files are all equivalent to the “final.xxxx” files produced by the Schloss SOP wiki page. At the point this script starts, I have 4 sets of 50 groups, so the group names are 9410.1, 9410.2, 9410.3, 9410.4, 9412.1, 9412.2, etc., where the “.x” at the end of the group name indicates which set the group belongs to. I want to use the unifrac commands to compare all four 9410.x groups to each other.

My script looks like this:

dist.seqs(fasta=final.fasta, cutoff=0.15, processors=2)
cluster(column=final.dist, name=final.names)
make.shared(list=final.an.list, group=final.groups, label=0.03)
count.groups(group=final.groups)
sub.sample(shared=final.an.shared, size=900)
tree.shared(shared=final.an.0.03.subsample.shared, calc=thetayc-jclass)
unifrac.unweighted(tree=final.an.0.03.subsample.thetayc.0.03.tre, groups=9410.1-9410.2-9410.3, random=t)

Everything appears fine until the unifrac command, when I get an error saying “9410.1 is not a valid group, and will be disregarded.” The error is repeated for all groups and the command defaults to doing a global comparison.
9410.4 is too small and is removed by sub.sample, so I removed it from the unifrac command, thinking it might be causing the problem. That didn’t help. I ran get.group() and all 3 of the groups that the unifrac command calls invalid show up. All of the groups show up in the .tre file as well. Any thoughts?

I suspect that what you really want to do is…

sub.sample
dist.seqs
clearcut
unifrac

What you’re currently doing is running unifrac on four groups with no replication.

Pat

I knew it wasn’t right, but that was as close as I could come. Thanks for the input.

If I define a “group” as 9410, 9412, etc, will this approach allow me to determine whether or not all of the 9410.x samples are more similar to each other than to the 9412 samples?

That’s right - if you have a design file you could put all the 9410.x, 9412.x, etc. group names in the first column and then 9410, 9412, etc. in the second. Then what you had should work.

Pat
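
For anyone who wants to build that two-column design file without typing it by hand, here is a minimal Python sketch. It assumes every sample name ends in a “.x” suffix, as in this thread, and treats everything before the last “.” as the treatment. The function names `design_rows` and `write_design` are made up for this illustration, and the output has no header line, which some mothur versions may expect, so check the design file documentation for your version.

```python
import csv
import sys


def design_rows(group_names):
    """Pair each sample name such as '9410.1' with its treatment '9410'."""
    rows = []
    for name in group_names:
        # Assumption: the text before the last '.' identifies the treatment.
        treatment, sep, _replicate = name.rpartition(".")
        if not sep:
            raise ValueError(f"no '.x' suffix in group name: {name!r}")
        rows.append((name, treatment))
    return rows


def write_design(group_names, handle):
    """Write tab-separated sample/treatment pairs, one pair per line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(design_rows(group_names))


if __name__ == "__main__":
    # Sample names taken from the thread; 9410.4 was dropped by sub.sample.
    names = ["9410.1", "9410.2", "9410.3", "9412.1", "9412.2"]
    write_design(names, sys.stdout)
```

Redirect the output to a file and each line will hold a sample name in the first column and its treatment in the second, which is the layout described above.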