I’ve been trying mothur for a few days now and I think it’s a really nice piece of software.
Maybe I missed this, but it would be wonderful if mothur could produce a fasta file like the output of the get.oturep function, but for each individual habitat in a multisample analysis. Such a file could be used in downstream analyses (after inserting the sequences into arb and running a unifrac analysis) to get the unifrac distance matrix between habitats. With the current get.oturep function, I miss some representative OTUs that are shared between habitats.
Another solution would be to get the unifrac distance matrix directly from mothur, but I don’t know if that’s possible yet.
Have you found the get.sharedseqs command yet (http://www.mothur.org/wiki/Get.sharedseqs)? It’s a new command that I think does what you’re after. You can use it to find sequences that are unique to particular groups as well.
As for unifrac, my current policy is to not output a unifrac-generated distance matrix because my data suggests it doesn’t represent what unifrac proponents suggest. I published this in ISMEJ (http://www.nature.com/ismej/journal/v2/n3/abs/ismej20085a.html) last year, but it doesn’t seem to have been noticed yet. And I would love for people to engage me on this instead of ignoring it. Furthermore, I question the value of trees generated using pyrotags. I guess you could say that I think I’m trying to save people from themselves. I realize this is probably pig-headed of me, but you can probably cobble together the matrix on your own with the output we give you from unifrac.weighted and unifrac.unweighted.
Hope this helps,
Thank you very much for the reply, and sorry for the delay in mine.
The new get.sharedseqs function is quite interesting; however, I think the get.oturep function is better suited to this problem, since I’m interested only in the representative sequence of each OTU. Of course, I could use the get.sharedseqs function, but it would not be very user-friendly in this case (many habitats).
Currently I’m comparing the microbial communities of various lakes using ribosomal and functional genes. To calculate diversity indices, I join the fasta files of all the lakes, make a group file, and run mothur on the combined file. But to get the representative OTUs for each individual lake, I still have to run mothur on each lake’s fasta file separately, because I miss the shared ones in the multisample analysis. I still think it would be very useful to obtain the same file that the get.oturep function generates, but for each habitat (maybe using the group parameter). The rarefaction function already does this in a multisample analysis, since you get an individual rarefaction for each habitat. It would be great to have the same for the get.oturep function.
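For anyone following the same workflow, the merge step described above (joining the per-lake fasta files and making a group file) can be sketched in Python. This is only a minimal sketch, not part of mothur: the function name `merge_fastas` and the file names are hypothetical, and it assumes the group file is a plain two-column, tab-separated file (sequence name, then group name) and that sequence names are unique across lakes.

```python
def merge_fastas(fasta_by_group, merged_fasta, group_file):
    """Concatenate per-habitat fasta files and write a matching group file.

    fasta_by_group maps a group (habitat) name to the path of its fasta file.
    Returns the number of sequences written.
    Assumption: the group file is tab-separated, "sequence name<TAB>group".
    """
    seen = set()
    with open(merged_fasta, "w") as fasta_out, open(group_file, "w") as group_out:
        for group, path in fasta_by_group.items():
            with open(path) as handle:
                for raw in handle:
                    line = raw.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        fields = line[1:].split()
                        if not fields:
                            raise ValueError(f"empty sequence name in {path}")
                        name = fields[0]  # name ends at the first whitespace
                        if name in seen:
                            raise ValueError(f"duplicate sequence name: {name}")
                        seen.add(name)
                        group_out.write(f"{name}\t{group}\n")
                    # Always end with a newline so files without a trailing
                    # newline do not run into the next file's header.
                    fasta_out.write(line + "\n")
    return len(seen)


# Hypothetical usage:
# merge_fastas({"lakeA": "lakeA.fasta", "lakeB": "lakeB.fasta"},
#              "all_lakes.fasta", "all_lakes.groups")
```

The duplicate-name check is there because the group file can only assign each sequence name to one habitat; if two lakes reuse a name, the sequences need to be renamed before merging.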
To be continued…
In version 1.12.0, we will be adding a groups option which will allow you to get a representative sequence from each group for each OTU. Thanks for the suggestion!
Waiting impatiently for the new version…