First, apologies if this is not the best index for this post.
I am not sure if anyone has investigated taxa co-occurrence, but I have been trying to perform this type of analysis at different OTU definitions for my dataset.
Briefly, I have 15 samples.
I start with an OTU-by-sample matrix, eliminate singletons, and convert the matrix to presence (1) and absence (0) data only (no abundance). This is shown in the first panel of the attached figure, with samples along the horizontal axis and OTUs along the vertical axis. Again, I do this at multiple OTU definitions.
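For anyone who wants to reproduce this step, here is a minimal Python sketch of the conversion. It assumes the counts are held in a plain dict of OTU name to per-sample read counts, and it assumes "singleton" means an OTU with a single read in total across all samples; the function name and the example data are made up for illustration.

```python
def to_presence_absence(counts):
    """Drop singleton OTUs and convert read counts to 1/0 presence-absence.

    counts: dict mapping OTU name -> list of per-sample read counts.
    """
    presence = {}
    for otu, row in counts.items():
        # Assumption: a singleton is an OTU with one read in total.
        if sum(row) <= 1:
            continue
        presence[otu] = [1 if c > 0 else 0 for c in row]
    return presence


# Hypothetical toy data: 3 OTUs across 3 samples.
example = {
    "Otu1": [5, 0, 2],
    "Otu2": [0, 1, 0],  # one read in total, so it is dropped
    "Otu3": [0, 3, 3],
}
pa_matrix = to_presence_absence(example)
```

If singletons are defined differently in your pipeline (for example, OTUs seen in only one sample), only the `if` condition needs to change.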
I then wrote a Perl script that builds a matrix of all pairwise comparisons between OTUs (second panel). The number of pairwise comparisons is N(N-1)/2, where N is the number of OTUs, which is roughly N^2/2. As you can imagine, this number gets fairly large.
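To make the size of the problem concrete, here is a small Python sketch that enumerates unordered OTU pairs and counts them. The OTU names are placeholders, and this is not the Perl script described above, just the same idea using the standard library.

```python
from itertools import combinations
from math import comb

# Placeholder OTU names for illustration.
otus = ["Otu1", "Otu2", "Otu3", "Otu4"]

# Every unordered pair of distinct OTUs, each listed once.
pairs = list(combinations(otus, 2))

# The same count without enumerating: N choose 2 = N(N-1)/2.
n_pairs = comb(len(otus), 2)

# The count grows quadratically with the number of OTUs.
n_pairs_1000 = comb(1000, 2)  # 499500 pairs for 1000 OTUs
```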
Finally, using another Perl script, I calculated the number of presence-presence (PP), presence-absence (PA), absence-presence (AP) and absence-absence (AA) cases for each pairwise comparison. Since I have 15 samples, the four counts in each row always sum to 15 (see third panel in attached figure). Since many taxa are rare, AA is relatively high, especially at more stringent OTU definitions (e.g., 100%).
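The counting step for a single pair could look like the following Python sketch. The two presence/absence vectors are invented examples with 15 samples each, and the function name is hypothetical.

```python
def cooccurrence_counts(a, b):
    """Return (PP, PA, AP, AA) for two equal-length 0/1 vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must cover the same samples")
    pp = pa = ap = aa = 0
    for x, y in zip(a, b):
        if x and y:
            pp += 1      # both OTUs present in this sample
        elif x:
            pa += 1      # only the first OTU present
        elif y:
            ap += 1      # only the second OTU present
        else:
            aa += 1      # neither OTU present
    return pp, pa, ap, aa


# Invented example: two rare OTUs across 15 samples.
otu_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
otu_b = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
counts = cooccurrence_counts(otu_a, otu_b)  # (2, 1, 1, 11)
```

The four counts always add up to the number of samples, which is a handy sanity check on the output.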
So, what I am looking for is statistically meaningful taxa combinations: for example, pairs of taxa that co-occur more often than expected by chance, or less often than expected by chance. A similar analysis was performed by Chaffron et al. (2010), http://genome.cshlp.org/content/20/7/947, who used Fisher’s exact test to choose meaningful OTU pairs.
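As a starting point for implementing the test, here is a sketch of Fisher's exact test on one PP/PA/AP/AA table, written with only the Python standard library. It is not taken from Chaffron et al. and may differ from their exact procedure; the function name is hypothetical, and the two-sided p-value uses the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb


def fisher_exact_2x2(pp, pa, ap, aa):
    """Return (p_more, p_less, p_two_sided) for a 2x2 co-occurrence table."""
    n = pp + pa + ap + aa
    n_a = pp + pa  # samples containing the first OTU
    n_b = pp + ap  # samples containing the second OTU
    denom = comb(n, n_b)

    def prob(k):
        # Hypergeometric probability of exactly k shared samples.
        return comb(n_a, k) * comb(n - n_a, n_b - k) / denom

    lo = max(0, n_a + n_b - n)
    hi = min(n_a, n_b)
    probs = {k: prob(k) for k in range(lo, hi + 1)}

    p_more = sum(p for k, p in probs.items() if k >= pp)  # co-occur more
    p_less = sum(p for k, p in probs.items() if k <= pp)  # co-occur less
    cutoff = probs[pp] * (1 + 1e-9)  # tolerance for floating-point ties
    p_two = min(1.0, sum(p for p in probs.values() if p <= cutoff))
    return p_more, p_less, p_two


# Example table: PP=2, PA=1, AP=1, AA=11 (15 samples in total).
p_more, p_less, p_two = fisher_exact_2x2(2, 1, 1, 11)
```

For this example the one-sided "co-occur more" p-value is 37/455, about 0.081, so two OTUs that each appear in 3 of 15 samples and share 2 of them would not be significant at 0.05, even before any correction for multiple testing.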
My problem is that I am not sure how to implement this test OR whether I have enough data to perform such an analysis. If anyone is interested in discussing this, or has any ideas on this analysis, please let me know. This kind of analysis is very common in macro-ecology studies. Also, is this an analysis that mothur would be interested in implementing?
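On the question of whether 15 samples are enough, one rough way to check is to compute the smallest one-sided p-value that Fisher's exact test can possibly return for OTUs with given occurrence counts. The sketch below does that in Python. The function name is hypothetical, and the Bonferroni correction at 0.05 is only an assumed, deliberately conservative choice to illustrate the multiple-testing problem, not a recommendation.

```python
from math import comb


def min_one_sided_p(n_samples, k_a, k_b):
    """Smallest 'co-occur more than expected' p-value for two OTUs
    present in k_a and k_b of n_samples samples."""
    small, large = sorted((k_a, k_b))
    # Most extreme table: the rarer OTU occurs only where the other one does.
    return comb(n_samples - small, large - small) / comb(n_samples, large)


n = 15

# Two OTUs that each occur in exactly k samples and overlap perfectly.
best_by_k = {k: min_one_sided_p(n, k, k) for k in range(1, n)}

# Smallest p-value reachable by any pair of occurrence counts.
best = min(
    min_one_sided_p(n, ka, kb)
    for ka in range(1, n)
    for kb in range(1, n)
)

# Assumed Bonferroni threshold: how many tests could this survive at 0.05?
max_tests = int(0.05 / best)
```

With 15 samples, two OTUs that each occur in a single sample can never do better than p = 1/15, and two OTUs that each occur in 2 samples can never do better than 1/105. The best possible value over all occurrence counts is 1/6435, so under a Bonferroni correction no pair could stay significant once more than 321 pairs are tested. Less conservative corrections will be more forgiving, but rare OTUs still contribute very little power.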