I am using make.biom with the PICRUSt option to produce a .biom file with a PICRUSt OTU table, which contains PICRUSt-able greengenes phylotypes and their abundances in samples. In total, I’m getting 907 phylotypes.
The problem: the vast majority of phylotype abundances have a perfect positive linear pairwise correlation, and the rest are very very close. Is this normal? I am also posting this question on the PICRUSt forum: https://groups.google.com/forum/#!topic/picrust-users/fpqC6_VHEOM
Not sure what you mean by this…
the vast majority of phylotype abundances have a perfect positive linear pairwise correlation, and the rest are very very close. Is this normal?
Can you clarify?
I use a subsampled shared file as input to make.biom with the picrust option. Part of the output biom file is a new PICRUSt-ed shared file, where OTUs are in columns and are identified by numbers (presumably from the greengenes database). Samples are rows.
In that new PICRUSt-ed shared file:
- OTU (column) abundances have perfect pairwise correlations across samples. In other words, plot one column against another and you get a straight line.
- Also, abundances in columns no longer add up to the same number, making it look like the subsampling has disappeared. I wonder whether that’s relevant, perhaps things need to be subsampled again…?
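For anyone who wants to reproduce the two checks above, here is a minimal sketch in plain Python. It assumes the table is a tab-separated mothur-style shared file whose first three columns are label, Group and numOtus, followed by one column per OTU. The function names and the toy data (OTU IDs 1001 to 1003, samples A to C) are made up for illustration and are not taken from my files.

```python
import csv
import io
import math
from itertools import combinations


def read_shared(handle):
    """Parse a shared table: label, Group, numOtus, then one column per OTU (assumed layout)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    otu_names = header[3:]
    samples, counts = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        samples.append(row[1])
        counts.append([float(x) for x in row[3:3 + len(otu_names)]])
    return samples, otu_names, counts


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def sample_totals(samples, counts):
    """Total abundance per sample (row sums); equal totals suggest subsampling is intact."""
    return {s: sum(row) for s, row in zip(samples, counts)}


def column_correlations(otu_names, counts):
    """Pearson correlation across samples for every pair of OTU columns."""
    cols = list(zip(*counts))
    return {
        (otu_names[i], otu_names[j]): pearson(cols[i], cols[j])
        for i, j in combinations(range(len(otu_names)), 2)
    }


# Hypothetical toy table, not real data: OTU 1002 is exactly twice OTU 1001.
TOY = (
    "label\tGroup\tnumOtus\t1001\t1002\t1003\n"
    "0.03\tA\t3\t10\t20\t5\n"
    "0.03\tB\t3\t20\t40\t1\n"
    "0.03\tC\t3\t30\t60\t9\n"
)

samples, otus, counts = read_shared(io.StringIO(TOY))
totals = sample_totals(samples, counts)
corrs = column_correlations(otus, counts)

print(totals)
for pair, r in corrs.items():
    print(pair, round(r, 3))
```

To run it on a real table, replace `io.StringIO(TOY)` with an open file handle. Differing values in `totals` would confirm that samples no longer share a common depth.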
Does this help to clarify?
Could you send your input files to mothur.bugs@gmail.com?