pca running out of memory

jontarn · April 21, 2014, 6:52pm

hi guys;

I’ve been trying to use my shared files to run pca. I have 4 groups with a total of 45k OTUs and 3 million seqs between them, and I’m running out of memory every time on a fairly well stocked PC. I asked my coworker, who’s done analyses on datasets of this size before with pca in mothur, and she said she never ran into this problem. Is this true? Is something wrong with my process, or do I simply need access to a supercomputer terminal?

danieln · April 22, 2014, 7:53am

Maybe you can try grouping your OTUs with the filter.shared command to simplify your dataset before PCA?

pschloss · April 24, 2014, 10:14am

So do you really want PCA? Why not run dist.shared with an ecological distance calculator and then run that through the pcoa or nmds commands? I think there’s generally good agreement that PCA is not appropriate for OTU data.
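To make the distance-based route concrete, here is a minimal sketch in plain Python of the first step: turning per-group OTU counts into a pairwise distance matrix. Bray-Curtis is used only as one example of an ecological distance, and the counts, group names, and function names are all made up for illustration; this is not mothur's implementation. The point it shows is that the result is a groups-by-groups matrix (4 x 4 for 4 groups), however many OTUs go in.

```python
from itertools import combinations

# Hypothetical OTU counts per group (one list per group, one entry per OTU).
# A real shared file has far more OTUs; these numbers are invented.
counts = {
    "A": [10, 0, 5, 0],
    "B": [8, 2, 5, 0],
    "C": [0, 0, 1, 14],
    "D": [0, 7, 0, 8],
}

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(a) + sum(b)
    return num / den if den else 0.0

def distance_matrix(table):
    """Return pairwise distances as a dict of dicts keyed by group name."""
    names = list(table)
    dist = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = bray_curtis(table[n], table[m])
        dist[n][m] = dist[m][n] = d
    return dist

dist = distance_matrix(counts)
for n in dist:
    print(n, " ".join(f"{dist[n][m]:.3f}" for m in dist))
```

An ordination method such as PCoA or NMDS then works on this small square matrix rather than on the full OTU table.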

jontarn · April 24, 2014, 9:30pm

I’ve already done both.

what is pca useful for specifically? My PI really wants me to have a pca chart for whatever reason haha.

pschloss · April 25, 2014, 5:22pm

For normally distributed data? Sometimes people slip and use PCA and PCoA interchangeably and don’t really know the difference…

jontarn · April 25, 2014, 8:59pm

thanks! so let’s say i wanted to graphically represent the similarity in terms of community composition (richness included) between groups, what would be the best way to show this?

also, I’m not a stats guy, so let’s say I take for example this pca chart:

how is the data being inputted normally distributed? Thanks for your time.

pschloss · April 29, 2014, 1:01pm

“also, I’m not a stats guy, so let’s say I take for example this pca chart:”

Thanks for making my point. This is not a “pca chart” - it’s a “PCoA chart”. The input data would need to be normally distributed and not be sparse for PCA to make sense. PCA is really a specific case of PCoA where the PCA distance matrix is essentially the correlation matrix.
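One common way to make the "PCA is a special case of PCoA" relationship concrete: when the distances are plain Euclidean distances between samples, the double-centered matrix that PCoA decomposes is the same as the Gram matrix of the column-centered data, whose eigenvectors give the PCA sample scores. The sketch below checks that identity numerically in plain Python. The data are made up, and the variables are centered but not scaled here; standardizing them first would correspond to PCA on the correlation matrix.

```python
# Made-up data: 4 samples (rows) x 3 variables (columns).
X = [
    [2.0, 0.0, 1.0],
    [1.0, 3.0, 0.0],
    [0.0, 1.0, 4.0],
    [5.0, 2.0, 3.0],
]
n = len(X)

# PCA side: column-center the data, then form the Gram matrix Xc * Xc^T.
col_means = [sum(col) / n for col in zip(*X)]
Xc = [[v - m for v, m in zip(row, col_means)] for row in X]
gram = [[sum(a * b for a, b in zip(r, s)) for s in Xc] for r in Xc]

# PCoA side: squared Euclidean distances, then double-centering:
# B[i][j] = -0.5 * (d2[i][j] - mean_i - mean_j + grand_mean).
# d2 is symmetric, so row means double as column means.
d2 = [[sum((a - b) ** 2 for a, b in zip(r, s)) for s in X] for r in X]
row_mean = [sum(r) / n for r in d2]
grand = sum(row_mean) / n
B = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand)
      for j in range(n)] for i in range(n)]

# The two matrices agree up to floating-point error.
max_diff = max(abs(B[i][j] - gram[i][j]) for i in range(n) for j in range(n))
print(f"largest difference: {max_diff:.2e}")
```

With any other distance (Bray-Curtis, for example) the double-centered matrix no longer matches the centered-data Gram matrix, which is where PCoA and PCA part ways.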

jontarn · May 10, 2014, 11:24am

haha.

i see i see. thank you so much.
