How to determine RAM needed for an EC2 instance


I have a pretty basic question. While trying to use mothur to analyze my own data, I have noticed that I can process various subsets of my data, but if I try to analyze all my files at once, I get errors. I am fairly sure this is because my EC2 instance is not large enough (i.e., whether I hit errors seems to depend directly on how many files I attempt to process in a given subset). Is there a basic guideline for choosing the best instance size for my dataset?

Thanks so much!

Hey Carmella,

Sorry, but it’s actually not an easy question :). We generally use ~48 GB of RAM and are fine. The clustering commands may need more than that, but unfortunately it’s a bit of trial and error. Using fewer processors will generally also use less RAM.
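Since it comes down to trial and error, one practical approach is to run a couple of small subsets, note the peak RAM each uses, and extrapolate to the full dataset. Here is a rough sketch of that idea; the subset measurements and the linear-growth assumption are hypothetical (clustering memory can grow faster than linearly), and the instance list is just a few example r5 sizes:

```python
# Sketch: estimate RAM needed for the full dataset from trial runs on
# subsets, assuming memory grows roughly linearly with file count.
def estimate_ram_gb(measurements, total_files, headroom=1.5):
    """measurements: list of (n_files, peak_ram_gb) from subset runs.
    Uses the worst per-file ratio observed, plus a safety headroom."""
    per_file = max(ram / n for n, ram in measurements)
    return per_file * total_files * headroom

def pick_instance(needed_gb, instances):
    """instances: list of (name, ram_gb); return the smallest that fits."""
    for name, ram in sorted(instances, key=lambda x: x[1]):
        if ram >= needed_gb:
            return name
    return None  # nothing big enough

# Example r5 memory sizes (GiB); check current AWS listings.
R5 = [("r5.large", 16), ("r5.xlarge", 32),
      ("r5.2xlarge", 64), ("r5.4xlarge", 128)]

# Hypothetical subset runs: 10 files peaked at 4 GB, 20 files at 8.5 GB.
need = estimate_ram_gb([(10, 4.0), (20, 8.5)], total_files=100)
print(need, pick_instance(need, R5))  # ~63.75 GB -> r5.2xlarge
```

On Linux you can get the peak-RAM number for a run with `/usr/bin/time -v mothur batchfile` (look at "Maximum resident set size"). The 1.5x headroom is arbitrary; clustering steps may need a larger margin.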