Hi there,
TL;DR version:
- not looking at microbiome, but songbird MHC (highly duplicated genes in this taxon)
- can only use 1 processor for dist.seqs, because with more than one processor mothur uses all 24 GB of RAM on my PC and shuts down
- have to use a cutoff of 0.0 because even a single SNP may be important
- filtered aligned FASTA file has ~2.3 million sequences
- dist.seqs has been running for days, but has stopped at sequence 390,800
- is there a way to set how much RAM dist.seqs may use, so mothur won’t close? And would that also let me use more processors?
I’m using mothur to determine which MHC alleles each individual carries, in this case in a songbird. In a sense, you can consider each individual as a bacterial community and each MHC allele as an OTU. Since songbirds have incredibly diverse MHC class II allele repertoires, I need to use a program such as mothur.
So far I have tailored the MiSeq SOP to my MHC analysis. However, I am running into trouble at dist.seqs: my computer has 8 processors and 24 GB of RAM, and if I run with more than 1 processor, it maxes out the RAM and mothur shuts down. It’s also important to note that I cannot use a nonzero cutoff, because even a single SNP may be important in determining allelic variation at MHC, so I had to set it to 0.
Is there any way to tell this command to use a set amount of RAM so that it won’t shut down, while still letting it run faster on more processors? The filtered aligned FASTA file has roughly 2.3 million sequences, and the run has stalled at sequence 390,800.
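For a sense of scale, here is a rough back-of-envelope sketch in Python of how many pairwise comparisons ~2.3 million sequences imply. The progress estimate assumes the distances are computed row by row over a lower triangle (each sequence compared against all earlier ones); that is my assumption about the processing order, not something I have confirmed, and the variable names are just for illustration.

```python
def pairwise_count(n: int) -> int:
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


N_SEQS = 2_300_000      # approximate size of the filtered aligned FASTA file
STALLED_AT = 390_800    # sequence number at which the run stopped progressing

total_pairs = pairwise_count(N_SEQS)

# Assumption: rows are processed in order and row i is compared only with
# rows before it, so the work finished so far is the pair count of the
# first STALLED_AT sequences.
done_pairs = pairwise_count(STALLED_AT)
fraction_done = done_pairs / total_pairs

print(f"total pairwise comparisons: {total_pairs:.3e}")
print(f"estimated fraction completed: {fraction_done:.1%}")
```

If that assumption holds, the run is still only a few percent of the way through, so the total number of comparisons may matter as much as the RAM limit.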
Any help is appreciated.