Dear Mothur users,
I have been using mothur for my data analysis. Because of limited computing capacity, I have to analyze the data in portions, but in the end I would like to look at and analyze the data from all the samples together. Does anyone have an idea how I can do this? Thanks in advance.
How big is your dataset? Are you following the SOP (pre.cluster and cluster.split are the key steps for large datasets)?
What you’re asking to do is just phylotype analysis (far inferior to OTU analysis for examining communities), which shouldn’t have much in the way of computational requirements at all.
I have done two different analyses in mothur; call them R1 and R2. The two runs generated (.shared1 and .taxonomy1) and (.shared2 and .taxonomy2) files, respectively. I wanted to know if it’s possible to combine the .shared and .taxonomy files from these two runs and look at them as one merged .shared and .taxonomy.
I follow the SOP as is.
Hi Beth, and everybody
I’m interested in your question, as I have the same problem, but with old 454 datasets that I am dealing with now (yes, not the best thing to do, but it is what it is!).
I guess that merging the shared and taxonomy files is not the best idea, as some OTUs may be shared between the two runs, or sequences from the two runs might even end up in different OTUs if the analysis were done on the whole dataset. I think it is better to start the two analyses separately and then, after the first steps (the ones that require the most computational resources), merge the files and continue processing the cleaned-up sequences as a single merged dataset. If aligning to the reference alignment is also too much for the computer, maybe the files can be merged after OTU building, but I think not after that.
I’d be glad to hear the opinion of the more experienced people here.
You can’t merge files after clustering; you have to cluster everything together. Can you give more specs about your data and the computer you have access to?
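To illustrate why shared files from separate runs can't simply be added together, here is a toy Python sketch. It assumes, purely for illustration, that each run numbers its OTUs by decreasing abundance; the organism names and counts are made up, and the `label_otus` helper is hypothetical, not part of mothur. It only shows the label mismatch; in real data the OTU boundaries themselves can also differ between runs.

```python
from collections import Counter

def label_otus(counts):
    """Label OTUs Otu1, Otu2, ... by decreasing read count (toy stand-in for per-run labelling)."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {f"Otu{i}": (organism, n) for i, (organism, n) in enumerate(ranked, start=1)}

# Made-up read counts per organism in two independently clustered runs
run1 = label_otus({"Bacteroides": 50, "Prevotella": 10})
run2 = label_otus({"Prevotella": 40, "Bacteroides": 5})

# Naive merge: add counts wherever the OTU label matches
naive = Counter()
for run in (run1, run2):
    for label, (_, n) in run.items():
        naive[label] += n

# Correct totals when everything is counted together, per organism
truth = Counter()
for run in (run1, run2):
    for organism, n in run.values():
        truth[organism] += n

print(run1["Otu1"][0], run2["Otu1"][0])  # the same label points at different organisms
print(dict(naive))
print(dict(truth))
```

In this toy case "Otu1" means Bacteroides in one run and Prevotella in the other, so the naive sum mixes the two and matches neither of the true totals.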
I have to merge data from old 454 Titanium and 454 FLX+ runs to analyze the whole dataset. What I did was process the sff files individually up to trim.seqs, then merge the fasta, names and groups files and continue with the merged files following the 454 SOP, from unique.seqs, align.seqs, etc. Is that correct?
I have no problem with computer resources, as I have access to a cluster with enough capacity to handle my data.
That’s similar to what I’ve done. I didn’t merge until after chimera checking, but that meant I had to realign them all together before proceeding.
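For anyone wanting to sanity-check the merge step described above (combining the per-run fasta, names and groups files before unique.seqs), here is a minimal Python sketch. mothur's own merge.files command does the concatenation itself; this sketch only adds a check for sequence names that occur in more than one run, which is an extra precaution assumed here, not a step from the SOP. The function names and the file names in the comments are hypothetical.

```python
from pathlib import Path

def merge_files(inputs, output):
    """Concatenate text files in order, making sure each one ends with a newline."""
    with open(output, "w") as out:
        for path in inputs:
            text = Path(path).read_text()
            if text and not text.endswith("\n"):
                text += "\n"
            out.write(text)

def fasta_ids(path):
    """Return the sequence names (first word after '>') in file order."""
    ids = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    ids.append(parts[0])
    return ids

def duplicate_ids(path):
    """Return the sorted sequence names that appear more than once in a fasta file."""
    seen, dups = set(), set()
    for seq_id in fasta_ids(path):
        if seq_id in seen:
            dups.add(seq_id)
        seen.add(seq_id)
    return sorted(dups)

# Example use (hypothetical file names):
# merge_files(["run1.trim.fasta", "run2.trim.fasta"], "merged.fasta")
# print(duplicate_ids("merged.fasta"))  # an empty list means no name clashes
```

The same `merge_files` call can be used for the names and groups files, since they are plain text with one record per line.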