Combining sequence datasets

Hi all,

I have processed two separate MiSeq datasets at separate times, and for the purposes of comparing the communities between them I was hoping to combine both the datasets within mothur without having to go through the palaver of re-processing them. As I gave each dataset distinct .file names initially, achieving this goal seems tricky to me.
Would anyone be able to give some advice on how I could do it, if possible?

Your help would be really appreciated.


You have a few options…

  1. You could make a master files file and run the pipelines over from the beginning
  2. You could merge the outputs of make.contigs
  3. You could merge everything after running align.seqs

Anything else probably isn’t worth the effort.
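
For the first option, the master files file is just the per-run files files stacked into one. Below is a minimal Python sketch of that step, not an official mothur tool. It assumes each line is whitespace-separated with the sample (group) name in the first column followed by the fastq file names, that no path contains spaces, and that sample names should be unique across runs. The function name and file names are hypothetical.

```python
from pathlib import Path


def combine_files_files(paths, out_path):
    """Stack several mothur-style files files into one master files file.

    Assumption: the first whitespace-separated column is the group name.
    Raises ValueError if the same group name shows up in two different
    input files, since those samples would otherwise be pooled together.
    Returns the number of lines written.
    """
    source_of = {}  # group name -> input file it was first seen in
    merged = []
    for path in paths:
        for line in Path(path).read_text().splitlines():
            if not line.strip():
                continue  # skip blank lines
            fields = line.split()
            group = fields[0]
            previous = source_of.setdefault(group, str(path))
            if previous != str(path):
                raise ValueError(
                    f"group {group!r} appears in {previous} and {path}"
                )
            merged.append("\t".join(fields))
    Path(out_path).write_text("\n".join(merged) + "\n")
    return len(merged)
```

Calling `combine_files_files(["run1.files", "run2.files"], "combined.files")` would write one tab-separated file that can then be passed to make.contigs. The fastq paths are copied as written, so they still need to resolve from wherever mothur is run.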

I’d also encourage you to think about whether you have samples in common across the sequencing runs that could serve as controls. In these cases, I worry about batch effects due to extracting, PCRing, and sequencing things at different times. You might see differences between the batches that have nothing to do with biology and are due to artifacts. When my lab has a study with multiple sequencing runs we do our best to randomize samples across extractions and sequencing runs.


I have a few projects that are patient based and the researchers want the results periodically, so I can’t follow Pat’s wise advice about randomizing samples. We sequence samples as they come in, then I rerun mothur every few months. Initially I was running each set through chimera checking, then saving files (unaligning the fasta) that I would combine with the next set after it had gone through chimera checking. I’d realign the concatenated fasta and start from there. Honestly that was more effort than it was worth. I’ve switched to starting completely over each time I add in new sequences, including downloading from basespace. Burns a bit of HPC time but much less of my time. You should be sequencing a mock community and neg controls from all runs (though I haven’t seen too much batch effect).

Thanks both. I think I’ll start from the top, perhaps merging the files after running align.seqs. How exactly do I go about this?

Is it not possible to merge the .contigs.good.groups file?

Many thanks,


Not to worry, I have done this with merge.files and it seems to work fine. Thanks! :slight_smile:
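
For anyone finding this later, here is a rough sketch of what such a merge could look like. The poster did not say which files were merged, so the file names below are hypothetical. The sketch assumes the documented merge.files form, a dash-separated `input` list plus an `output` name, and simply builds the command lines to paste into mothur or a batch file.

```python
def merge_files_command(inputs, output):
    """Build one mothur merge.files command as a string.

    Assumption: merge.files takes input=a-b-c and output=name, so the
    input file names themselves must not contain dashes.
    """
    if len(inputs) < 2:
        raise ValueError("need at least two input files to merge")
    if any("-" in name for name in inputs):
        raise ValueError("input names with dashes would be split by mothur")
    return f"merge.files(input={'-'.join(inputs)}, output={output})"


def merge_batch(prefixes, suffixes, merged_prefix="combined"):
    """One merge.files line per file type, pairing the same type across runs."""
    return [
        merge_files_command(
            [f"{prefix}.{suffix}" for prefix in prefixes],
            f"{merged_prefix}.{suffix}",
        )
        for suffix in suffixes
    ]
```

For example, `merge_batch(["run1", "run2"], ["trim.contigs.good.fasta", "contigs.good.groups"])` returns two lines, the first being `merge.files(input=run1.trim.contigs.good.fasta-run2.trim.contigs.good.fasta, output=combined.trim.contigs.good.fasta)`. The fasta and groups files need to be merged from the same runs so that every sequence keeps its group assignment.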
