Grouping samples

I’m processing 5 samples with 4 repetitions each (20 fasta files in total) with the MiSeq SOP.
Three of my samples are very heterogeneous… To avoid this problem, I merged the fasta files of the repetitions for each sample (merge.files). I now have 5 big fasta files instead of 20. I’m very satisfied with the final results, but it is now very difficult to compare my communities (UniFrac, AMOVA, HOMOVA, etc.) since I lost almost all my degrees of freedom.
I would like to know if there is a way to group my samples at the beginning of the process (before make.contigs) to get a “mean value” of my samples like merge.files does, but without losing my degrees of freedom?
Best regards,

A simplified question could be: is it possible to group samples when we make a dendrogram (tree.shared)? It would make a clearer dendrogram when the samples are pretty heterogeneous like mine.
Thanks! :slight_smile:

I think you may be looking for the merge.groups command.
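
To make the idea behind merge.groups concrete, here is a minimal Python sketch of what merging replicates by a design file amounts to. This is only an illustration and not mothur's implementation: it assumes the counts are summed, and the sample names, OTU labels, and counts are all made up. In mothur itself the command works on a shared file and a design file rather than on dictionaries.

```python
from collections import defaultdict


def merge_groups(shared, design):
    """Sum the OTU counts of all samples that map to the same design group."""
    merged = defaultdict(lambda: defaultdict(int))
    for sample, otu_counts in shared.items():
        group = design[sample]  # raises KeyError if a sample is not in the design
        for otu, count in otu_counts.items():
            merged[group][otu] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {group: dict(otus) for group, otus in merged.items()}


# Hypothetical data: two replicates each of samples A and B.
shared = {
    "A_rep1": {"Otu001": 10, "Otu002": 0},
    "A_rep2": {"Otu001": 4, "Otu002": 6},
    "B_rep1": {"Otu001": 1, "Otu002": 9},
    "B_rep2": {"Otu001": 3, "Otu002": 7},
}
design = {"A_rep1": "A", "A_rep2": "A", "B_rep1": "B", "B_rep2": "B"}

merged = merge_groups(shared, design)
print(merged)
```

Because the replicates stay separate until this step, the unmerged table is still available for tests such as AMOVA or HOMOVA, and the merged table can be used only where a clearer dendrogram is wanted.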

Hi Newbie here

This hasn’t got much to do with the above topic. I am new to 16S analysis and I was wondering if someone could help me out.

I had 40 samples sequenced: 20 samples taken from the genital area :slight_smile: of sheep that are sick and 20 samples from the same area of sheep that are “healthy”. Each sample was PCR amplified using two primer sets, and the amplicons generated from both primer sets were then pooled for each sample because MiSeq sequencing is expensive. So when I analyse my data, can I put all of it through the pipeline at once and then group the reads into sick and healthy and by primer set? Is there a command in mothur that can do this? Or would it be best to first separate all of my reads into sick and healthy, and then by primer set? Does anyone know of an external program that can do this? If I had only a few reads this would be easy, but I will be receiving thousands of reads per sample and that would be very tedious to do manually. Any suggestions?

Welcome to the mothur community! Here are some helpful links to get you started with mothur:

The first link is Pat’s example using MiSeq data and would be a great place to start. The oligos file allows you to include any primers and barcodes so you can separate your samples in the make.contigs command.

Hi Westcott

I have tried what you suggested for grouping according to my primers. I ran the make.contigs command and added my oligos file as a parameter. However, it gives me an error:

This was my command: make.contigs(ffastq=A1002_S24_L001_R1_001.fastq, rfastq=A1002_S24_L001_R1_001.fastq, processors=8, oligos=oligo.txt)

The error was: cannot mix paired primers and barcodes with non-paired or linkers and spacers, quitting.

My oligos file is a tab-delimited file and contains this:



To remind you, I have two amplicons with different primer pairs that I pooled together for each sample. They didn’t have an index or anything; I assumed that the primers act as an index. How exactly will the oligos file do this for me? If I add it as a parameter, does it group my reads according to the primers? I know this step trims off the primers if you include the oligos file, but as it trims, does it put all the amplicons that had one primer pair together and the amplicons with the other primer pair in another file? Please help me understand. No one within reach in South Africa works with mothur and I am really struggling. I would really appreciate a more in-depth explanation of how this function will work for me. I need suggestions on how to separate all of my sequences according to primer pairs.

Thank you in advance for the help.

Hi Auberi,
I am happy to help. It looks like you are running into a bug we had in an older version of mothur. Can you upgrade to our latest version and then try this:

It looks like you had a typo and listed the same file for the forward and reverse reads. Did you mean to type the command below? Also, you probably want to add pdiffs=2.
make.contigs(ffastq=A1002_S24_L001_R1_001.fastq, rfastq=A1002_S24_L001_R2_001.fastq, processors=8, oligos=oligo.txt, pdiffs=2)

Regarding your oligos file: the format looks good. The error you were getting is related to a bug in an older version of mothur.


Hi Westcott

Yes, that was definitely a typing error, thank you :). I will download the new version, but in the meantime I see you said to add barcode to it anyway, so I will try that.

Just to come back to my previous question: will this command group my sequences according to primers and keep the two primer sets separate?

Also, I saw that someone else had degenerate primers and wrote out all the possible combinations. Is this necessary, or is pdiffs enough to account for the degeneracies? I have two degeneracies in one primer.

Thank you for the help :slight_smile:

mothur will assemble your reads, remove the primers, and create a group file that assigns the reads to the primer names. When you use the pdiffs parameter, mothur aligns the primer portion of your sequence to the primer. The number of differences must not exceed the pdiffs value you specify. In Pat’s example analysis he uses pdiffs=2.
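
For anyone who wants to see the logic spelled out, here is a rough Python sketch of assigning a read to a primer name with a pdiffs-style tolerance. It is not mothur's code and rests on several assumptions: pdiffs is treated as the maximum number of differences allowed, a degenerate IUPAC base in the primer is taken to match any base it encodes, the comparison is position by position rather than a real alignment, and only a forward primer is checked. The primer names and sequences are invented for the example.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_diffs(primer, read):
    """Count positions where the read start is incompatible with the primer."""
    if len(read) < len(primer):
        return len(primer)  # read too short to contain the primer
    return sum(1 for p, b in zip(primer, read) if b not in IUPAC[p])


def assign_primer(read, primers, pdiffs):
    """Return the name of the best-matching primer, or None if none is within pdiffs."""
    best_name, best_diffs = None, pdiffs + 1
    for name, primer in primers.items():
        diffs = count_diffs(primer.upper(), read.upper())
        if diffs < best_diffs:
            best_name, best_diffs = name, diffs
    return best_name


# Hypothetical forward primers for two primer sets (not real primer sequences).
primers = {"setA": "ACGYTM", "setB": "TTGRCA"}

for read in ["ACGCTAGGGG", "ACGATAGGGG", "TTGGCATTTT", "GGGGGGGGGG"]:
    print(read, assign_primer(read, primers, pdiffs=2))
```

In this sketch the degenerate positions (Y and M in setA, R in setB) cost nothing when the read carries one of the encoded bases, so pdiffs is only spent on real mismatches.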


Ok thank you very much for the quick replies. Much appreciated. :slight_smile: