missing.group

sapeura · January 22, 2010, 6:46pm

Hi!

I’m a new user, but I have managed to struggle through my 454 data all the way to the end of clustering. Now I don’t know what to do. When reading in the OTUs, mothur says that my group file and list file contain different numbers of sequences. The missing sequences should be listed in the missing.group file, and I did get that file, but IT IS EMPTY.

So does anybody know how to make those files match?

westcott · January 25, 2010, 2:30pm

This can occur if you do not have unique sequence names. mothur requires sequence names to be unique.
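A quick way to check for repeated names before running mothur is a short script like the one below. This is only a sketch, not part of mothur: the function names are made up for illustration, and it assumes the sequence name is the text after `>` up to the first whitespace on each FASTA header line.

```python
from collections import Counter


def fasta_names(lines):
    """Yield the name from each FASTA header line.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def duplicate_names(lines):
    """Return {name: count} for every name that appears more than once."""
    counts = Counter(fasta_names(lines))
    return {name: n for name, n in counts.items() if n > 1}


# Tiny in-memory example; for a real file, pass an open file handle instead.
example = [">seqA", "ACGT", ">seqB", "GGCC", ">seqA", "ACGT"]
print(duplicate_names(example))
```

For a real file, something like `with open("my.fasta") as fh: print(duplicate_names(fh))` should work, where `my.fasta` is a placeholder file name.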

pschloss · January 26, 2010, 3:49am

You might also try following the Costello Stool analysis I provide on the analysis examples page of the wiki. Unfortunately, the names of the files get confusing and it's easy to mix them up, which can cause problems.

sapeura · January 26, 2010, 7:34am

So far I’ve been following the Sogin-analysis, but I’ll try to change to Costello analysis.

I figured out what caused the problem (but don’t yet have the final solution…). At least part of the problem was that, when collecting the original sequence data, I had somehow managed to add 160 sequences twice. I have now removed the duplicates and redone everything, and I again ended up with different numbers of sequences in the .list and .group files. And the missing.group file is still empty…

verenastarke · January 28, 2010, 9:43pm

Hi there,

I had a similar problem. My message said “Your group file contains 41375 sequences and list file contains 41195 sequences”…
In my analysis I am using several sequence files from different samples, which I combined into one big file as described in the Sogin analysis example. That gives you a new fasta file and a groups file. When I used the new fasta file for downstream steps (such as screen.seqs), I forgot to update the groups file accordingly. Every time you remove sequences from your fasta file, you also need to remove them from the groups file (and from the names file, if you are using unique sequences).
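To see which names are out of sync, you can compare the two files directly. The sketch below is not a mothur command and its function names are hypothetical. It assumes the groups file has one sequence per line with the sequence name in the first whitespace-separated column, and that the fasta name is the text after `>` up to the first whitespace.

```python
def read_fasta_names(lines):
    """Collect sequence names from FASTA header lines into a set."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def read_group_names(lines):
    """Collect sequence names (first column) from a groups file into a set."""
    names = set()
    for line in lines:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def compare_names(fasta_lines, group_lines):
    """Return (names only in the groups file, names only in the fasta file)."""
    fasta = read_fasta_names(fasta_lines)
    group = read_group_names(group_lines)
    return sorted(group - fasta), sorted(fasta - group)


# Tiny in-memory example: seq3 was screened out of the fasta but not the groups file.
fasta_example = [">seq1", "ACGT", ">seq2", "GGCC"]
group_example = ["seq1\tsampleA", "seq2\tsampleA", "seq3\tsampleB"]
only_in_group, only_in_fasta = compare_names(fasta_example, group_example)
print("only in groups file:", only_in_group)
print("only in fasta file:", only_in_fasta)
```

Because the names are stored in sets, this comparison will not reveal names that occur twice within the same file, so duplicates need a separate check.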

Hope this helps.

Cheers,
V

sapeura · January 29, 2010, 7:31am

The problem was that I had accidentally added 160 sequences twice to the original fasta file, and when trying to remove them I apparently used a command in BioEdit that removed only exact duplicate sequences. However, some of the duplicate names were still left (no idea how that’s even possible; duplicate names should have had exactly the same sequences). So once I manually removed the duplicate names, it all went smoothly.
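For anyone who would rather not remove duplicate names by hand, here is a rough sketch of doing it with a script. It is not a BioEdit or mothur feature, and the function names are invented for illustration. It assumes a plain fasta file where the name is the text after `>` up to the first whitespace, keeps the first record seen for each name, and lists the names whose later copies had a different sequence so you can inspect them.

```python
def parse_fasta(lines):
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None and line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def dedupe_by_name(records):
    """Keep the first record per name; report names whose repeats differ."""
    kept = {}        # name -> first sequence seen (insertion order preserved)
    conflicts = []   # names repeated with a different sequence
    for name, seq in records:
        if name not in kept:
            kept[name] = seq
        elif kept[name] != seq and name not in conflicts:
            conflicts.append(name)
    return kept, conflicts


# Tiny in-memory example: s1 is an exact repeat, s2 repeats with a different sequence.
example = [">s1", "ACGT", ">s2", "GGCC", ">s1", "ACGT", ">s2", "GGCA"]
kept, conflicts = dedupe_by_name(parse_fasta(example))
for name, seq in kept.items():
    print(f">{name}\n{seq}")
print("same name, different sequence:", conflicts)
```

Keeping the first copy is an arbitrary choice; if a name shows up in `conflicts`, it is probably worth checking the original data before deciding which copy to keep.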
