I’m a new user, but I have managed to struggle through my 454 data all the way to the end of clustering. Now I don’t know what to do. When reading the OTUs, mothur says that my group and list files contain different numbers of sequences. The missing sequences should be listed in the missing.group file, and I did get that file, but it is empty.
Does anybody know how to make those files match?
This can occur if you do not have unique sequence names. mothur requires sequence names to be unique.
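One way to check for repeated names before running mothur is a short script. The following is only a minimal Python sketch, not part of mothur. It assumes that a sequence's name is the first whitespace-delimited token after the ">" on each FASTA header line, and the function names are made up for illustration.

```python
from collections import Counter


def fasta_names(lines):
    """Yield the sequence name from each FASTA header line.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def duplicate_names(lines):
    """Return a dict mapping each repeated sequence name to its count."""
    counts = Counter(fasta_names(lines))
    return {name: n for name, n in counts.items() if n > 1}


# Small in-memory example; "seq1" appears twice.
example = [">seq1", "ACGT", ">seq2", "ACGA", ">seq1", "ACGT"]
print(duplicate_names(example))
```

To run it on a real file, pass an open file handle instead of the list, for example `duplicate_names(open("my.fasta"))`, where `my.fasta` is a placeholder name.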
You might also try following the Costello Stool analysis I provide on the analysis examples page of the wiki. Unfortunately, the names of the files get confusing and it’s easy to mix them up, which can cause problems.
So far I’ve been following the Sogin-analysis, but I’ll try to change to Costello analysis.
I figured out what caused the problem (but don’t yet have the final solution…). At least part of the problem was that, when collecting the original sequence data, I had somehow managed to add 160 sequences twice. I removed the duplicates and ran everything again, and again ended up with different numbers of sequences in the .list and .group files. And the missing.group file is still empty…
I had a similar problem. My message said “Your group file contains 41375 sequences and list file contains 41195 sequences”…
In my analysis I am using several sequencing files from different samples, which I combined into one big file as described in the Sogin analysis example. That produces a new fasta file and a groups file. When I used the new fasta file for downstream analysis (such as screen.seqs), I forgot to update the groups file accordingly. Every time you remove sequences from your fasta file, you also need to update the groups file (and the names file, if you are using unique sequences).
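To see which names have drifted out of sync, you can compare the two files directly. This is a rough Python sketch under a few assumptions: the groups file has one sequence per line with the sequence name in the first whitespace-separated column, the FASTA name is the first token after ">", and the function names here are hypothetical helpers, not mothur commands.

```python
def fasta_name_set(fasta_lines):
    """Collect sequence names from FASTA header lines (first token after '>')."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def group_name_set(group_lines):
    """Collect sequence names from a groups file (assumed: name is column 1)."""
    names = set()
    for line in group_lines:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def compare_names(fasta_lines, group_lines):
    """Return (names only in the groups file, names only in the fasta file)."""
    in_fasta = fasta_name_set(fasta_lines)
    in_group = group_name_set(group_lines)
    return sorted(in_group - in_fasta), sorted(in_fasta - in_group)


# Small in-memory example: seq3 was removed from the fasta but not the groups file.
fasta = [">seq1", "ACGT", ">seq2", "ACGA"]
groups = ["seq1\tsampleA", "seq2\tsampleA", "seq3\tsampleB"]
only_in_groups, only_in_fasta = compare_names(fasta, groups)
print("only in groups file:", only_in_groups)
print("only in fasta file:", only_in_fasta)
```

Names that show up only in the groups file are the ones that were filtered out of the fasta file without the groups file being updated. Because the comparison uses sets, it will not reveal duplicated names, so a separate duplicate check is still worth doing.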
Hope this helps.
The problem was that I had accidentally added 160 sequences twice to the original fasta file, and when trying to remove them, I apparently used a command in BioEdit that removed all exact duplicate sequences. However, some of the duplicate names were still left (no idea how that’s even possible; the duplicate names should have had exactly the same sequences). Once I manually removed the duplicate names, it all went smoothly.