More sequences in group file than in name file

Hello,
I am using mothur v.1.25.1.

When running pre.cluster, I got the following error message: “[ERROR]: Your name file contains 80674 valid sequences, and your groupfile contains 196017, please correct.”


Here is what I did:

trim.flows(flow=FZA0Z4N02.flow, oligos=FZA0Z4N02.oligos, minflows=200, pdiffs=2, bdiffs=1, processors=2)

shhh.flows(file=FZA0Z4N02.flow.files, processors=2)

trim.seqs(fasta=FZA0Z4N02.flow.shhh.fasta, name=FZA0Z4N02.flow.shhh.names, oligos=FZA0Z4N02.oligos, pdiffs=2, bdiffs=1, maxhomop=8, minlength=200, flip=T, processors=2)

unique.seqs(fasta=FZA0Z4N02.flow.shhh.trim.fasta, name=FZA0Z4N02.flow.shhh.trim.names)

align.seqs(fasta=FZA0Z4N02.flow.shhh.trim.unique.fasta, reference=silva.bacteria.fasta, flip=t, processors=2)

screen.seqs(fasta=FZA0Z4N02.flow.shhh.trim.unique.align, name=FZA0Z4N02.flow.shhh.trim.names, group=FZA0Z4N02.flow.shhh.groups, end=37693, optimize=start, criteria=85, processors=2)

filter.seqs(fasta=FZA0Z4N02.flow.shhh.trim.unique.good.align, vertical=T, trump=., processors=2)

unique.seqs(fasta=FZA0Z4N02.flow.shhh.trim.unique.good.filter.fasta, name=FZA0Z4N02.flow.shhh.trim.good.names)

pre.cluster(fasta=FZA0Z4N02.flow.shhh.trim.unique.good.filter.unique.fasta, name=FZA0Z4N02.flow.shhh.trim.unique.good.filter.names, group=FZA0Z4N02.flow.shhh.good.groups, diffs=2)

I think this is the problem…

screen.seqs(fasta=FZA0Z4N02.flow.shhh.trim.unique.align, name=FZA0Z4N02.flow.shhh.trim.names, group=FZA0Z4N02.flow.shhh.groups, end=37693, optimize=start, criteria=85, processors=2)

You want…

screen.seqs(fasta=FZA0Z4N02.flow.shhh.trim.unique.align, name=FZA0Z4N02.flow.shhh.trim.names, group=GQY1XT001.shhh.groups, end=37693, optimize=start, criteria=85, processors=2)

Thanks, I tried again, but it did not work.
Next I will start over from the beginning with the latest version of mothur.
Thanks again for a great program and also thanks for the wiki and forum.

I tried again with mothur v.1.26.0, but I still get the same error message:
“[ERROR]: Your name file contains 80669 valid sequences, and your groupfile contains 196009, please correct.”
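
For anyone who wants to reproduce the two numbers in that message outside of mothur, here is a rough Python sketch. It assumes the usual layouts: a name file with two whitespace-separated columns (a representative name, then a comma-separated list of every name it stands for) and a group file with one "name group" pair per line. The function names and the sample data are made up for illustration, and mothur's own notion of a "valid" sequence may differ in edge cases.

```python
from typing import Iterable


def count_name_file(lines: Iterable[str]) -> int:
    """Count every sequence name listed in column 2 of a name file.

    Assumed layout per line: representative, then a comma-separated list.
    """
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        total += len([name for name in fields[1].split(",") if name])
    return total


def count_group_file(lines: Iterable[str]) -> int:
    """Count sequences in a group file (assumed: one 'name group' pair per line)."""
    return sum(1 for line in lines if line.strip())


# Tiny made-up example, not data from this thread.
names = ["seqA\tseqA,seqB,seqC\n", "seqD\tseqD\n"]
groups = [
    "seqA\tsoil\n",
    "seqB\tsoil\n",
    "seqC\twater\n",
    "seqD\twater\n",
    "seqE\twater\n",
]
print(count_name_file(names), count_group_file(groups))
```

On real data, the same functions can be fed open file handles instead of lists, for example `count_name_file(open("FZA0Z4N02.shhh.trim.unique.good.filter.names"))`. If the group count is much larger than the name count, the group file probably comes from an earlier step than the name file.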

I tried with

pre.cluster(fasta=FZA0Z4N02.shhh.trim.unique.good.filter.unique.fasta, name=FZA0Z4N02.shhh.trim.unique.good.filter.names, group=FZA0Z4N02.shhh.good.groups, diffs=2)

or with

pre.cluster(fasta=FZA0Z4N02.shhh.trim.unique.good.filter.unique.fasta, name=FZA0Z4N02.shhh.trim.unique.good.filter.names, group=FZA0Z4N02.shhh.groups, diffs=2)

I am not sure I understood your advice to use “group=GQY1XT001.shhh.groups”; I don’t have this file.

I guess the problem comes from screen.seqs, because that’s where the good.groups file comes from.
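
One way to test that guess is to look at which sequences are in the group file but not in the name file, and which groups they belong to. The sketch below is only an illustration under the same assumptions about the file layouts (name file: representative plus comma-separated list; group file: name plus group). `listed_names` and `extra_by_group` are hypothetical helper names, and the sample data is invented.

```python
from collections import Counter
from typing import Iterable, Set


def listed_names(name_lines: Iterable[str]) -> Set[str]:
    """Collect every sequence name in column 2 of a name file."""
    seen: Set[str] = set()
    for line in name_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        seen.update(name for name in fields[1].split(",") if name)
    return seen


def extra_by_group(name_lines: Iterable[str], group_lines: Iterable[str]) -> Counter:
    """Count, per group, the group-file sequences the name file does not list."""
    known = listed_names(name_lines)
    extra: Counter = Counter()
    for line in group_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        seq, group = fields[0], fields[1]
        if seq not in known:
            extra[group] += 1
    return extra


# Made-up example, not data from this thread.
name_lines = ["seqA\tseqA,seqB\n", "seqC\tseqC\n"]
group_lines = [
    "seqA\tsoil\n",
    "seqB\tsoil\n",
    "seqC\twater\n",
    "seqX\twater\n",
    "seqY\tsoil\n",
]
print(extra_by_group(name_lines, group_lines))
```

If the extra sequences are spread across all groups, the group file was most likely never reduced by the screening step. If they sit in one or two groups, the mismatch may have been introduced earlier.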

After you run trim.seqs, what is the name of the group file that is created?