mothur

File mismatch detected, quitting command

Hello everybody, I am working with the latest version of mothur and have a problem that has been driving me crazy for several days. I ran the command:

mothur > summary.seqs(fasta=otus_lequia.align, count=otus_lequia.count_table)
Using 8 processors.
[ERROR]: ‘OTU2610’ is not in your name or count file, please correct.
[ERROR]: Your name file contains 5312 unique sequences, but your fasta file contains 3050. File mismatch detected, quitting command.

Therefore I removed OTU2610 from my fasta file and repeated the align command using this fasta file and the SILVA database. Afterwards I checked the number of sequences with the list.seqs command for the fasta, count_table and names files, and it appears to be the same: 4107 OTUs. At least that is what I see when I open the list.seqs output files in Excel…

However, I get the same error: a higher number of sequences in the count file than in the fasta file. More surprisingly, the reported number changes each time I repeat the command…

mothur > summary.seqs(count=otus_lequia.count_table)
Using otus_lequia.align as input file for the fasta parameter.
Using 8 processors.
[ERROR]: ‘OTU2610’ is not in your name or count file, please correct.
[ERROR]: Your name file contains 5312 unique sequences, but your fasta file contains 3077. File mismatch detected, quitting command.

mothur > summary.seqs(fasta=otus_lequia.align, count=otus_lequia.count_table)
Using 8 processors.
[ERROR]: Your name file contains 5312 unique sequences, but your fasta file contains 4106. File mismatch detected, quitting command.

Please, I would be very grateful if anyone could help me,

Many thanks,

Frederic

Hi Frederic, the problem is not OTU2610; it is that you are running summary.seqs with fasta and count files that don’t correspond to each other.

Please first check the fasta and count files separately with summary.seqs (rather than with list.seqs and Excel).

For example, run summary.seqs(fasta=X.fasta) on the fasta file you want to use in your align command, and make a note of how many seqs are in your fasta file.

Then run summary.seqs(count=X.count_table) to check how many uniques and how many total seqs you have in the count table alone.

If the number of seqs in the fasta from the first summary.seqs and the number of uniques in the count from the second summary.seqs aren’t the same, you will always get an error if you try to run summary.seqs with these mismatching files together.

If the numbers don’t match it will likely be because either an incorrect file was used in the creation of the count_table, or some data has been discarded from the fasta file during a previous step and the matching count_table wasn’t updated.
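If the counts differ, it can help to see exactly which IDs are missing from which file. Here is a minimal Python sketch of that comparison, not a mothur feature. It assumes a plain tab-separated count table whose header line starts with Representative_Sequence, and it treats the first whitespace-delimited token after ">" as the sequence ID. The function names and the tiny in-memory example data are made up for illustration.

```python
import io


def fasta_ids(handle):
    """Collect sequence IDs (first token after '>') from a fasta stream."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def count_table_ids(handle):
    """Collect IDs from the first column of a count table stream.

    Assumption: comment lines start with '#' and the header line starts
    with 'Representative_Sequence'; both are skipped.
    """
    ids = set()
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#") or line.startswith("Representative_Sequence"):
            continue
        ids.add(line.split("\t")[0])
    return ids


def compare_ids(fasta_handle, count_handle):
    """Return (IDs only in the fasta, IDs only in the count table), both sorted."""
    in_fasta = fasta_ids(fasta_handle)
    in_count = count_table_ids(count_handle)
    return sorted(in_fasta - in_count), sorted(in_count - in_fasta)


if __name__ == "__main__":
    # Hypothetical stand-ins for real files; replace with open("X.fasta") etc.
    fasta = io.StringIO(">OTU1\nACGT\n>OTU2610\nACGA\n")
    count = io.StringIO("Representative_Sequence\ttotal\nOTU1\t10\nOTU7\t3\n")
    only_fasta, only_count = compare_ids(fasta, count)
    print("only in fasta:", only_fasta)
    print("only in count table:", only_count)
```

IDs that appear only in the fasta produce the "is not in your name or count file" error, while IDs that appear only in the count table explain a higher unique count on the count side.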

Many thanks Emma, I will check it. I think the problem arises because I first used Usearch to generate the contigs and improve sequence quality, and some sequences were deleted from the fasta file but not from the count_table. I used the fasta and count_table files to continue with Mothur since I feel more comfortable with it. The reason I used Usearch just to generate contigs is that Mothur generated too many OTUs when I followed the MiSeq SOP from the beginning.

Well, once you identify the step at which you have a mismatch, updating the files to match each other is quite easy:
list.seqs(fasta=X.fasta)
then get.seqs(accnos=output from the above command, count=X.count_table)
Then you can check the fasta together with the updated count table using summary.seqs, and there should be no problems.
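To make clear what this update amounts to, here is a rough Python sketch of the row filtering. It is only an illustration of the idea, not how mothur implements get.seqs, and in practice the mothur commands above are the way to go. It assumes a plain tab-separated count table with a Representative_Sequence header line; the function name and example data are hypothetical.

```python
import io


def filter_count_table(count_handle, keep_ids, out_handle):
    """Write header/comment lines plus only the rows whose ID is in keep_ids.

    Returns the number of data rows kept.
    """
    kept = 0
    for line in count_handle:
        first = line.split("\t", 1)[0].strip()
        if line.startswith("#") or first == "Representative_Sequence":
            out_handle.write(line)  # keep header and comment lines unchanged
        elif first in keep_ids:
            out_handle.write(line)  # keep rows for sequences still in the fasta
            kept += 1
    return kept


if __name__ == "__main__":
    # Hypothetical data: OTU7 was dropped from the fasta, so it is not in keep_ids.
    count = io.StringIO("Representative_Sequence\ttotal\nOTU1\t10\nOTU7\t3\n")
    out = io.StringIO()
    n = filter_count_table(count, {"OTU1"}, out)
    print("rows kept:", n)
    print(out.getvalue(), end="")
```

After the filter, the number of rows kept should equal the number of sequences in the fasta file, which is the condition summary.seqs checks when it is given both files.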
Cheers,
