pre.cluster command in MiSeq SOP

I am working through the MiSeq SOP.
When I used the pre.cluster command, some of the output files were empty. Consequently, when I then try the chimera.uchime command, MOTHUR freezes and says “Fatal Error: Cannot open (file name) No such file or directory”.
Any guess as to why this is?

I would be happy to help you try to resolve the problem. Could you post the commands you ran, the version of mothur you are using including the OS and whether it is our prebuilt executable or you built from source?

Thank you Wesscott

Mothur version: v.1.36.1 last updated 7/27/2015
Operating system: Windows 7
Data set: Sourced from the MiSeq SOP wiki page
Command: mothur > pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)

The above command seems to work; however, the files generated are empty.
When I then run the command “mothur > chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)” MOTHUR gives the Fatal Error message.
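Before running chimera.uchime, it can help to confirm outside of mothur whether the pre.cluster outputs exist and contain anything. Here is a minimal sketch in Python, not part of mothur; the helper name `report_outputs` is made up for illustration, and it assumes you run it from the folder that holds the files named in the commands above.

```python
import os

def report_outputs(paths):
    """Return {path: status}, where status is 'missing', 'empty', or 'ok'."""
    status = {}
    for path in paths:
        if not os.path.exists(path):
            status[path] = "missing"
        elif os.path.getsize(path) == 0:
            status[path] = "empty"
        else:
            status[path] = "ok"
    return status

if __name__ == "__main__":
    # File names taken from the chimera.uchime command above.
    prefix = "stability.trim.contigs.good.unique.good.filter.unique.precluster"
    results = report_outputs([prefix + ".fasta", prefix + ".count_table"])
    for path, state in results.items():
        print(f"{state:8s} {path}")
```

If either file is reported as missing or empty, the problem is in the pre.cluster step rather than in chimera.uchime.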

I have followed the SOP to the letter, except for changing the name of the Silva files, command “mothur > system(mv silva.bacteria.pcr.fasta silva.v4.fasta)”. Even if I change the mv to ‘rename’ as the SOP suggests, MOTHUR still automatically closes, so I skipped that part.

Do you think it could be as simple as using the wrong version of MOTHUR?

Hey, I am having the same problem here. This workflow ran fine many times before, but now it gives me a FATAL ERROR at the chimera.uchime() step because the .fasta file generated by the pre.cluster() step does not exist. :shock:

The only thing that changed is that this run contains more than 2000 datasets. Is this the cause?

It is very weird.

Any thoughts? Thank you.

System command:
The SOP uses Linux / Mac system commands. Windows has slightly different names for the same commands. Here are a few links to help you with the conversions. https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Step_by_Step_Guide/ap-doslinux.html http://www.covingtoninnovations.com/mc/winforunix.html

mothur > system(rename mothurs_really_long_filename short_filename)

If this isn’t working, here is something to consider. Mothur does not look for paths in the system command. Did you give the command the full path to the files? If not, are they in your current working directory? http://www.mothur.org/wiki/Frequently_asked_questions#Mothur_can.27t_find_my_input_files
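To see what "current working directory" and "full path" mean in practice, here is a small Python sketch (not a mothur feature; the helper name `locate` is hypothetical). It assumes you start Python from the same folder you launch mothur from, and it uses the Silva file name from the post above.

```python
import os

def locate(filename):
    """Resolve a file name against the current directory; return (full_path, exists)."""
    full = os.path.abspath(filename)
    return full, os.path.isfile(full)

if __name__ == "__main__":
    print("working directory:", os.getcwd())
    full, found = locate("silva.bacteria.pcr.fasta")
    print(full, "found" if found else "NOT found")
```

If the file is reported as not found, either move it into the working directory or pass the printed full path style to the rename command.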

Pre.cluster command:
The fatal error messages could be caused by memory issues. The pre.cluster command is very memory intensive. Are you using multiple processors? How many total sequences are in the 2000 samples? How much memory do you have?

The links helped with the renaming step of the Silva files from the SOP.
However, I still have not been able to run the uchime command.
I tried again and received this message:
[ERROR]: The number of count files does not match the number of fasta files, please correct.
Back in the ‘Processing improved sequences’ section, while running the unique.seqs and count.seqs commands I received a slightly different output than the SOP.
In the SOP the total number of seqs is 129,058. However I only had 128,872.
Could this be the root of the problem? And is there a way I can fix it?
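One way to pin down where the 129,058 versus 128,872 difference appears is to count sequences directly from the files after each step. The following is a rough Python sketch, not a mothur tool. It assumes the count table is whitespace-separated with a header row starting with `Representative_Sequence` and the total abundance in the second column; function names are made up for illustration, and the two file paths are passed on the command line.

```python
import sys

def fasta_count(path):
    """Number of sequences in a fasta file (one per '>' header line)."""
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith(">"))

def count_table_total(path):
    """Sum of the 'total' column (assumed to be column 2) over all data rows."""
    total = 0
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            # Skip blank lines, comment lines, and the header row.
            if len(fields) < 2 or line.startswith("#"):
                continue
            if fields[0] == "Representative_Sequence":
                continue
            total += int(fields[1])
    return total

if __name__ == "__main__" and len(sys.argv) == 3:
    fasta_path, count_path = sys.argv[1], sys.argv[2]
    print("unique sequences in fasta:", fasta_count(fasta_path))
    print("total sequences in count table:", count_table_total(count_path))
```

Running it on the files from each step shows the first step at which your totals diverge from the SOP.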

Hello,

I’m having the identical issue with the MiSeq SOP: “Back in the ‘Processing improved sequences’ section, while running the unique.seqs and count.seqs commands I received a slightly different output than the SOP. In the SOP the total number of seqs is 129,058. However I only had 128,872.”

I also get exactly 128,872 sequences. And I get the same error message: “[ERROR]: The number of count files does not match the number of fasta files, please correct.”

Any insight into this error?

Update:
I’ve re-run the first part of the MiSeq SOP on my local machine and two remote clusters, and finally it worked on one of the remote clusters. I recovered 129058 sequences from the first screen and everything went smoothly from there. The cluster that worked had mothur/1.34.4 loaded, while the other cluster had 1.25 and my local machine had the newest version, 1.36.1. So this may have been the issue, but regardless of how many sequences made it through the screen step, why the error? Which files became incompatible through these first steps? We didn’t make any groups, names, or count files until after this first screen. Thank you.

Well for unknown reasons, pre.cluster will work on one computer cluster but not the other even though they’re both running the same version of mothur/1.34.4. The core where pre.cluster did work does not have uchime installed so I can’t proceed to the next step anyway :shock: although I’ve requested they install it.

So, the error I get when I run pre.cluster after following the SOP line by line:
“several sequence names…M00967_43_000000000-A3JHG_1_1101_19678_19803 is not in your count table. Please correct.”

Instead, I tried to run pre.cluster with the names file and not counts:
unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, name=stability.trim.contigs.good.names)
pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, name=stability.trim.contigs.good.unique.good.filter.names, diffs=2)

Instead of an error, I get three files out:
stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta
stability.trim.contigs.good.unique.good.filter.unique.precluster.names
stability.trim.contigs.good.unique.good.filter.unique.precluster.map

But then, chimera.uchime still gives me the same error as it does when I run pre.cluster with the count file:

Unable to open stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table. Trying default /opt/software/mothur/1.34.4–GCC-4.4.5/stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table
Unable to open /opt/software/mothur/1.34.4–GCC-4.4.5/stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table. It will be disregarded.
[ERROR]: You don’t have any saved reference sequences and the reference parameter is a required.

Sooo, I’m out of ideas. The error seems to be independent of which mothur version you’re using or if you get 128872 sequences after the first screen. Need to somehow fix the count file, or the sequence fasta, or both…
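For the “is not in your count table” error, it may help to list exactly which sequence names are in the fasta file but absent from the count table. Below is a minimal Python sketch for that check, not something mothur provides. It assumes the sequence name is the first whitespace-delimited token after `>` and that the count table has a header row starting with `Representative_Sequence`; all function names are hypothetical.

```python
import sys

def fasta_names(path):
    """Set of sequence names: first token after '>' on each header line."""
    names = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names

def count_table_names(path):
    """Set of names in the first column, skipping header and comment lines."""
    names = set()
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if not fields or line.startswith("#"):
                continue
            if fields[0] == "Representative_Sequence":
                continue
            names.add(fields[0])
    return names

def missing_from_count_table(fasta_path, count_path):
    """Sorted list of fasta names that have no row in the count table."""
    return sorted(fasta_names(fasta_path) - count_table_names(count_path))

if __name__ == "__main__" and len(sys.argv) == 3:
    missing = missing_from_count_table(sys.argv[1], sys.argv[2])
    print(len(missing), "names in the fasta are not in the count table")
    for name in missing[:10]:
        print(name)
```

A long list suggests the fasta and count table came from different steps of the pipeline; a short list points to a handful of sequences that were dropped from only one of the two files.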

Can you upgrade the version of mothur and try again? You’re more than a year behind and some of the problems may have been fixed in recent versions.

Pat