Hello! How are you?
I am a new user of both the forum and mothur. First of all, I want to thank the authors, because the job you are doing is great. I have a pretty basic question, but I just don't know what to do.
I am working with some files from a Roche (454) FLX sequencing run. They look like this:
an-52_GO0Z3GB03HJJ8X bdiffs=0(match) fpdiffs=0(match) rank=0000442 x=2976.5 y=431.0 length=430
I have been trying to convert them to FASTA (just the sequence name and the sequence itself), but this simple task is becoming difficult. The formatted file has problems, and mothur tells me that the file is empty. I just can't get a FASTA file with the right format. Do you have any advice on how to deal with this type of file? Another big problem is that we don't have Linux in the lab, so I am limited to Windows options :?
Thank you for your help!!
Welcome to the mothur community! I would be happy to help. The extra information after the sequence name shouldn't be causing the empty file error. Mothur treats that information as a comment. You could have a sequence like this:
mySequence I am a really cool sequence, :ugeek: .
Could you tell me what version of mothur you are using and what command gave you the empty file error?
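If you still want a FASTA file with only the sequence name on each header line, here is a minimal Python sketch that runs on Windows. It assumes the header lines in your file start with ">" and that the name is the first whitespace-delimited token (spaces or tabs). The function name `strip_fasta_comments` and the sequence `ACGTACGT` are placeholders for illustration, not anything from mothur or your data.

```python
def strip_fasta_comments(lines):
    """Yield FASTA lines with each header reduced to '>name'.

    Assumption: header lines start with '>' and the sequence name is
    the first whitespace-delimited token (spaces or tabs).
    """
    for line in lines:
        line = line.rstrip("\r\n")  # handles Windows and Unix line endings
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            yield ">" + name
        elif line:
            # Sequence lines pass through unchanged; blank lines are dropped.
            yield line


# Placeholder record: the header mimics the one in the question,
# the sequence is made up.
example = [
    ">an-52_GO0Z3GB03HJJ8X\tbdiffs=0(match) fpdiffs=0(match)\trank=0000442 x=2976.5 y=431.0 length=430\n",
    "ACGTACGT\n",
]
cleaned = list(strip_fasta_comments(example))
print("\n".join(cleaned))
```

To use it on a real file, you would pass an open file handle instead of the `example` list and write the yielded lines to a new file.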
Thank you for your help!
I am using mothur v.1.34.1
I am trying to run split.group. It says:
[ERROR]; your fasta file contains 0 valid sequences and your group file contain 181290, please correct. Did you forget to include the file name?
For that reason I thought that maybe the FASTA file was not in the right format. Anyway, I got the same error with both the original FASTA file and the one I formatted. I don't know if this information is useful to you: I opened the FASTA file in Notepad, just to have a look, and I saw that the extra information after the sequence name is tab-separated. I mean:
an-52_GO0Z3GB03HJJ8X (tab) bdiffs=0(match) fpdiffs=0(match) (tab) rank=0000442 x=2976.5 y=431.0 length=430
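One way to look at the file beyond Notepad is to count how many lines look like FASTA headers. The sketch below is only a rough check under the assumption that a valid header line starts with ">"; it does not reproduce mothur's own parsing, and the helper name `summarize_fasta` and the sample lines are hypothetical.

```python
def summarize_fasta(lines):
    """Return (header_count, sequence_line_count) for an iterable of lines.

    Assumption: a header is any non-blank line starting with '>';
    every other non-blank line is counted as sequence.
    """
    headers = 0
    sequence_lines = 0
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith(">"):
            headers += 1
        else:
            sequence_lines += 1
    return headers, sequence_lines


# Made-up sample: one well-formed record with a tab-separated comment.
sample = [
    ">an-52_GO0Z3GB03HJJ8X\tbdiffs=0(match) fpdiffs=0(match)\n",
    "ACGTACGT\n",
]
summary = summarize_fasta(sample)
print(summary)

# On a real file (hypothetical name), something like:
#     with open("my_reads.fasta") as handle:
#         print(summarize_fasta(handle))
```

If the header count comes back as 0 for the real file, the header lines are probably not starting with ">", which would be worth mentioning in a reply.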