I am trying to use the make.sra command to prepare the files I need to submit to NCBI. I am running into an issue because practically all the sequences wind up in the scrap file.
Here is the basic path I followed. I sequenced on an Illumina MiSeq and combined the forward and reverse reads (which were in two fastq files) using make.contigs. I converted the resulting fasta and qual files back into a fastq file using make.fastq. I made an oligos file with the barcodes and the subset of samples that I wanted in the final file. Then I created a mimark file (.tsv) from that oligos file with the make.mimark command and filled in the necessary information. Finally, I created a project file (.project) and filled in the necessary information for that as well. Then I used those four files in the make.sra command.
I am not sure which part is causing the confusion or if I have used an incorrect file type. Any advice would be greatly appreciated.
There is a comment about the quality scores output by make.contigs at the bottom of the corresponding wiki page. If I understand it correctly, you should not rebuild a fastq file from them.
Thanks for your reply. Then how do I input a fastq file?
Unfortunately, I don’t know… I am trying to prepare a dataset to submit to the SRA and I can’t figure out how to do it… I got the metadata file OK, but for the sequence file itself I’m stuck.
You are correct: the quality scores output by the make.contigs command cannot be used to create a new fastq file. The make.sra command expects the fastq files you gave to make.contigs as input. Have you tried that?
I see. I guess my question now is how I go about separating out the fastq files that I put into make.contigs. I have a couple of projects that I sequenced all at once and I needed to sort through them by the barcodes.
Thanks for your response.
Sarah’s answer to my post, ‘How to prepare data for SRA submission?’, should help you. You need to prepare the list of sequences to exclude (.bad.accnos) and use remove.seqs to delete them from your fastq files (with the parameter accnos=.bad.accnos). remove.seqs will output a *.pick.fastq that you can use with make.sra and submit to SRA.
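To make the idea concrete, here is a rough Python sketch of what filtering a fastq file against a list of bad names amounts to. This is not mothur's implementation of remove.seqs, only an illustration; the function name drop_reads is made up, and it assumes plain four-line fastq records whose names match the entries in the accnos list exactly.

```python
def drop_reads(fastq_lines, bad_names):
    """Yield the lines of every fastq record whose name is not in bad_names.

    Assumption: each record is exactly four lines
    (header, sequence, '+', qualities) with no line wrapping.
    """
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # The read name is the first token of the header, without the '@'.
            name = record[0][1:].split()[0]
            if name not in bad_names:
                yield from record
            record = []


if __name__ == "__main__":
    # Hypothetical two-read example.
    reads = [
        "@read1 1:N:0:1\n", "ACGT\n", "+\n", "IIII\n",
        "@read2 1:N:0:1\n", "TTTT\n", "+\n", "IIII\n",
    ]
    kept = list(drop_reads(reads, {"read2"}))
    print("".join(kept), end="")
```

The same list of names has to be applied to both the forward and the reverse file so that the pairs stay in sync.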
But I think that only removes individual sequences. I want to remove entire groups, and my reads have not been assigned to groups yet.
To remove entire groups in make.contigs or make.sra, set the group name to ‘ignore’ in the oligos file.
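If it helps to double-check an oligos file before running, a small Python sketch like the one below lists which groups would be kept and how many barcodes are set to ‘ignore’. It is only a sanity check, not part of mothur; the function name split_groups, the barcodes, and the sample names are invented, and it assumes the group name is the last column of each barcode line.

```python
def split_groups(oligos_text):
    """Return (kept_group_names, ignored_barcode_count) for barcode lines."""
    kept = []
    ignored = 0
    for line in oligos_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment
        if fields[0].lower() != "barcode":
            continue  # primers and other line types are skipped in this sketch
        group = fields[-1]  # assumption: the group name is the last column
        if group == "ignore":
            ignored += 1
        else:
            kept.append(group)
    return kept, ignored


if __name__ == "__main__":
    # Hypothetical oligos content with one group set to 'ignore'.
    example = (
        "barcode ACGTACGT sampleA\n"
        "barcode TGCATGCA ignore\n"
        "barcode GATTACAG sampleB\n"
    )
    kept, ignored = split_groups(example)
    print("kept groups:", kept)
    print("ignored barcodes:", ignored)
```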
Thanks. That helps me to sort out how to remove groups.
So now I have four files: a file that lists my forward fastq file, my reverse fastq file, and the fastq index file (which shows which barcode goes with which sequence), followed by ‘NONE’; an oligos file in which each line says ‘barcode’, then the sequence for that barcode, and then either ‘ignore’ or the group name; a project file; and finally the mimark .txt file.
However, when I try to run it, it says ‘[ERROR]; Could not open’ over and over and everything winds up in the scrap folder. I am worried that this is because of the file that lists my various fastq files. It has only one line and I am not sure this is correct.
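One way to narrow down a ‘Could not open’ message is to confirm that every file named in the listing file really exists where it is expected. The Python sketch below does that check outside of mothur. The function name missing_files is made up, and it assumes that every column other than ‘NONE’ is a file name and that relative names sit in the same folder as the listing file; mothur itself may look in other locations.

```python
import os


def missing_files(file_list_path):
    """Return (line_number, entry) pairs for listed files that cannot be found."""
    base_dir = os.path.dirname(os.path.abspath(file_list_path))
    problems = []
    with open(file_list_path) as handle:
        for line_number, line in enumerate(handle, start=1):
            for entry in line.split():
                if entry.upper() == "NONE":
                    continue  # placeholder, not a file name
                # Assumption: relative names are resolved against the
                # folder that contains the listing file.
                if os.path.isabs(entry):
                    path = entry
                else:
                    path = os.path.join(base_dir, entry)
                if not os.path.isfile(path):
                    problems.append((line_number, entry))
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_files.py <listing file>
    for line_number, entry in missing_files(sys.argv[1]):
        print(f"line {line_number}: cannot find {entry}")
```

If this reports nothing, the names in the listing file are probably fine and the problem lies elsewhere.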
Thanks again for your help thus far.