Hi. We sent our samples to the MrDNA service for Illumina 16S amplicon sequencing. I am finding it very difficult to get the files they uploaded to BaseSpace ready for the Mothur Illumina SOP. Has anyone used MrDNA output files before? I could use some help or advice.
Their library preparation produces reads that still carry barcodes and that can appear in either orientation in the R1 and R2 files. From the readme file they provide:
- To keep amplification bias to a minimum MR DNA does not use long concatamer primers as part of Illumina data (ie 50bp of linker and barcode and a 20bp primer). We do create actual libraries out of each of our individual amplicons. This results in the amplicons being found in both 5’-3’ as usual and 3’-5’ orientation in the r1 and r2 files, this is normal for ligated libraries.
Note the R1 and R2 are both in the 5’-3’ orientation as raw files.
a. Forward primer format BARCODE-FORWARD PRIMER (can be found in R1 and R2)
b. Reverse primer format REVERSE PRIMER (matched pair can be found in R1 and R2)
- Example of R1 and R2 format … standard mixed pair format
R1 file:
Sequence 1 barcode-forward primer-sequence
Sequence 2 reverse primer-sequence
Sequence 3 barcode-forward primer-sequence
…etc
R2 file:
Sequence 1 reverse primer-sequence
Sequence 2 barcode-forward primer-sequence
Sequence 3 reverse primer-sequence
…etc
I am not sure make.contigs will work with this structure of the R1 and R2 fastq files.
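To make the problem concrete, here is a minimal Python sketch of one possible workaround: re-sorting each pair so that the read carrying the barcode and forward primer always ends up in the first file. This is not something MR DNA or mothur provides. The function names, the `max_offset` default (how far into the read the primer may start, to leave room for the barcode) and the choice to drop pairs with no recognizable forward primer are all assumptions made for illustration, and the output is untested with make.contigs.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes,
# so degenerate primers can be matched against reads.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a (possibly degenerate) primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def read_fastq(path):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ file."""
    with open(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                return
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            qual = handle.readline().rstrip("\n")
            yield header, seq, qual


def write_fastq(records, path):
    """Write (header, sequence, quality) records as FASTQ."""
    with open(path, "w") as out:
        for header, seq, qual in records:
            out.write(f"{header}\n{seq}\n+\n{qual}\n")


def has_forward_primer(seq, fwd_re, max_offset):
    """True if the forward primer starts within the first max_offset bases,
    i.e. right after a barcode of at most max_offset bases."""
    match = fwd_re.search(seq)
    return match is not None and match.start() <= max_offset


def reorient_pairs(pairs, forward_primer, max_offset=12):
    """Yield (forward_record, reverse_record) for each input pair.

    Pairs are swapped when the forward primer is found in the second read;
    pairs where neither read shows the forward primer are dropped.
    """
    fwd_re = primer_regex(forward_primer)
    for rec1, rec2 in pairs:
        if has_forward_primer(rec1[1], fwd_re, max_offset):
            yield rec1, rec2
        elif has_forward_primer(rec2[1], fwd_re, max_offset):
            yield rec2, rec1

# Hypothetical usage (file names and primer are placeholders):
#   pairs = zip(read_fastq("R1.fastq"), read_fastq("R2.fastq"))
#   sorted_pairs = list(reorient_pairs(pairs, "YOUR_FORWARD_PRIMER"))
#   write_fastq((fwd for fwd, _ in sorted_pairs), "forward.fastq")
#   write_fastq((rev for _, rev in sorted_pairs), "reverse.fastq")
```

Each header travels with its read, so after a swap the read-number field in the header no longer matches the file it is written to. Whether that matters for make.contigs is something I have not verified.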
They also provide the assembled contigs, so I have one .fasta file and one .qual file containing the sequences from all samples, assembled into contigs, all in 5’-3’ orientation and still carrying barcodes and primers. If I start the SOP from this fasta file with the trim.seqs command, how do I create the groups file?
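For reference, a mothur group file is just two tab-separated columns: sequence name and group (sample) name. A fallback would be to build it by hand from the barcodes at the start of each contig. The Python sketch below shows the idea. The barcode-to-sample mapping, file names and function names are hypothetical, and it only accepts exact barcode matches at the very start of the sequence, with no tolerance for mismatches.

```python
def read_fasta(path):
    """Yield (name, sequence) from a FASTA file.

    The name is the first whitespace-delimited word of the header line.
    """
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assign_groups(records, barcode_to_sample):
    """Yield (sequence name, sample) for sequences that start with a known barcode.

    Sequences whose start matches no barcode are skipped.
    """
    for name, seq in records:
        seq = seq.upper()
        for barcode, sample in barcode_to_sample.items():
            if seq.startswith(barcode.upper()):
                yield name, sample
                break


def write_groups(assignments, path):
    """Write one 'name<TAB>group' line per assigned sequence."""
    with open(path, "w") as out:
        for name, sample in assignments:
            out.write(f"{name}\t{sample}\n")

# Hypothetical usage (barcodes, sample names and file names are placeholders):
#   barcodes = {"BARCODE_1": "sampleA", "BARCODE_2": "sampleB"}
#   write_groups(assign_groups(read_fasta("contigs.fasta"), barcodes),
#                "contigs.groups")
```

The sequence names written here would have to match the names in whatever fasta file is used downstream, and the barcodes and primers would still need to be trimmed from the sequences themselves.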
Any help is welcome!
Thank you!