Help with make.contigs with index and oligos files

Hi
I got the results from a sequencing provider. I was given the raw R1 and R2 fastq files, plus an oligos.oligos file and two index files (R1 and R2); see below.

I should also add checkorient=t as both R1 and R2 can contain primers and barcodes.

Is the oligos file correct as they sent it, or should I add the primer sequences?

How should I run the make.contigs command?
Thank you!!
Susi

oligos.oligos file

BARCODE TCGCGGCT NONE R1.2.T0
BARCODE TCGCTAAG NONE R1.2.T1
BARCODE TCGCTTAA NONE R1L.T2
BARCODE TCGGGCCA NONE R1L.T3
BARCODE TCGCTGGC NONE R1S.T2
BARCODE TCGGGCAC NONE R1S.T3
BARCODE TCGGCAGA NONE R2L.T2
BARCODE TCGGTTGC NONE R2L.T3
BARCODE TCGGATCC NONE R2S.T2
BARCODE TCGGTGAC NONE R2S.T3
BARCODE TCGCGTCA NONE R3.T0
BARCODE TCGCTCAC NONE R3.T1
BARCODE TCGGCGTG NONE R3L.T2
BARCODE TCGTATGT NONE R3L.T3
BARCODE TCGGCCAC NONE R3S.T2
BARCODE TCGTAGTC NONE R3S.T3

R1 index file (first lines):
@M02542:133:000000000-CL9CW:1:1101:20180:1189 2:N:0:CCGTCC
TCGTATGT
+
-ACCCGGD
@M02542:133:000000000-CL9CW:1:1101:15048:1195 2:N:0:CCGTCC
TCGGTGAC
+
<CC<CFFG
@M02542:133:000000000-CL9CW:1:1101:8828:1199 2:N:0:CCGTCC
TCGCGTCA

R2 index file (first lines):
@M02542:133:000000000-CL9CW:1:1101:20180:1189 2:N:0:CCGTCC
TCGTATGT
+
-ACCCGGD
@M02542:133:000000000-CL9CW:1:1101:15048:1195 2:N:0:CCGTCC
TCGGTGAC
+
<CC<CFFG
@M02542:133:000000000-CL9CW:1:1101:8828:1199 2:N:0:CCGTCC
TCGCGTCA

From the oligos file it looks like there are no reverse barcodes, but you indicated that you have forward and reverse index files. That is a bit confusing, as it would indicate you do have reverse barcodes to remove. Could you clarify this?
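As a quick check, and assuming both index files are plain FASTQ with four lines per record, a short Python sketch could compare the index reads record by record. The function names here are illustrative and not part of mothur; the sample records are copied from the excerpts in the question.

```python
from itertools import zip_longest

def index_reads(fastq_lines):
    """Yield the sequence line (second of every four) of each FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()

def count_differences(r1_lines, r2_lines):
    """Count records whose index read differs between the two files."""
    pairs = zip_longest(index_reads(r1_lines), index_reads(r2_lines))
    return sum(1 for a, b in pairs if a != b)

# First two records of each index file, copied from the excerpts in the question.
r1_excerpt = [
    "@M02542:133:000000000-CL9CW:1:1101:20180:1189 2:N:0:CCGTCC",
    "TCGTATGT", "+", "-ACCCGGD",
    "@M02542:133:000000000-CL9CW:1:1101:15048:1195 2:N:0:CCGTCC",
    "TCGGTGAC", "+", "<CC<CFFG",
]
r2_excerpt = list(r1_excerpt)  # the posted R2 excerpt is identical to the R1 excerpt

differences = count_differences(r1_excerpt, r2_excerpt)
print(differences)
```

On the real data you could pass two open file handles instead of the lists. If the count stays at zero over the whole file, the second index file probably carries no separate reverse barcode, though that is only an inference from the data and worth confirming with the provider.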

If you want to remove the primers they should be included in the oligos file as well.

If you want to run make.contigs with paired fastq files and paired index files, you would run the following:

mothur > make.contigs(ffastq=yourR1FastqFile, rfastq=yourR2FastqFile, findex=yourR1IndexFile, rindex=yourR2IndexFile, pdiffs=2, bdiffs=1, oligos=yourOligosFile)

The oligos file should look like:

primer forwardPrimer reversePrimer
barcode TCGCGGCT reverseBarcode(found in R2 Index) R1.2.T0
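Before running make.contigs it can help to confirm that the index reads actually match the barcodes in the oligos file. The following is a minimal Python sketch, assuming the four-column "BARCODE sequence NONE sample" layout shown in the question and exact matching only (it does not reproduce mothur's bdiffs logic). The function names are made up for illustration.

```python
from collections import Counter

def read_barcodes(oligos_lines):
    """Map barcode sequence -> sample name from 'BARCODE seq NONE sample' lines."""
    barcodes = {}
    for line in oligos_lines:
        fields = line.split()
        if len(fields) == 4 and fields[0].lower() == "barcode":
            barcodes[fields[1]] = fields[3]
    return barcodes

def tally_samples(reads, barcodes):
    """Count index reads per sample; reads with no exact match go to 'unmatched'."""
    return Counter(barcodes.get(read, "unmatched") for read in reads)

# A few lines from the posted oligos file and the three index reads shown in the excerpt.
oligos_lines = [
    "BARCODE TCGGTGAC NONE R2S.T3",
    "BARCODE TCGCGTCA NONE R3.T0",
    "BARCODE TCGTATGT NONE R3L.T3",
]
reads = ["TCGTATGT", "TCGGTGAC", "TCGCGTCA"]

tally = tally_samples(reads, read_barcodes(oligos_lines))
for sample, count in sorted(tally.items()):
    print(sample, count)
```

A large "unmatched" count on the full index file would suggest the barcodes in the oligos file do not correspond to that index file.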


Hi Sarah
The provider said nothing about whether barcodes were used on the forward primer only or on both primers. Actually, the information they sent is confusing. The instructions said:

“FORMAT FOR QIIME OR MOTHUR: Typically, our fastq files are fully RAW and untouched and unbinned etc. on basespace. For applications like Qiime2 or Mothur, you may wish to have individual fastq and index fastq files. EASY, on our website www.mrdnafreesoftware.com the fastq processor application can easily perform this task.”
So I did, and got the oligos and index1 and index2 files.

Then, they mention:
"Methods of MiSeq or hiseq or novaseq6000 when run with amplicons!
The 16S rRNA gene V4 variable region PCR primers 515/806 (OR whichever PRIMER set the investigator has SELECTED) with barcode on the forward primer were used in a 30-35 (depends on primers and DNA. Most studies use 30 cycles. Send inquiries to MR DNA as needed) PCR (5 cycles used on…"
So, I guess, they used barcodes only in the forward primer.

Last, they say:
"FORMAT OF RAW amplicon data on BASESPACE
1. To keep amplification bias to a minimum MR DNA does not use long concatamer primers as part of Illumina data (ie 50bp of linker and barcode and a 20bp primer). We do create actual libraries out of each of our individual amplicons. This results in the amplicons being found in both 5’-3’ as usual and 3’-5’ orientation in the r1 and r2 files, this is normal for ligated libraries.
Note the R1 and R2 are both in the 5’-3’ orientation as raw files.
a. Forward primer format BARCODE-FORWARD PRIMER (can be found in R1 and R2)
b. Reverse primer format REVERSE PRIMER (matched pair can be found in R1 and R2)
Example of R1 and R2 format … standard mixed pair format
R1 file
Sequence 1 barcode-forward primer-sequence
Sequence 2 reverse primer-sequence
Sequence 3 barcode-forward primer-sequence
…etc
R2 file
Sequence 1 reverse primer-sequence
Sequence 2 barcode-forward primer-sequence
Sequence 3 reverse primer-sequence
…etc"
I conclude they use barcodes only on the forward primer. But they say that to use their raw data with mothur we should generate the oligos, index1 and index2 files, which I did, and then I couldn’t work out how to use them with make.contigs.

Thank you!

Me again. In case you want to look at both index files, here is the link to my drive:


Thanks for the additional information. It looks like there are 2 index files provided because of this:

I would try running the following:

mothur > make.contigs(ffastq=yourR1FastqFile, rfastq=yourR2FastqFile, findex=yourR1IndexFile, pdiffs=2, bdiffs=1, oligos=yourOligosFile, checkorient=t)

Oligos File:

primer GTGCCAGCMGCCGCGGTAA GGACTACHVGGGTWTCTAAT V4
BARCODE TCGCGGCT NONE R1.2.T0
BARCODE TCGCTAAG NONE R1.2.T1

BARCODE TCGTAGTC NONE R3S.T3
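If you would rather not edit the oligos file by hand, the primer line can be prepended with a few lines of Python. This is only a sketch: the function name is hypothetical, and the primer sequences and the V4 label are the ones from the example above, which you should check against what the provider actually used.

```python
def add_primer_line(oligos_text, forward, reverse, label="V4"):
    """Return oligos text with one primer line placed before the existing lines."""
    lines = [ln for ln in oligos_text.splitlines() if ln.strip()]
    # Leave the file alone if it already contains a primer line.
    if any(ln.split()[0].lower() == "primer" for ln in lines):
        return "\n".join(lines) + "\n"
    primer_line = f"primer {forward} {reverse} {label}"
    return "\n".join([primer_line] + lines) + "\n"

# Two barcode lines from the posted oligos file, used here as sample input.
original = (
    "BARCODE TCGCGGCT NONE R1.2.T0\n"
    "BARCODE TCGCTAAG NONE R1.2.T1\n"
)

updated = add_primer_line(original, "GTGCCAGCMGCCGCGGTAA", "GGACTACHVGGGTWTCTAAT")
print(updated, end="")
```

For the real file, read its contents, pass them through the function, and write the result to a new file so that the provider's original stays untouched.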

Kindly,
Sarah
