All sequences in scrap after make.contigs

Hello,

A bit of context: I have four pairs of fastq files (R1 and R2), one pair for each of four libraries, and each library contains approximately 50 samples. Additionally, I have paired primers and paired barcodes.

As a first step, I would like to demultiplex and assemble the reads using make.contigs. This is where I run into trouble: all of my sequences end up in L3_R1.scrap.contigs.fasta, and I wonder what went wrong.

The oligos file looks like this:

primer GTTYGATYMTGGCTCAG GCWGCCWCCCGTAGGWGT V1-V2

barcode CTGGATAA CTGGATAA A13

barcode ATAAGGTC ATAAGGTC A14

barcode AATAAGGA AATAAGGA A20

barcode TACTTATC TACTTATC A21

barcode ATCTCAGT ATCTCAGT A22

barcode GTCAACGT GTCAACGT A23

Might this be related to the fact that between barcode and primer there is another sequence (linker)?

However, adding the linker sequence to the oligo file doesn’t really solve the problem.

primer GTTYGATYMTGGCTCAG GCWGCCWCCCGTAGGWGT V1-V2

linker GAGCCGTAGCCAGTCTGC GCCGTGACCGTGACATCG

barcode CTGGATAA CTGGATAA A13

Then, I am getting another error:

mothur > make.contigs(ffastq=L3_R1.fastq, rfastq=L3_R2.fastq, oligos=L3_oligos_file.txt)

Using 16 processors.

[WARNING]: GCCGTGACCGTGACATCG is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring BARCODE.

[WARNING]: CTGGATAA is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring CTGGATAA.

[WARNING]: A13 is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring BARCODE.

[WARNING]: ATAAGGTC is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring ATAAGGTC.

[WARNING]: A14 is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring BARCODE.

[WARNING]: AATAAGGA is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring AATAAGGA.

[WARNING]: A20 is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring BARCODE.

[WARNING]: TACTTATC is not recognized as a valid type. Choices are forward, reverse, and barcode. Ignoring TACTTATC.

[ERROR]: cannot mix paired primers and barcodes with non paired or linkers and spacers, quitting.

Making contigs…

What is the right way to get make.contigs to work?

Many thanks! Joanna
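The final [ERROR] line in the log above suggests that an oligos file cannot combine paired primers or barcodes with linker or spacer entries. A minimal Python sketch of a pre-check for that combination is shown here. The function names and the rule used to recognise a sequence column (every character is an IUPAC nucleotide code) are assumptions made for illustration; this is not mothur's own parser.

```python
# Hypothetical pre-check, not part of mothur: flag oligos files that mix
# paired primers/barcodes with linker or spacer lines.
IUPAC_CODES = set("ACGTRYSWKMBDHVN")


def is_sequence(token):
    """Assumption: a column is a sequence if it only contains IUPAC codes."""
    return bool(token) and set(token.upper()) <= IUPAC_CODES


def mixes_paired_with_linkers(oligos_text):
    """Return True if paired primer/barcode lines and linker/spacer lines co-occur."""
    has_paired = False
    has_linker_or_spacer = False
    for line in oligos_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        kind = fields[0].lower()
        if kind in ("linker", "spacer"):
            has_linker_or_spacer = True
        elif kind in ("primer", "barcode"):
            # Two sequence columns in a row are treated as a paired entry.
            if len(fields) >= 3 and is_sequence(fields[1]) and is_sequence(fields[2]):
                has_paired = True
    return has_paired and has_linker_or_spacer
```

Calling mixes_paired_with_linkers on the second oligos file above (the one with the linker line) would report the conflict before make.contigs is run.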

Hi Joanna,
Welcome to the mothur community! The make.contigs command is not designed to be used with linkers or spacers. Also, mothur expects linkers and spacers to be for single direction reads only. Here’s a workaround that should help you process your data:

mothur > make.contigs(ffastq=L3_R1.fastq, rfastq=L3_R2.fastq) - assemble paired reads
mothur > trim.seqs(fasta=current, oligos=oligos1.txt, pdiffs=2, checkorient=t) - remove paired primers
mothur > trim.seqs(fasta=current, oligos=oligos2.txt, pdiffs=2, bdiffs=1, checkorient=t) - remove paired spacer and barcodes, create groups file

oligos1.txt:
primer GTTYGATYMTGGCTCAG GCWGCCWCCCGTAGGWGT

oligos2.txt:
primer GAGCCGTAGCCAGTCTGC GCCGTGACCGTGACATCG
barcode CTGGATAA CTGGATAA A13
barcode ATAAGGTC ATAAGGTC A14
barcode AATAAGGA AATAAGGA A20
barcode TACTTATC TACTTATC A21
barcode ATCTCAGT ATCTCAGT A22
barcode GTCAACGT GTCAACGT A23

Kindly,
Sarah Westcott
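To make the barcode step above more concrete, here is a rough Python sketch of how a contig could be assigned to a sample from paired barcodes while tolerating mismatches, in the spirit of bdiffs=1. It is only an illustration and not mothur's implementation. Two assumptions are made: the reverse barcode appears reverse-complemented at the 3' end of the contig, and the mismatch budget is counted across both ends together. The names assign_sample, hamming and revcomp are hypothetical.

```python
# Illustrative only: assign a contig to a sample using paired barcodes.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions; unequal lengths count as all mismatches."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def assign_sample(contig, barcodes, bdiffs=1):
    """barcodes maps sample name -> (forward barcode, reverse barcode).

    Returns the sample name, or None if no barcode pair is within bdiffs
    total mismatches or if the best match is ambiguous.
    """
    hits = []
    for name, (fwd, rev) in barcodes.items():
        head = contig[:len(fwd)]
        # Assumption: the reverse barcode sits reverse-complemented at the 3' end.
        tail = revcomp(contig[-len(rev):])
        diffs = hamming(head, fwd) + hamming(tail, rev)
        if diffs <= bdiffs:
            hits.append((diffs, name))
    hits.sort()
    if not hits:
        return None
    if len(hits) > 1 and hits[0][0] == hits[1][0]:
        return None  # tie between two samples: leave unassigned
    return hits[0][1]
```

With the barcodes from oligos2.txt loaded into a dictionary, a contig that starts with CTGGATAA and ends with its reverse complement would be assigned to A13.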

Thank you, Sarah, for your quick response and willingness to help; I really appreciate it. Unfortunately, the solution doesn’t seem to work. It doesn’t remove the primers; it just changes the orientation of the sequences. And although the barcodes are removed in the end (only the barcodes, not the primers), I still do not know which sequence belongs to which sample. Do you have any idea how to proceed?

I was also wondering about something else: cutadapt trims everything before the primer sequence. This is very useful because I do not have to deal with any sequences that come before it. Is this also possible in mothur?

Best, Joanna

Could you try this?

mothur > make.contigs(ffastq=L3_R1.fastq, rfastq=L3_R2.fastq) - assemble paired reads
mothur > trim.seqs(fasta=current, oligos=oligos2.txt, pdiffs=2, bdiffs=1, checkorient=t) - remove barcodes and paired spacer, create group file
mothur > trim.seqs(fasta=current, oligos=oligos1.txt, pdiffs=2, checkorient=t) - remove paired primers
mothur > list.seqs(fasta=current) - list sequences that passed all trim commands
mothur > get.seqs(group=current, accnos=current) - select reads from group file that passed all trim commands

Alternatively, the pcr.seqs command allows you to trim the reads to the location of the primers.

oligos3.txt:
barcode CTGGATAA CTGGATAA A13
barcode ATAAGGTC ATAAGGTC A14
barcode AATAAGGA AATAAGGA A20
barcode TACTTATC TACTTATC A21
barcode ATCTCAGT ATCTCAGT A22
barcode GTCAACGT GTCAACGT A23

oligos4.txt:
primer GTTYGATYMTGGCTCAG GCWGCCWCCCGTAGGWGT

mothur > make.contigs(ffastq=L3_R1.fastq, rfastq=L3_R2.fastq) - assemble paired reads
mothur > trim.seqs(fasta=current, oligos=oligos3.txt, bdiffs=1, checkorient=t) - remove barcodes and create group file
mothur > pcr.seqs(fasta=current, oligos=oligos4.txt, pdiffs=2, rdiffs=2) - remove primers
mothur > list.seqs(fasta=current) - list sequences that passed all trimming commands
mothur > get.seqs(group=current, accnos=current) - select reads from group file that passed all trimming commands
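The idea of trimming a read to the location of the primers can be sketched in a few lines of Python. This is a simplified illustration of the concept, not how pcr.seqs is implemented: it only accepts exact matches to the degenerate primers (no equivalent of pdiffs or rdiffs), and it assumes the reverse primer appears reverse-complemented near the 3' end of the contig. The helper names are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def to_regex(oligo):
    """Turn a degenerate oligo into a regex that matches its concrete variants."""
    return "".join(IUPAC[base] for base in oligo.upper())


def revcomp(oligo):
    """Reverse complement that also handles IUPAC ambiguity codes."""
    return oligo.upper().translate(COMPLEMENT)[::-1]


def trim_to_primers(contig, fwd_primer, rev_primer, keep_primers=False):
    """Cut a contig down to the region between the primers.

    Returns None if either primer is not found or they occur in the wrong order.
    """
    fwd = re.search(to_regex(fwd_primer), contig)
    rev_hits = list(re.finditer(to_regex(revcomp(rev_primer)), contig))
    if fwd is None or not rev_hits:
        return None
    rev = rev_hits[-1]  # use the match closest to the 3' end
    if rev.start() < fwd.end():
        return None
    if keep_primers:
        return contig[fwd.start():rev.end()]
    return contig[fwd.end():rev.start()]
```

Anything upstream of the forward primer (barcode, linker, extra bases) is discarded regardless of its length, which is the behaviour described for cutadapt earlier in the thread.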

The first option did not work; I still got the sequences in a changed orientation. However, pcr.seqs seems to work (thanks!), except that it doesn’t remove the primers, only the sequences before them (the linkers). Any idea how I can remove the primers cleanly using mothur? Many thanks, Joanna

Just to make sure we are talking about the same things… Your reads look like this, right?

seq1
barcodeString linkerString primerString…primerString linkerString barcodeString

or

seq1
CTGGATAA GAGCCGTAGCCAGTCTGC GTTYGATYMTGGCTCAG…GCWGCCWCCCGTAGGWGT GCCGTGACCGTGACATCG CTGGATAA

The pcr.seqs command scans each end of the read looking for the primers and trims to that point. I am wondering if we have the linkers and primers mixed up? If you want to send your fastq files to mothur.bugs@gmail.com I can take a closer look for you.

Yes, just without the spaces. Sometimes there are two additional bases between the barcodes and the linkers.

I don’t think the linkers and primers are mixed up, but maybe you can tell better. I sent you the first 1000 lines (head -1000) of the fastq files. Thank you again for your help.
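One way to check the layout discussed above on individual reads is to report where each element is found. The following Python sketch does this with exact matching of degenerate oligos; the function name locate_elements and the example read are made up for illustration and are not taken from the actual data.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def locate_elements(read, elements):
    """Return (label, start, end) for the first exact match of each oligo.

    start and end are None when the oligo is not found in the read.
    """
    found = []
    for label, oligo in elements:
        pattern = "".join(IUPAC[base] for base in oligo.upper())
        match = re.search(pattern, read)
        if match:
            found.append((label, match.start(), match.end()))
        else:
            found.append((label, None, None))
    return found


# Elements at the 5' end, using the oligos quoted in this thread.
elements = [
    ("barcode", "CTGGATAA"),
    ("linker", "GAGCCGTAGCCAGTCTGC"),
    ("primer", "GTTYGATYMTGGCTCAG"),
]

# Made-up read with two extra bases ("NN" stand-ins "AC") after the barcode.
read = "CTGGATAA" + "AC" + "GAGCCGTAGCCAGTCTGC" + "GTTTGATCCTGGCTCAG" + "ACGTACGT"
for label, start, end in locate_elements(read, elements):
    print(label, start, end)
```

If the primer is reported at a position well beyond the combined length of the barcode and linker, that would hint at additional bases in between, such as the two extra bases mentioned above.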

Hi Joanna,

Thanks for sending your files. I ran the following commands without issue.

oligos2.txt - partial list of barcodes
barcode CTGGATAA CTGGATAA A13
barcode ATAAGGTC ATAAGGTC A14
barcode AATAAGGA AATAAGGA A20
barcode TACTTATC TACTTATC A21
barcode ATCTCAGT ATCTCAGT A22
barcode GTCAACGT GTCAACGT A23

oligos.txt
primer GTTYGATYMTGGCTCAG GCWGCCWCCCGTAGGWGT

mothur > make.contigs(ffastq=L3_R1.fastq, rfastq=L3_R2.fastq) - assembled 250 reads
mothur > trim.seqs(fasta=current, oligos=oligos2.txt, bdiffs=1, checkorient=t) - found 41 reads from the samples A13,A14,A20,A21,A22,A23
mothur > pcr.seqs(fasta=current, oligos=oligos.txt, pdiffs=2, rdiffs=2) - removes primers from all 41 reads

Kindly,
Sarah
