FASTQ file contains both 16S and 18S reads with identical barcodes

Hi, a friend of mine asked me to help him process his Ion Torrent sequences. He provided me with a single fastq file and an Excel sheet containing sample IDs, barcodes and primers. I was planning to use trim.seqs to demultiplex the sequences but ran into a problem due to the sample set-up.

I realised that the fastq contains both 16S and 18S sequences. My friend sequenced 8 samples twice, once with 16S primers and once with 18S primers, and all the reads are in one fastq file.

The 16S and 18S samples have identical barcodes, but the primer sequences are different. Is there a way to split the fastq file so I can demultiplex and process the 16S set and the 18S set separately? Can I do it in mothur or should I use “split” or “grep” commands to search for the different primer sequences and split the file based on that?
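For reference, the grep-style split described above could be sketched in Python roughly as follows. This is only an illustration, not a mothur feature: the function names, the `search_window` value and the output file naming are assumptions, and the 18S primer has to be supplied by the user because it is not given in this thread. Degenerate IUPAC bases in a primer (such as the M in the 16S primer) are expanded into a regular expression, and only exact matches are accepted.

```python
import re
from itertools import islice

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer, possibly with degenerate bases, into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line FASTQ."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return
        yield tuple(record)


def assign_gene(seq, primer_patterns, search_window=60):
    """Return the name of the first primer found near the read start, else None.

    search_window is an assumed value: it only needs to cover the barcode,
    any adapter bases, and the primer itself.
    """
    head = seq[:search_window].upper()
    for gene, pattern in primer_patterns.items():
        if pattern.search(head):
            return gene
    return None


def split_fastq(in_path, out_prefix, primers, search_window=60):
    """Write one FASTQ per primer set plus an 'unassigned' file; return counts."""
    patterns = {gene: primer_regex(p) for gene, p in primers.items()}
    counts = {gene: 0 for gene in primers}
    counts["unassigned"] = 0
    outputs = {name: open(f"{out_prefix}.{name}.fastq", "w") for name in counts}
    try:
        with open(in_path) as handle:
            for record in read_fastq(handle):
                gene = assign_gene(record[1], patterns, search_window) or "unassigned"
                outputs[gene].write("\n".join(record) + "\n")
                counts[gene] += 1
    finally:
        for out in outputs.values():
            out.close()
    return counts
```

A call such as `split_fastq("x.fastq", "x", {"16S": "AGAGTTTGATCMTGGCTCAG", "18S": your_18S_primer})` would then write `x.16S.fastq`, `x.18S.fastq` and `x.unassigned.fastq`, where `your_18S_primer` is a placeholder for the real sequence. Barcodes are left on the reads, so each output can still be demultiplexed afterwards.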

Thank you very much for your help.

All the best.

In your oligos file, just provide the 16S or 18S primer sequence. After running trim.seqs you should rename the output to indicate which gene’s data is in the file.

Pat

Dear Pat,

Thank you so much for such a quick response. If I understand correctly, I should run the trim.seqs command twice, once with the 16S primer and once with the 18S primer?

I created one oligos file for the 16S samples and one for the 18S samples. The 16S file is shown below.

forward AGAGTTTGATCMTGGCTCAG
#reverse
barcode CTAAGGTAAC BS
barcode TAAGGAGAAC Ps
barcode AAGAGGATTC SS
barcode TACCAAGATC T_05
barcode CAGAAGGAAC T_08
barcode TTCGTGATTC S_6
barcode TTCCGATAAC S_7

I ran the trim.seqs as below:

mothur > trim.seqs(fasta=x.fasta, qfile=x.qual, oligos=16S.oligos)

The original fastq contains over 5 million reads, but after running trim.seqs I ended up with only 60+ sequences; everything else went into the scrap file.

Have you tried it with pdiffs=2, bdiffs=1?
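
To illustrate why allowing a few differences can rescue reads, here is a rough Python sketch of mismatch-tolerant matching. It is not mothur's implementation: the function names are made up, the barcode is assumed to start at the first base of the read with the primer immediately after it, and only substitutions are counted, whereas mothur's own matching may also handle insertions and deletions. The `bdiffs` and `pdiffs` arguments simply mirror the trim.seqs parameter names.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_diffs(pattern, read_segment):
    """Count substitutions between an oligo and an equally long read segment.

    Returns None when the read segment is shorter than the oligo.
    """
    if len(read_segment) < len(pattern):
        return None
    return sum(
        1
        for oligo_base, read_base in zip(pattern.upper(), read_segment.upper())
        if read_base not in IUPAC_SETS[oligo_base]
    )


def matches_oligos(seq, barcode, primer, bdiffs=0, pdiffs=0):
    """True if the read starts with barcode + primer within the allowed diffs."""
    barcode_diffs = count_diffs(barcode, seq[:len(barcode)])
    primer_start = len(barcode)
    primer_diffs = count_diffs(primer, seq[primer_start:primer_start + len(primer)])
    if barcode_diffs is None or primer_diffs is None:
        return False
    return barcode_diffs <= bdiffs and primer_diffs <= pdiffs
```

With the defaults of zero allowed differences, a single sequencing error in the barcode or primer is enough for a read to be rejected; raising the limits accepts such reads at the cost of a slightly higher risk of misassignment.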