Hello, good morning.
I am helping a colleague analyze MiSeq data. I have 12 libraries, each containing 50 pooled samples. The data were generated targeting the V3-V4 region (341F - 805R), with barcodes used to multiplex the experimental samples.
I built an oligos file for each library following the SOP established at the lab, and ran the make.contigs step on each library individually. My problem is that once the command runs, a varying percentage of sequences is demultiplexed and merged into contigs successfully, but most of them end up in the scrap file (e.g., 5224 seqs in contigs.fasta vs 1392786 seqs in scrap; 25131 vs 1616841, …). So I suspect the oligos file works, since reads are being assembled for some of the 50 samples per pool.
Both primers and barcodes are paired. Here is an example of what my oligos file looks like:
Interestingly, for 10 out of the 12 pools, barcode #50 is the one with the most success making contigs. Do you think it could be an issue with how I built the oligos file? I ran the following command:
make.contigs(ffastq=/home/lemv/fastq/FASTQ/Mos1_S4_L001_R1_001.fastq,rfastq=/home/lemv/fastq/FASTQ/Mos1_S4_L001_R2_001.fastq, oligos=/home/lemv/Oligos/oligosMos1.file, checkorient=t, processors=16)
When I check the scrap codes generated, it seems the reads are being scrapped because of mismatches in both barcodes and primers:
Ruk2_S2_L001_R1_001.scrap.contigs.fasta
M00485_502_000000000-CFK5Y_1_1113_15046_22480 | bf(bf) ee=1.21853 fbdiffs=1000(noMatch), rbdiffs=1000(noMatch) fpdiffs=16(noMatch), rpdiffs=1002(noMatch
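To see which check fails most often across a whole scrap file, rather than reading headers one by one, the fields could be tallied with a short script. This is only a minimal Python sketch, assuming every scrap header carries the `fbdiffs`/`rbdiffs`/`fpdiffs`/`rpdiffs` fields in the format shown above; `tally_scrap_fields` is a hypothetical helper name and not part of mothur.

```python
import re
from collections import Counter

# Matches fields such as "fbdiffs=1000(noMatch" in a scrap header line.
# Assumption: the header layout is the one shown in the example above.
FIELD_RE = re.compile(r"(fbdiffs|rbdiffs|fpdiffs|rpdiffs)=(\d+)\((\w+)")


def tally_scrap_fields(lines):
    """Return (number of headers seen, Counter of fields flagged noMatch)."""
    no_match = Counter()
    total = 0
    for line in lines:
        fields = FIELD_RE.findall(line)
        if not fields:
            continue  # sequence lines, or headers without diff fields
        total += 1
        for name, _diffs, status in fields:
            if status == "noMatch":
                no_match[name] += 1
    return total, no_match


# Example with the header quoted above.
example = (
    "M00485_502_000000000-CFK5Y_1_1113_15046_22480 | bf(bf) ee=1.21853 "
    "fbdiffs=1000(noMatch), rbdiffs=1000(noMatch) "
    "fpdiffs=16(noMatch), rpdiffs=1002(noMatch"
)
total, no_match = tally_scrap_fields([example])
for field, count in sorted(no_match.items()):
    print(f"{field}: {count} of {total} scrapped reads flagged noMatch")
```

For a real file, the open file handle can be passed in place of the list, e.g. `tally_scrap_fields(open(path))`. If nearly every read is flagged for the barcodes, that would point toward the oligos file or the read layout rather than read quality.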
I was considering using the strategy suggested in the following thread:
https://forum.mothur.org/t/all-sequences-in-scrap-after-make-contigs/20253
I don’t think the sequencing facility used linkers or adapters, so I don’t know why only some barcode-primer combinations are working (barcode #50 consistently being the best). Could it be a problem with the quality of the reads and not a computational mistake on my side?
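One way I could check whether the barcodes really sit at the very start of the reads (and are not shifted by a linker or spacer) is to count the most frequent read prefixes in the raw FASTQ and compare them with the barcodes in the oligos file. The following is a rough Python sketch, assuming standard 4-line FASTQ records; the function name `count_read_prefixes` and the default prefix length of 8 are hypothetical and should be set to the actual barcode length.

```python
from collections import Counter
from itertools import islice


def count_read_prefixes(fastq_lines, prefix_len=8, max_reads=100000):
    """Count the first prefix_len bases of each read in a FASTQ stream.

    Assumption: every record spans exactly 4 lines, so the sequence is
    the 2nd line of each record.
    """
    counts = Counter()
    seq_lines = islice(fastq_lines, 1, None, 4)  # lines 2, 6, 10, ...
    for seq in islice(seq_lines, max_reads):     # cap the work on big files
        counts[seq.strip()[:prefix_len].upper()] += 1
    return counts


# Tiny in-memory example; for real data pass an open file handle instead.
demo = [
    "@read1", "ACGTACGTCCTACGGG", "+", "IIIIIIIIIIIIIIII",
    "@read2", "ACGTACGTCCTACGGG", "+", "IIIIIIIIIIIIIIII",
    "@read3", "TTGACCAACCTACGGG", "+", "IIIIIIIIIIIIIIII",
]
for prefix, n in count_read_prefixes(demo).most_common(5):
    print(prefix, n)
```

If the most common prefixes match the listed barcodes only after skipping a few leading bases, or match their reverse complements, that would suggest the oligos file needs adjusting rather than a read-quality problem.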
Many thanks for the help!
Luis