MiSeq SOP using multiplexed R1.fastq and R2.fastq

Hi, the sequencing facility sent us the files R1.fastq and R2.fastq (not demultiplexed) and a map file (.txt) with the sample names and barcode descriptions. How can we start the Mothur MiSeq SOP using these files? Is there a command such as make.file, or how should we demultiplex the files to use them in Mothur? Thanks!!

You can do this in make.contigs if you give it an oligos file. Give that a shot and let us know how you fare.

Pat

Thank you so much for the answer! We tried that, but it did not work, because the files had already been processed by the sequencing facility. The reads were apparently demultiplexed already and then merged back into a single pair of files per user (R1.fastq and R2.fastq).

The sequences look like:
[the sample names and barcode sequences are in the headers, e.g. @28 or @51 for samples 1-96, and barcodes such as orig_bc=AACTAGTTCAGG]:

@28_1 M01511:118:000000000-AN3CW:1:1101:14207:1449 1:N:0:0 orig_bc=AACTAGTTCAGG new_bc=AACTAGTTCAGG bc_diffs=0
TTCACCCCCCACCCTTTCCCCGCCCCGCCTCACTTACCGGCCAGGAAGCCCCCCTCCCCCCCGGTGGTCCTCCCGATATCCACCCAATTCACCCCTACCCCTGGAATTCCCCTTCCCTCCCCTCCACTCCAGCCAGGCCAGTTTGCAATGCAGTTCTCGGGTTGAGCCCGAAGATTTCACATCACACTTAACCAGCCGCCTACACGCGCTTTACGCCCAGTAATTCTGGATAACGCTAGCCCCCTACGTA
+
/–;----99-9----9;-9—9;—///9/A@99-FA9;//–999;–9-9–>@;9-9;—A<@9=;-9;//-9…;/0;00;=.EC.;.9;./GGC;0;…:C.C.A<<<.–<>1.0F>0//?0??10/HGDCF?11?2?1FFB//>/FFFFGGGECF2F2F2FFFGGB21@22BBF@F0A>>EEGHHHFEEAGEE1GGGEEGHCHFHHFHFAGF2BAF0EEE1AFCDDAAAFFAAAAA
@51_2 M01511:118:000000000-AN3CW:1:1101:14625:1451 1:N:0:0 orig_bc=TATCGACACAAG new_bc=TATCGACACAAG bc_diffs=0
CGTGTGCGCCCCTAGACTTCGTGCCACAGCGTCAGGAACGGTCCAGAGACCCGCCCTCGCCACTGGTCTTCCTTACGATAGCTACGCATTTCACCGCTACACCGTGAATTCCAGGCACCTCGCCAGTCCTCAAGCACGGCAGTATCGAATGCAGTCTCGGAGTTAAGCCACGAGATTTCACACCCGACTTACCGCGCCGCCTACGCACCCTTTACGCCCAATGAATCCGAACAACGCTTGAGACCTCTGTA

How could we run “make.contigs” so that we get the group counts, or how could we produce the group file needed for “screen.seqs”? Thank you again!

They gave you one R1 and one R2 but no I (index) file? Ask for R1 and R2 files for each sample, or for the index file(s). Otherwise you could make a group file from the sequence names with (g)awk, using both "_" and " " as field separators. I have to look at the gawk manual every time I need to do this.
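
For anyone who prefers Python to awk, here is a minimal sketch of the same idea. It assumes the headers look like the two shown above (`@28_1 M01511:...`), that the sequence name is the first whitespace-delimited token, and that the sample is everything before the last "_" in that token. It also assumes standard four-line FASTQ records. The function names and file names are made up for illustration, and it is worth checking that the names it writes match the names in the fasta file that make.contigs produces before relying on it.

```python
def group_rows(handle):
    """Yield (sequence_name, sample) pairs from a 4-line-per-record FASTQ handle."""
    for line_no, line in enumerate(handle):
        # Only every fourth line is a header; quality lines may also start
        # with "@", so go by position rather than by the first character.
        if line_no % 4 != 0:
            continue
        if not line.strip():
            continue  # tolerate a trailing blank line
        if not line.startswith("@"):
            raise ValueError(f"line {line_no + 1} does not look like a FASTQ header")
        seq_name = line[1:].split()[0]        # e.g. "28_1"
        sample = seq_name.rsplit("_", 1)[0]   # e.g. "28"
        yield seq_name, sample


def write_group_file(fastq_path, group_path):
    """Write a tab-separated 'sequence name <TAB> group' file (hypothetical helper)."""
    with open(fastq_path) as fastq, open(group_path, "w") as out:
        for seq_name, sample in group_rows(fastq):
            out.write(f"{seq_name}\t{sample}\n")


# Example call with assumed file names:
# write_group_file("R1.fastq", "samples.groups")
```

Because R1 and R2 should carry the same read names, running this on R1.fastq alone ought to be enough. The resulting file is the kind of two-column group file that can be given to screen.seqs through its group option.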

It will be difficult to ask the facility for a different output, but I will try your idea (using “_” and " " to make the group file). Thanks @kmitchell!

You should send your samples to a different facility (like mine!) :lol:

haha what is the website of the facility? special discount for the forum users? :lol: