getting a groups file for PacBio reads

Hello,
I think this is a pretty basic question, but is there a way to get a groups file for “single-end” PacBio CCS data? For paired-end data I know we just use the make.contigs command, but since I only have one “end” per sample, that command does not work. I know I can concatenate the fasta (or fastq) files with the merge.files command, but I don’t know how to preserve the group information. I am probably missing something basic, or maybe I will have to run each file individually. Thank you in advance for your help!

Thanks

Bob Nichols

Hi Bob,

I don’t know if this helps with your PacBio files, but I recently needed to create a groups file for multiple samples that I wanted to combine in one analysis. I was starting from one fastq file per sample (instead of the raw forward and reverse reads), so I could not use the make.contigs command. I used these commands:

mothur> fastq.info(fastq=SampleName.fastq)

  • Creates fasta and qual files from the fastq file. Run this once for each sample’s fastq file.

mothur> make.group(fasta=SampleName1.fasta-SampleName2.fasta-SampleName3.fasta-SampleName4.fasta-SampleName5.fasta, groups=SampleName1-SampleName2-SampleName3-SampleName4-SampleName5)

  • Assigns each sequence to its sample. Creates a file called “mergegroups”, which I renamed to “Sample.groups”.

mothur> merge.files(input=SampleName1.fasta-SampleName2.fasta-SampleName3.fasta-SampleName4.fasta-SampleName5.fasta, output=Sample.fasta)

  • Combines the per-sample fasta files into one. I then had my combined fasta file and its corresponding groups file, and moved happily on to further steps in the MiSeq SOP. :slight_smile:
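
For anyone curious what the make.group and merge.files steps above produce, here is a minimal Python sketch of the same idea done outside mothur. It is not how mothur implements these commands. It assumes a groups file is a two-column, tab-separated list of sequence name and group name, and that the sequence name is the header text up to the first whitespace. The function name `build_groups` and the file names in the usage note are hypothetical.

```python
def build_groups(sample_fastas, merged_fasta, groups_file):
    """Merge per-sample FASTA files and record which sample each read came from.

    sample_fastas: dict mapping a group (sample) name to its FASTA path.
    Returns a dict of group name -> number of sequences seen.
    """
    counts = {}
    with open(merged_fasta, "w") as fasta_out, open(groups_file, "w") as groups_out:
        for group, path in sample_fastas.items():
            counts[group] = 0
            with open(path) as fasta_in:
                for line in fasta_in:
                    if line.startswith(">"):
                        # Assumption: the sequence name is the header text
                        # up to the first whitespace character.
                        name = line[1:].split()[0]
                        groups_out.write(f"{name}\t{group}\n")
                        counts[group] += 1
                    # Copy the record through unchanged, making sure the last
                    # line of each file ends with a newline before the next file.
                    fasta_out.write(line if line.endswith("\n") else line + "\n")
    return counts
```

A call might look like `build_groups({"SampleName1": "SampleName1.fasta", "SampleName2": "SampleName2.fasta"}, "Sample.fasta", "Sample.groups")`. The returned counts are a quick way to check that every sample contributed reads before moving on.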

Hope this helps & good luck with your analysis,

René

Hey René

Thank you so much for the help! This worked perfectly!

Thanks again,

Bob