Update: fixed. How to make a .groups file

ch3coch3 · April 13, 2016, 11:09pm

Hi all,

Before I ask the question, a brief introduction to the situation: we have 16S MiSeq sequencing data available as [xx.forward.fastq, xx.reverse.fastq] (raw data), xx.full.fasta (the company merged the raw data into one file for us), and xx.pr.fasta (they made the contigs and trimmed them for us).

Normally we make contigs from the fastq files, then use the oligos file and the xx.contigs.fasta file to run trim.seqs and get the trimmed sequences and a groups file. Then everyone is happy. But the problem here is that when we do that, we lose half of the reads and the downstream data mining is horrible.

So we have to use their xx.pr.fasta file, which is decent in sequence quality and number of reads. The obstacle is that we can’t get a .groups file from the xx.pr.fasta file, or rather, we don’t know how yet. It’s my understanding that xx.pr.fasta is equivalent to the stability.trim.contigs.good.fasta file in the SOP. There, the .groups file is generated by make.contigs; in our previous case, it was made by trim.seqs with the .oligos file. But with this already-trimmed data set, how do we make a .groups file?

To summarize: I have a trimmed file, xx.pr.fasta, which has been assembled and trimmed, is equivalent to stability.trim.contigs.good.fasta, and is ready for the unique.seqs command. How do I make a .groups file so that I can follow the SOP for the data mining? Thanks much!!

The xx.pr.fasta data looks like this:

CA5V::D00420:88:H5NGNBCXX:2:2206:14967:51888 1:N:0:4
TACGGAGGGTGCAAGCGTTATCCGGATTCACTGGGTTTAAAGGGTGCGTAGGCGGGTATGTAAGTCAGTGGTGAAATACCGGAGCTTAACTTCGGAACTGCCATTGATACTATATACCTTGAATATTGTGGAGGTAAGCGGAATATGTCATGTAGCGGTGAAATGCTTAGAGATGACATAGAACACCGATTGCGAAGGCAGCTTACTACGCAAATATTGACGCTGAGGCACGAAAGCGTGGGGATCAAACAGGATTAGATACCCGCGTAGTCC
CA1V::D00420:88:H5NGNBCXX:2:2205:3074:49079 1:N:0:4
TACGTAGGGTCCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTGTGCAAGACCGATGTGAAATCCCCGAGCTTAACTTGGGAATTGCATTGGTGACTGCACGGCTAGAGTGTGTCAGAGGGGGGTAGAATTCCACGTGTAGCAGTGAAATGCGTAGAGATGTGGAGGAATACCGATGGCGAAGGCAGCCCCCTGGGATAACACTGACGCTCATGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCCGGTAGTCC
CA13E::D00420:88:H5NGNBCXX:2:2114:9286:22238 1:N:0:4
TACGTAGGTGGCGAGCGTTGTCCGGATTTACTGGGCGTAAAGGGAGCGTAGGCGGATTTTTAAGTGAGATGTGAAATACTCGGGCTTAACCTGAGTGCTGCATTTCAAACTGGAAGTCTAGAGTGCAGGAGAGGAGAAGGGAATTCCTAGTGTAGCGGTGAAATGCGTAGAGATTAGGAAGAACACCAGTGGCGAAGGCGCTTCTCTGGACTGTAACTGACGCTGAGGCTCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCCGGTAGTC
CA19V::D00420:88:H5NGNBCXX:2:2107:12094:71921 1:N:0:4
TACATAGGTTGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCGTCTGTAGGTTGTATGTTAAGTCTGGCGTGAAAACTTGGGGCTCAACCCCAAATTGCGTTGGATACTGGCATACTAGTATTGTGTAGAGGTTAGCGGAATTCCTAGCGAAGCGGTGAAATGCGTAGATATTAGGAAGAACATCAACATGGCGAAGGCAGCTAACTGGGCACATATTGACACTGAGAGACGAAAGCGTGGGGAGCAAATAGGATTAGATACCCGTGTAGTCC

ch3coch3 · April 14, 2016, 5:17pm

Update: problem solved. I used the list.seqs command along with some Excel tricks.
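For anyone landing here with the same problem, the spreadsheet step could probably be scripted instead. The following is only a rough sketch, not the exact procedure used above. It assumes that the file is standard FASTA with header lines starting with ">", that the sample name is whatever comes before the first "::" in each header (as in the excerpt above), and that the group file is the usual two tab-separated columns of sequence name and group name. The function names and the output file name are made up for illustration.

```python
def make_group_lines(fasta_lines, sep="::"):
    """Return 'sequence_name<TAB>group' strings, one per FASTA header.

    Assumption: the group (sample) name is the part of the sequence
    name before the first occurrence of `sep`.
    """
    rows = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            continue  # bare ">" with no name
        name = fields[0]  # name ends at the first whitespace
        group = name.split(sep, 1)[0]  # whole name if sep is absent
        rows.append(f"{name}\t{group}")
    return rows


def write_group_file(fasta_path, group_path):
    """Read a FASTA file and write the matching group file."""
    with open(fasta_path) as fasta, open(group_path, "w") as out:
        for row in make_group_lines(fasta):
            out.write(row + "\n")


# Small in-memory example using two headers from the post
# (sequences shortened for readability).
example = [
    ">CA5V::D00420:88:H5NGNBCXX:2:2206:14967:51888 1:N:0:4",
    "TACGGAGGGTGCAAGCG",
    ">CA13E::D00420:88:H5NGNBCXX:2:2114:9286:22238 1:N:0:4",
    "TACGTAGGTGGCGAGCG",
]
for row in make_group_lines(example):
    print(row)
```

On a real file this would be called as write_group_file("xx.pr.fasta", "xx.pr.groups"), where the output name is just a placeholder. The names in the group file have to match the sequence names mothur sees in the fasta file, so it is worth comparing a few lines against the output of list.seqs before going further.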
