split.groups

Fabian · May 15, 2013, 4:32am

Hi,

I would like to send my data to a repository. I figured the best format for my data set would be to have a fasta file for each tag/group in the groups file. Split.groups does this nicely. However, I would like to include the read quality data from the .qual files in my submission so that people who want to use the data can do the quality filtering they like.

My plan was to use trim.seqs with an oligos file but without any further filtering, then use split.groups to get a ‘raw’ data file for each tag/group. The problem is that I lose the quality data along the way. What I need is to split the quality data from the .qual file into groups, and the quality files should also have the primers and tags trimmed off so that they correspond to the fasta files.

Does this make sense? Is there a way to do it with MOTHUR? Would that be worth implementing?

Thank you and keep up the great work,
Fabian

westcott · May 15, 2013, 12:01pm

You could use the list.seqs and get.seqs commands to select the quality data for each group. It's a bit tedious, but it's a solution.

list.seqs(fasta=fastaFileGroup1)
get.seqs(qfile=yourQualityFile, accnos=current)
# change the output file name so it's not overwritten
list.seqs(fasta=fastaFileGroup2)
get.seqs(qfile=yourQualityFile, accnos=current)
…
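
For anyone who would rather script the loop than repeat it by hand, the same selection can be sketched outside mothur. This is only a rough Python illustration of what the list.seqs / get.seqs pair does for one group, not mothur's own implementation. It assumes the sequence name is the first whitespace-separated token after `>` in both the fasta and the .qual file, and the function names `read_ids` and `filter_qual` are made up for this example.

```python
def read_ids(fasta_path):
    """Collect sequence names (first token after '>') from a fasta file."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def filter_qual(qual_path, keep_ids, out_path):
    """Copy only the .qual records named in keep_ids; return how many were kept."""
    kept = 0
    writing = False
    with open(qual_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # A header line starts a new record: decide whether to keep it.
                fields = line[1:].split()
                writing = bool(fields) and fields[0] in keep_ids
                if writing:
                    kept += 1
            if writing:
                dst.write(line)
    return kept
```

Calling `filter_qual(qual_file, read_ids(group_fasta), output_name)` once per group fasta, with a different output name each time, would give one quality file per group without anything being overwritten.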

pschloss · May 15, 2013, 12:14pm

You could also post your raw data and your workflow as an Example Analysis and then point readers to the wiki.

Fabian · May 15, 2013, 6:00pm

Works! I was happy to see that trim.seqs outputs a trim.qual. I had totally forgotten that, so my .qual files are nicely trimmed. I cannot provide the raw data because some of the tags in that lane belong to other people’s projects. I could try to make a fake raw data file by first extracting my sequences with trim.seqs and get.groups, and then using list.seqs on my sequences and get.seqs on the raw data. Does this make sense?

Thanks,
Fabian

pschloss · May 15, 2013, 7:04pm

Alternatively, if you make an oligos file with the barcodes labelled with the group “ignore”, those samples will go away. Without the metadata, the sequences would be pretty worthless to anyone else.
