trim.seqs with existing groups file

Hello,

Thanks for all of the great work on mothur. Some of our data needs to be manually massaged before going into mothur, so I’m creating my own groups file. The problem I’ve run into is that there does not seem to be a way to update my groups file from within mothur when I run trim.seqs. As a workaround, I’m implementing my own trimming by length & homopolymer (and thereby skipping trim.seqs altogether), but it would be very cool if trim.seqs could take a groups file as a parameter and update it on the fly, creating a new *.trim.groups file. It seems the only time trim.seqs creates a groups file is when an oligos file is given as a parameter. Like others, I already have my reads parsed into individual sample files before going into a mothur pipeline.

Thanks,
Chris

Thanks for the suggestion, we will be adding this to mothur 1.15.0. In the meantime, you could use list.seqs to get a list of the sequences in the .trim.fasta file mothur creates and then use get.seqs to select those sequences from your groupfile.
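For readers who want to see what that list.seqs / get.seqs workaround amounts to, here is a minimal Python sketch of the same idea done outside mothur. It is not mothur's implementation. It assumes the group file has two whitespace-separated columns (sequence name, then group name) and that a sequence's name is the text after ">" up to the first whitespace in the fasta header. The function names and the sample data are made up for illustration.

```python
def read_fasta_ids(fasta_lines):
    """Collect sequence names from fasta header lines.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def filter_group_lines(group_lines, keep_ids):
    """Keep only the group-file lines whose sequence name is in keep_ids.

    Assumption: each line is '<sequence name> <group name>'.
    """
    kept = []
    for line in group_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] in keep_ids:
            kept.append(f"{fields[0]}\t{fields[1]}")
    return kept


# Hypothetical data: seq2 was dropped by trimming, so it is absent from the fasta.
fasta = [">seq1", "ACGT", ">seq3 some description", "GGCC"]
groups = ["seq1\tsampleA", "seq2\tsampleA", "seq3\tsampleB"]

filtered = filter_group_lines(groups, read_fasta_ids(fasta))
for line in filtered:
    print(line)
```

With real files, the two lists would be replaced by open file handles for the .trim.fasta file and the original group file, and the filtered lines written to a new group file.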

Was this feature added? I don’t see it available when I run trim.seqs with either group or groups as a parameter:

mothur > trim.seqs(fasta=hmp_it_ct_vt.fasta, maxambig=0, maxhomop=6, processors=8, group=hmp_it_ct_vt.group)

group is not a valid parameter.
The valid parameters are: fasta, oligos, qfile, name, flip, maxambig, maxhomop, minlength, maxlength, pdiffs, bdiffs, tdiffs, processors, allfiles, qtrim, qthreshold, qaverage, rollaverage, qwindowaverage, qstepsize, qwindowsize, keepfirst, removelast, inputdir, and outputdir.
[ERROR]: did not complete trim.seqs.

Thanks,
Chris

It was available in versions 1.15 and 1.16. We have since removed it from the trim.seqs command because we now have other commands to parse by group.

http://www.mothur.org/wiki/Split.groups
http://www.mothur.org/wiki/Get.groups
http://www.mothur.org/wiki/Remove.groups

I hope this helps,
Sarah

Hi Sarah,

Unless I’m missing something, this still doesn’t replicate the functionality I was talking about. Not that it has to, but just saying…

In any case, assume that I have existing fasta and groups files and I want to remove reads exceeding some homopolymer threshold. If I run trim.seqs, I get the trimmed fasta file fine (trim.fasta), but the groups file remains unchanged, which I assume will cause problems downstream. Or maybe it won’t matter to have more entries in the groups file than in the fasta file. I guess I’ve never tried it. Can you weigh in on that?
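To make the scenario concrete, here is a rough Python sketch of a homopolymer screen that updates the sequences and the group assignments together, so the two never get out of sync. It is only an illustration of the idea, not how mothur does it. It assumes a read is dropped when its longest homopolymer run is longer than the threshold, and the names `longest_homopolymer` and `screen_by_homopolymer` and the sample data are hypothetical.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Return the length of the longest run of one repeated base in seq."""
    return max((sum(1 for _ in run) for _, run in groupby(seq.upper())), default=0)


def screen_by_homopolymer(records, groups, maxhomop):
    """Drop reads whose longest homopolymer exceeds maxhomop.

    records: dict mapping sequence name -> sequence
    groups:  dict mapping sequence name -> group name
    Returns the kept records and the matching subset of group assignments.
    """
    kept_records = {
        name: seq
        for name, seq in records.items()
        if longest_homopolymer(seq) <= maxhomop
    }
    kept_groups = {name: grp for name, grp in groups.items() if name in kept_records}
    return kept_records, kept_groups


# Hypothetical data: seq2 contains a run of seven A's.
records = {"seq1": "ACGTACGT", "seq2": "ACAAAAAAAGT", "seq3": "GGGCCCTT"}
groups = {"seq1": "sampleA", "seq2": "sampleA", "seq3": "sampleB"}

kept_records, kept_groups = screen_by_homopolymer(records, groups, maxhomop=6)
print(sorted(kept_records))
print(sorted(kept_groups.items()))
```

Filtering both dictionaries from the same set of kept names is the point of the sketch: the group assignments can never list a read that the fasta no longer contains.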

Thanks,
Chris

You can remove the trimmed sequences from your group file with the following:

trim.seqs(fasta=yourFastaFile, other parameters…)
list.seqs(fasta=current)
get.seqs(accnos=current, group=yourGroupFile)

If you want to have an individual fasta file for each group, then run the following as well:

split.groups(fasta=current, group=current)

I hope this helps,
Sarah

Thanks. It also occurred to me that screen.seqs would take care of this, as long as the group and name files are passed to it after alignment.

Chris

Hi

For me, screen.seqs didn’t help, but list.seqs(name=current) and get.seqs(group=current) did the job.