Thanks for all of the great work on mothur. Some of our data needs to be manually massaged before going into mothur, so I’m creating my own groups file. The problem I’ve run into is that there does not seem to be a way to update my groups file from within mothur when I run trim.seqs. As a workaround, I’m implementing my own trimming by length & homopolymer (and thereby skipping trim.seqs altogether), but it would be very cool if trim.seqs could take a groups file as a parameter and update it on the fly, creating a new *.trim.groups file. It seems that the only time trim.seqs can create a groups file is when an oligos file is given as a parameter. Like others, I already have my reads parsed into individual sample files before going into a mothur pipeline.
Thanks for the suggestion, we will be adding this to mothur 1.15.0. In the meantime, you could use list.seqs to get a list of the sequences in the .trim.fasta file mothur creates and then use get.seqs to select those sequences from your groupfile.
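For anyone who wants to do the same thing outside mothur, here is a minimal Python sketch of the idea behind the list.seqs / get.seqs workaround: collect the sequence names that survived trimming, then keep only the matching lines of the group file. It assumes the group file has one tab-separated "sequence name, group name" pair per line and that the sequence name is the first token of each fasta header. The function names are made up for illustration and are not part of mothur.

```python
def read_fasta_names(fasta_lines):
    """Return the set of sequence names found on fasta header lines."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the name is the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def filter_group_lines(group_lines, keep_names):
    """Keep only group-file lines whose sequence name is in keep_names."""
    kept = []
    for line in group_lines:
        # Assumption: each line is "sequence_name<TAB>group_name".
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0] in keep_names:
            kept.append(line if line.endswith("\n") else line + "\n")
    return kept
```

To use it on real files, pass open file handles (for example `read_fasta_names(open("reads.trim.fasta"))`, where the file name is a placeholder) and write the returned lines to a new group file.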
Was this feature added? I don’t see it available when I run trim.seqs with either group or groups as a parameter:
mothur > trim.seqs(fasta=hmp_it_ct_vt.fasta, maxambig=0, maxhomop=6, processors=8, group=hmp_it_ct_vt.group)
group is not a valid parameter.
The valid parameters are: fasta, oligos, qfile, name, flip, maxambig, maxhomop, minlength, maxlength, pdiffs, bdiffs, tdiffs, processors, allfiles, qtrim, qthreshold, qaverage, rollaverage, qwindowaverage, qstepsize, qwindowsize, keepfirst, removelast, inputdir, and outputdir.
[ERROR]: did not complete trim.seqs.
It was available in versions 1.15 and 1.16. We have since removed it from the trim.seqs command because we now have other commands to parse by group.
I hope this helps,
Unless I’m missing something, this still doesn’t replicate the functionality I was talking about. Not that it has to, but just saying…
In any case, assume that I have existing fasta and groups files and I want to remove reads exceeding some homopolymer threshold. If I run trim.seqs, I get the trimmed fasta file fine (trim.fasta), but the groups file remains unchanged, which I assume will cause problems downstream. Or maybe it won’t matter if the groups file has more entries than the fasta file. I guess I’ve never tried it. Can you weigh in on that?
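To make the scenario concrete, here is a rough Python sketch of screening reads by homopolymer length and minimum length while keeping the group assignments in step. It is not how trim.seqs is implemented. Treating the threshold as the longest run still allowed is an assumption about how maxhomop should be read, and the function and parameter names are only borrowed for illustration.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base in seq (0 if empty)."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)


def screen_reads(records, groups, maxhomop=6, minlength=0):
    """Drop reads that fail the filters and drop their group entries too.

    records: list of (name, sequence) tuples
    groups:  dict mapping sequence name -> group name
    Assumption: a read is kept when its longest run is <= maxhomop.
    """
    kept_records = [
        (name, seq)
        for name, seq in records
        if len(seq) >= minlength and longest_homopolymer(seq) <= maxhomop
    ]
    # Rebuild the group mapping from the surviving reads only.
    kept_groups = {name: groups[name] for name, _ in kept_records if name in groups}
    return kept_records, kept_groups
```

Because the group mapping is rebuilt from the surviving reads, the two outputs cannot drift apart, which is the behavior being asked for here.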
You can remove the trimmed sequences from your group file with the following:
trim.seqs(fasta=yourFastaFile, other parameters…)
If you want to have an individual fasta file for each group, then run the following as well:
I hope this helps,
Thanks. It also occurred to me that screen.seqs would take care of this, as long as group and name are passed after alignment.
For me, screen.seqs didn’t help, but list.seqs(name=current) and get.seqs(group=current) did the job.