How do I add group info to .count_table

In v1.47, unique.seqs no longer outputs a .names file by default; it outputs a count_table instead. I can’t find a way to add group information to the .count_table. Both commands:
mothur > count.seqs(count=BacPop10.chop.count_table, group=BacPop10.groups)
and
mothur > count.seqs(count=BacPop10.chop.count_table, group=BacPop10.groups, compress=f)
create an identical table without a group column.
Both output files, BacPop10.chop.sparse.count_table and BacPop10.chop.full.count_table, look identical to me and consist of 2 columns, the first with the sequence names and the second being the total count.
In the above examples, the group file was created with make.group.
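A quick way to confirm whether a count_table carries group columns is to look at its header. The sketch below is only an illustration, not part of mothur: the function name `count_table_groups` is hypothetical, and it assumes the header is the first non-comment line, that its first two columns are the sequence name and the total, and that any further columns are group names. Lines starting with `#` are skipped on the assumption that they are comment lines.

```python
import io

def count_table_groups(handle):
    """Return the group names listed in a count_table header (hypothetical helper).

    Assumes a tab-separated header whose first two fields are the sequence
    name and the total; any remaining fields are treated as group names.
    """
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and (assumed) '#' comment lines before the header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Drop empty fields left by a trailing tab.
        return [f for f in fields[2:] if f]
    return []

# Made-up example tables, not real mothur output.
no_groups = "Representative_Sequence\ttotal\nseq1\t5\nseq2\t1\n"
with_groups = "Representative_Sequence\ttotal\tA\tB\nseq1\t5\t3\t2\n"

print(count_table_groups(io.StringIO(no_groups)))
print(count_table_groups(io.StringIO(with_groups)))
```

An empty list corresponds to the two-column table described above, with no group information.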

thanks for your help,

Giovanni Widmer

You might need to post the group file so we can see if there is any problem.

You can indicate the format you would like outputted. Try this:

mothur > unique.seqs(fasta=yourFastaFile, output=name) - deconvolute fasta file creating a fasta and name file

Thanks for the reply. Do you mean output=name or format=name? I don’t see “output” as an option for unique.seqs

Long time lurker first time poster.

I’m running into a similar issue. I’m analyzing some PacBio full-length data and the .count_table does not seem to be populating with group information. I do have group information until the initial formation of the count_table, then it seems to go sideways. So, considering I’m able to form a group table, I’m presuming the information is there, right? We’re also interested in holding onto group information throughout so we can determine where reads are being lost.

I tried Sarah’s suggestion above with this to try to resolve it early in the process:
mothur > fastq.info(file=fixrun.files, checkorient=T, pacbio=T, qfile=F)
mothur > count.groups(group=fixrun.group) - have group file populated
mothur > screen.seqs(fasta=fixrun.fasta, group=fixrun.group, maxambig=0, minlength=1478, maxlength=1559, maxhomop=8) - still looks good
mothur > count.groups(group=fixrun.good.group) - have groups populated
mothur > unique.seqs(fasta=fixrun.good.fasta, name=fixrun.good.group, output=count)
mothur > count.groups(count=fixrun.good.unique.count_table)
[ERROR]: Your count file does not have any group information, aborting.
[ERROR]: did not complete count.groups.

Any insight would be welcome. Everything else seems to work well, but I just can’t seem to find where the issue lies where group info is being disconnected from the rest of the data.

edit
I ended up figuring out a workaround in v1.46.1 with the following commands. It wouldn’t work in v1.47 because no name file is produced. I’d still be interested in suggestions moving forward if the names and group files are going to be removed from new versions.

fastq.info(file=fixrun.files, checkorient=T, pacbio=T, qfile=F)
summary.seqs(fasta=current)
count.groups(group=fixrun.group)
screen.seqs(fasta=current, maxambig=0, maxhomop=6, bdiffs=0, pdiffs=0, minlength=1478, maxlength=1559, maxhomop=8)
summary.seqs(fasta=current)
unique.seqs(fasta=current)
count.seqs(name=current,group=fixrun.group)
count.groups(count=current)

@gwidmer Both format and output work as parameters for unique.seqs.

@cjm We have decided to move away from the name / group file combo to the count file. The count file contains the same information as the name and group files, uses less memory and time to process commands, and by using 1 file instead of 2 we can reduce user file name mismatches. To this end, v1.47.0 includes several changes to command file outputs. From your commands I suspect you are creating the group file with the make.group command. The make.group command currently outputs a group file. I will change this in our next release to output a count file, with the option to choose the format of count or group. You can follow the progress here: Add format option to make.group · Issue #815 · mothur/mothur · GitHub.
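To illustrate how a count file can hold the same information as a name file plus a group file, here is a minimal Python sketch. It is not mothur's implementation; within mothur the equivalent step is count.seqs(name=..., group=...), as used in the workaround above. The function name `build_count_table` is hypothetical, and the sketch assumes a name file with lines of the form "representative, then a comma-separated list of its members" and a group file with lines of the form "sequence, then group", both whitespace-separated.

```python
from collections import Counter

def build_count_table(names_lines, group_lines):
    """Combine name-file and group-file lines into count-table rows (illustrative only)."""
    # Map every sequence name to its group (assumed format: "seq<TAB>group").
    seq_to_group = {}
    for line in group_lines:
        parts = line.split()
        if len(parts) >= 2:
            seq_to_group[parts[0]] = parts[1]

    groups = sorted(set(seq_to_group.values()))
    rows = []
    for line in names_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        # Assumed format: "representative<TAB>member1,member2,..."
        rep, members = parts[0], parts[1].split(",")
        # Raises KeyError if a member is missing from the group file.
        per_group = Counter(seq_to_group[m] for m in members)
        rows.append([rep, len(members)] + [per_group.get(g, 0) for g in groups])

    header = ["Representative_Sequence", "total"] + groups
    return header, rows

# Made-up example data.
names = ["s1\ts1,s2,s3", "s4\ts4"]
group = ["s1\tA", "s2\tB", "s3\tA", "s4\tB"]

header, rows = build_count_table(names, group)
print("\t".join(header))
for row in rows:
    print("\t".join(str(v) for v in row))
```

Each row holds the representative sequence, its total abundance, and one count per group, which is why a single file can replace the two.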

@wescott Thanks for this. The workaround I described was because there doesn’t seem to be a way of populating sample info into a count_table from fastq.info(), which seems like the best way of reading PacBio data into mothur. Swapping fastq.info out for the make.contigs function and going through most of the MiSeq SOP didn’t retain sample info and just binned everything into one sample by the time it got to creating the count_table. The group file comes from fastq.info, which I used to make the count_table and proceed. Maybe I’m missing something. I’m prone to overlook the obvious (like removing the bcs and primers in the code I initially described above).

Ahh, I see. I will add an update for the fastq.info command as well to output a count file instead of a group file by default. Change output from group to count in fastq.info · Issue #816 · mothur/mothur · GitHub
