filenames, not what they should be? and a group problem

Laura_Kelly · October 6, 2016, 3:43pm

Hi All
An observation and some related questions about my use of the 454 SOP with mothur v1.37.

Firstly, when I run remove.seqs the names file I get ends in precluster.pick.names. Two steps later, the remove.lineage name output has an identical file name. I gather from the online SOP info (and it seems very logical) that the latter should end in precluster.pick.pick.names, but it is missing a ‘pick’. I don’t know why this is and yes, I’ve tried it a few times. Anyone else find this?

Regarding groups, I have a specific question regarding the final.groups file mentioned in the 454 SOP. I realise in that instance that final.groups was ultimately derived (renamed) from the pick.pick.pick.groups file, but my question is at what step was that file generated please? I haven’t used a mock community in my sequencing so have obviously omitted that step of the 454 SOP. The last groups file generated therefore was a result of the remove.seqs command, ending with good.pick.groups. When I use this however in the make.shared command, mothur doesn’t like it; it appears to be missing a lot of sequences compared to the list file.

Have I gone wrong somewhere? Thanks in advance for any suggestions

pschloss · October 12, 2016, 12:34pm

>Firstly, when I run remove.seqs the names file I get ends in precluster.pick.names. Two steps later, the remove.lineage name output has an identical file name. I gather from the online SOP info (and it seems very logical) that the latter should end in precluster.pick.pick.names, but it is missing a ‘pick’. I don’t know why this is and yes, I’ve tried it a few times. Anyone else find this?

Can you post the commands you are entering with the files being generated?

>Regarding groups, I have a specific question regarding the final.groups file mentioned in the 454 SOP. I realise in that instance that final.groups was ultimately derived (renamed) from the pick.pick.pick.groups file, but my question is at what step was that file generated please? I haven’t used a mock community in my sequencing so have obviously omitted that step of the 454 SOP. The last groups file generated therefore was a result of the remove.seqs command, ending with good.pick.groups. When I use this however in the make.shared command, mothur doesn’t like it; it appears to be missing a lot of sequences compared to the list file.

In your case, it probably would have been the file generated after remove.lineage.

Pat

Laura_Kelly · October 17, 2016, 11:43am

Hi Pat, thanks for your reply. Here are the commands and the resulting files, beginning from unique.seqs. You’ll note that I don’t get any new group file generated with remove.lineage either, hence my confusion.

unique.seqs(fasta=Kelly3832B.shhh.trim.unique.good.filter.fasta, name=Kelly3832B.shhh.trim.unique.good.names)
Outputs: Kelly3832B.shhh.trim.unique.good.filter.names
Kelly3832B.shhh.trim.unique.good.filter.unique.fasta

precluster(fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.fasta, name=Kelly3832B.shhh.trim.unique.good.filter.names, group=Kelly3832B.shhh.good.groups, diffs=2)
Outputs: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.unique.names
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.unique.fasta
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.names
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.fasta
& for each sample Kelly3832B.shhh.trim.unique.good.filter.precluster.sample.map

chimera.uchime(fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.fasta, name=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.names, group=Kelly3832B.shhh.good.groups)
Outputs: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.denovo.uchime.chimera
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.denovo.uchime.accnos

remove.seqs(accnos=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.denovo.uchime.accnos, fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.fasta, name=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.names, group=Kelly3832B.shhh.good.groups, dups=T)
Outputs: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.names
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.fasta
Kelly3832B.shhh.good.pick.groups

classify.seqs(fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.fasta, name=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.names, group=Kelly3832B.shhh.good.pick.groups, template=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)
Outputs: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.tax.summary

remove.lineage(fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.fasta, name=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.names, taxonomy=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, taxon=Mitochondria-Chloroplast-Archaea-Eukaryota-unknown)
Outputs: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.names
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.fasta

dist.seqs(fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.fasta, cutoff=0.15)
Output: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.dist

cluster(column=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.dist, name=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.names)
[Note: I realise the names file should end in pick.pick.names; however, because the previous names file had only one ‘pick’ and pick.pick.names does not exist, I used the single-pick file here]
Outputs: Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.an.sabund
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.an.rabund
Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pick.an.list

I can’t progress further, as the groups files generated to date are not appropriate to use with the list file…
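
One way to see exactly where a names file and a groups file disagree is to compare the sequence names they contain. The sketch below is only an illustration, not part of mothur: it assumes the usual two-column layouts (names file: representative, then a comma-separated list of the sequences it represents; groups file: sequence name, then group), and the function names and the seqA/seqB example data are made up.

```python
def read_names(lines):
    """Collect every sequence name listed in a names file.

    Assumed layout per line: representative <whitespace> name1,name2,...
    """
    names = set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        names.update(parts[1].split(","))
    return names


def read_groups(lines):
    """Collect every sequence name listed in a groups file.

    Assumed layout per line: sequence_name <whitespace> group
    """
    names = set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        names.add(parts[0])
    return names


def compare(names_lines, group_lines):
    """Return (names missing from the groups file, names missing from the names file)."""
    in_names = read_names(names_lines)
    in_groups = read_groups(group_lines)
    return sorted(in_names - in_groups), sorted(in_groups - in_names)


# Hypothetical example data, not taken from the run above.
example_names = ["seqA\tseqA,seqB", "seqC\tseqC"]
example_groups = ["seqA\ts1", "seqB\ts1", "seqD\ts2"]
missing_from_groups, missing_from_names = compare(example_names, example_groups)
print(missing_from_groups, missing_from_names)
```

For real files, open file handles can be passed in place of the lists, since iterating over a handle yields lines. If both returned lists are empty, the two files cover the same set of sequences.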

pschloss · October 17, 2016, 12:59pm

>remove.lineage(fasta=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.fasta, name=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.names, taxonomy=Kelly3832B.shhh.trim.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, taxon=Mitochondria-Chloroplast-Archaea-Eukaryota-unknown)

You didn’t include your group file in remove.lineage.

Also, you're running chimera.uchime the way we describe it for 454 data. We've shifted our thinking and now think it's probably wisest to do what we do with the MiSeq data:

mothur > chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)
mothur > remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.accnos)
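
On the file-name question from the first post: in the outputs listed above, each remove.seqs or remove.lineage output looks like its input with "pick" inserted before the final extension. The sketch below only mimics that pattern (it is an assumption drawn from the file names in this thread, not mothur's code, and pick_name is a made-up helper). It suggests why the names file never gained a second "pick": the remove.lineage call was given ...precluster.names rather than ...precluster.pick.names. It also shows what the groups output would presumably be called once good.pick.groups is supplied.

```python
import os


def pick_name(path):
    """Insert 'pick' before the last extension, mimicking the naming seen in this thread."""
    root, ext = os.path.splitext(path)
    return f"{root}.pick{ext}"


prefix = "Kelly3832B.shhh.trim.unique.good.filter.unique.precluster"

# name= as actually passed to remove.lineage: same result as the remove.seqs output
print(pick_name(prefix + ".names"))

# name= taken from the remove.seqs output instead: gains the second 'pick'
print(pick_name(prefix + ".pick.names"))

# group= from remove.seqs, if supplied to remove.lineage
print(pick_name("Kelly3832B.shhh.good.pick.groups"))
```

If that reading is right, passing the pick.names and good.pick.groups files from remove.seqs into remove.lineage should give names, fasta and groups outputs that all end in pick.pick.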

Laura_Kelly · October 17, 2016, 2:18pm

Fab, thanks Pat, don’t know how I missed the ‘groups’ in remove.lineage. That should work now. As for chimera.uchime, I gather you mean running it with a fasta and a count file instead of the 454 way, generating the count file by running count.seqs with the current name and group files? Is there a particular reason why this would be better? Out of curiosity I’ll run both later and see what difference it makes.

Thanks again for your keen eye.

Best,
Laura

pschloss · November 1, 2016, 5:40pm

The group/name and count table approaches should give the same or very similar results. Using a count table should use less memory and make it easier to keep track of things.

Pat
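
To make the "easier to keep track of things" point concrete: a count table holds, for each representative sequence, how many reads it stands for in each group, which is the information otherwise split across the names and groups files. The sketch below is a simplified illustration of that idea only. It is not how count.seqs is implemented and does not write mothur's count_table format; the function name and example data are made up.

```python
from collections import Counter


def count_by_group(names_lines, group_lines):
    """Tally, per representative sequence, the number of reads in each group.

    Assumed layouts: names file = representative <whitespace> name1,name2,...
                     groups file = sequence_name <whitespace> group
    """
    group_of = {}
    for line in group_lines:
        parts = line.split()
        if len(parts) >= 2:
            group_of[parts[0]] = parts[1]

    counts = {}
    for line in names_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        rep, members = parts[0], parts[1].split(",")
        # A KeyError here means a name is missing from the groups file,
        # i.e. the two files are out of sync.
        counts[rep] = Counter(group_of[m] for m in members)
    return counts


# Hypothetical example data.
example_names = ["seqA\tseqA,seqB,seqC", "seqD\tseqD"]
example_groups = ["seqA\ts1", "seqB\ts1", "seqC\ts2", "seqD\ts2"]
for rep, per_group in count_by_group(example_names, example_groups).items():
    print(rep, sum(per_group.values()), dict(per_group))
```

Because the per-group tallies travel with each representative in one file, there is only one file to carry through each later step instead of a names file and a groups file that can drift apart.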
