Dashes in filenames?

charliep · September 17, 2013, 8:22am

We’re encountering a problem with batch files that use - in file names: some mothur commands seem to handle the dash and others can’t. The output below shows align.seqs() chopping Field-Two.trim.contigs.good.unique.fasta into Field and Two.trim.contigs.good.unique.fasta, while summary.seqs() reads it fine. Should we just never use - in file names? Should we fix the code? thanks, charlie

…
Using 6 processors.

             Start  End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1      151      151      0       3        1
2.5%-tile:   1      252      252      0       4        326065
25%-tile:    1      253      253      0       4        3260644
Median:      1      253      253      0       5        6521287
75%-tile:    1      253      253      0       5        9781930
97.5%-tile:  1      254      254      0       8        12716508
Maximum:     1      275      275      0       80       13042572
Mean:        1      253.029  253.029  0       4.78814

# of unique seqs: 4912829

total # of seqs: 13042572

Output File Names:
Field-Two.trim.contigs.good.unique.summary

mothur > system(/bin/echo "starting align.seqs date" >>/dev/stderr)

mothur > align.seqs(fasta=current, reference=silva.v4.fasta)
Using Field-Two.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Unable to open Field. It will be disregarded.
Unable to open Two.trim.contigs.good.unique.fasta. It will be disregarded.
no valid files.

Using 6 processors.
[ERROR]: did not complete align.seqs.

mothur > system(/bin/echo "starting summary.seqs date" >>/dev/stderr)

mothur > summary.seqs(fasta=current, count=current)
Using Field-Two.trim.contigs.good.count_table as input file for the count parameter.
Using Field-Two.trim.contigs.good.unique.fasta as input file for the fasta parameter.
…

westcott · September 17, 2013, 12:30pm

Some of mothur’s commands allow you to enter multiple files separated by dashes. The align.seqs command is one of them: align.seqs(fasta=final.fasta-final2.fasta, reference=silva.bacteria.fasta). To tell mothur that a dash is part of the filename, escape it with a backslash: align.seqs(fasta=Field\-Two.trim.contigs.good.unique.fasta, reference=silva.bacteria.fasta).
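
To make the splitting behaviour concrete, here is a minimal Python sketch of how a dash-separated parameter value could be split while honouring escaped dashes. It is only an illustration, not mothur's actual code: the function names `split_at_dash` and `escape_dashes` are hypothetical, and it assumes the escape character is a backslash placed directly before the dash.

```python
def split_at_dash(value, escape="\\"):
    """Split a parameter value on '-' unless the dash is escaped.

    Illustrative only: assumes a backslash immediately before a dash
    marks that dash as part of the filename.
    """
    parts = []
    current = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == escape and i + 1 < len(value) and value[i + 1] == "-":
            # Escaped dash: keep a literal '-' and skip the escape character.
            current.append("-")
            i += 2
        elif ch == "-":
            # Unescaped dash: end of one filename, start of the next.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def escape_dashes(filename):
    """Prefix every dash in a single filename with a backslash."""
    return filename.replace("-", "\\-")


if __name__ == "__main__":
    name = "Field-Two.trim.contigs.good.unique.fasta"
    print(split_at_dash("final.fasta-final2.fasta"))  # two separate files
    print(split_at_dash(name))                        # split at the dash
    print(split_at_dash(escape_dashes(name)))         # kept as one filename
```

The second call reproduces the Field / Two.trim.contigs.good.unique.fasta split from the log above, and the third shows the same name surviving as a single file once the dash is escaped.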

Ghazal · July 23, 2014, 2:17pm

Hi
I would like to use the “venn” command, but my sample names contain dashes (e.g. D1298-1, D1298-2, D1298-3, …), so I cannot run it. I tried putting the names in double quotes to make each one a single word, but the software still separates them at the dash, reads them as “D1298” and “1”, and treats them as invalid names. What should I do? Is there a trick to get past this step, or do I have to change my sample names and start from scratch? (I don’t want to use “permute=T”.)
Thanks in advance

pschloss · July 24, 2014, 3:37pm

You might try using merge.groups where the first column is the group name with the dashes and the second column is the group name without the dashes. Alternatively, you could always open the shared file in a text editor and replace all of the dashes with periods (or something else).
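
If you would rather script the text-editor approach, the following Python sketch replaces the dashes in the group column only, leaving the counts untouched. It assumes the shared file is tab-delimited with a header row containing a column named `Group`; the function name `rename_groups` and the sample counts in the demo are made up for illustration.

```python
def rename_groups(shared_text, old="-", new="."):
    """Replace `old` with `new` in the group column of a shared file.

    Assumes tab-delimited text whose header row has a 'Group' column.
    Returns the rewritten text and a dict of {original name: new name}.
    """
    lines = shared_text.splitlines()
    header = lines[0].split("\t")
    group_col = header.index("Group")

    out = [lines[0]]
    mapping = {}   # original name -> renamed name
    seen = {}      # renamed name -> original name, to catch collisions
    for line in lines[1:]:
        if not line.strip():
            continue
        fields = line.split("\t")
        original = fields[group_col]
        renamed = original.replace(old, new)
        # Refuse to merge two different groups into one name by accident.
        if seen.setdefault(renamed, original) != original:
            raise ValueError(f"{original!r} and {seen[renamed]!r} both map to {renamed!r}")
        mapping[original] = renamed
        fields[group_col] = renamed
        out.append("\t".join(fields))
    return "\n".join(out) + "\n", mapping


if __name__ == "__main__":
    # Tiny made-up example, not real data.
    demo = (
        "label\tGroup\tnumOtus\tOtu001\tOtu002\n"
        "0.03\tD1298-1\t2\t5\t3\n"
        "0.03\tD1298-2\t2\t1\t7\n"
    )
    new_text, mapping = rename_groups(demo)
    print(new_text, end="")
    # The same mapping can be written out as the two-column file
    # (name with dashes, name without) described for merge.groups.
    for original, renamed in mapping.items():
        print(f"{original}\t{renamed}")
```

Work on a copy of the shared file, and check the printed mapping before running any downstream commands on the renamed version.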

Ghazal · July 30, 2014, 1:03pm

Thanks a lot!
It was really helpful!
