Allow special characters in file names OR quoting thereof


I’m using mothur in batch mode, and just found out that some commands (e.g. align.seqs) don’t like dashes in the file name. For example:

mothur > align.seqs(fasta=Project123-456.trim.contigs.good.unique.fasta, reference=/Sequences/Silva/silva.bacteria/silva.bacteria.fasta)

crashes with

Unable to open Project123. Trying default /illumina/Projects/Project123

I can’t use " or ' to quote the entire filename, but I found out that I could escape the dash with a backslash (\-). However, other commands/arguments (e.g. summary.seqs) won’t accept escaped dashes: they use the raw string and complain that there is no “Project123\-456.trim.contigs.good.unique.fasta” file.

I couldn’t find any documentation on this - could you either tell me which special characters and which commands/arguments need escapes for special characters or could you introduce a generic scheme for quoting command arguments (then I could make sure that all filenames are always quoted)? I process some data where I would like to name the mothur files like the other project files (which sometimes contain dashes). Right now I have to find out which commands/arguments need the escapes by trial and error, which is rather time consuming… Or am I missing something obvious?
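
Until there is a documented quoting scheme, one possible workaround is to give mothur dash-free names and leave the original files untouched. The following is only a sketch: `dashless_name` and `link_dashless` are hypothetical helper names, not part of mothur, and replacing dashes with underscores is an arbitrary choice. It assumes a bash shell and a file system that supports symbolic links.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print a file name with every dash replaced by an underscore.
dashless_name() {
    local name="$1"
    printf '%s\n' "${name//-/_}"
}

# Hypothetical helper: for each file given, create a dash-free symlink
# next to it that points back at the original file.
link_dashless() {
    local src dir base target
    for src in "$@"; do
        dir=$(dirname -- "$src")
        base=$(basename -- "$src")
        target="$dir/$(dashless_name "$base")"
        # Nothing to do when the name contains no dash.
        [ "$target" = "$dir/$base" ] && continue
        # The link sits in the same directory, so a relative target is enough.
        ln -s -- "$base" "$target"
    done
}
```

Calling `link_dashless Project123-456.trim.contigs.good.unique.fasta` would create `Project123_456.trim.contigs.good.unique.fasta` as a symlink, which can then be passed to mothur without escaping. Output files would presumably be named after the dash-free link, so they may need renaming afterwards.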


I encountered the same problem when using SSUsearch.

SSUsearch’s pipeline sends commands to mothur in the following manner

mothur "#classify.seqs(fasta=$Tag.ssu.out/$Tag.qc.$Gene.align.filter.fa, template=$Gene_db_cc, taxonomy=$Gene_tax_cc, cutoff=50, processors=$Cpu)"

However, my $Tag bash variable contains a dash. Does this mean we should escape all special characters, or only the dash character?

Guo, J., Cole, J. R., Zhang, Q., Brown, C. T., & Tiedje, J. M. (2016). Microbial Community Analysis with Ribosomal Gene Fragments from Shotgun Metagenomes. Applied and Environmental Microbiology, 82(1), 157–166.

Only the ones with dashes in them.
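
For the SSUsearch case, that could mean escaping the dashes in $Tag before the variable is interpolated into the mothur command string. Below is a minimal bash sketch: `escape_dashes` and `Tag_escaped` are hypothetical names, the $Tag value is an example, and the file name in the command is shortened for illustration. As noted earlier in the thread, not every mothur command accepts escaped dashes, so this is not guaranteed to work for every command.

```shell
#!/usr/bin/env bash

# Hypothetical helper: put a backslash in front of every dash in a string.
escape_dashes() {
    local s="$1"
    printf '%s\n' "${s//-/\\-}"
}

Tag='Project123-456'                 # example value containing a dash
Tag_escaped=$(escape_dashes "$Tag")  # Project123\-456

# Print the command instead of running it, so the result can be inspected first.
printf '%s\n' "mothur \"#summary.seqs(fasta=${Tag_escaped}.ssu.out/${Tag_escaped}.qc.fa)\""
```

Only the string handed to mothur is changed; the files and directories on disk keep their original names.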