Dashes in filenames?

We’re encountering a problem with batch files that use - in file names: some mothur commands seem to handle dashes and others don’t. The output below shows align.seqs() splitting Field-Two.trim.contigs.good.unique.fasta into Field and Two.trim.contigs.good.unique.fasta, while summary.seqs() reads the same file fine. Should we just never use - in file names? Should we fix the code? Thanks, Charlie


Using 6 processors.

             Start  End      NBases   Ambigs  Polymer  NumSeqs
Minimum:     1      151      151      0       3        1
2.5%-tile:   1      252      252      0       4        326065
25%-tile:    1      253      253      0       4        3260644
Median:      1      253      253      0       5        6521287
75%-tile:    1      253      253      0       5        9781930
97.5%-tile:  1      254      254      0       8        12716508
Maximum:     1      275      275      0       80       13042572
Mean:        1      253.029  253.029  0       4.78814

# of unique seqs: 4912829

total # of seqs: 13042572

Output File Names:
Field-Two.trim.contigs.good.unique.summary

mothur > system(/bin/echo "starting align.seqs date" >>/dev/stderr)

mothur > align.seqs(fasta=current, reference=silva.v4.fasta)
Using Field-Two.trim.contigs.good.unique.fasta as input file for the fasta parameter.
Unable to open Field. It will be disregarded.
Unable to open Two.trim.contigs.good.unique.fasta. It will be disregarded.
no valid files.

Using 6 processors.
[ERROR]: did not complete align.seqs.

mothur > system(/bin/echo "starting summary.seqs date" >>/dev/stderr)

mothur > summary.seqs(fasta=current, count=current)
Using Field-Two.trim.contigs.good.count_table as input file for the count parameter.
Using Field-Two.trim.contigs.good.unique.fasta as input file for the fasta parameter.

Some of mothur’s commands allow you to enter multiple files separated by dashes. The align.seqs command is one of them, for example align.seqs(fasta=final.fasta-final2.fasta, reference=silva.bacteria.fasta). That is why the dash in your file name is read as a separator between two files. You can escape the dashes in a filename with a backslash to tell mothur that the dash is part of the name: align.seqs(fasta=Field\-Two.trim.contigs.good.unique.fasta, reference=silva.bacteria.fasta).
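If the batch file is generated by a script, the escaping can be applied automatically. Below is a minimal Python sketch, assuming the escape is a backslash placed before each dash; the helper names escape_dashes and align_seqs_line are made up for illustration and are not part of mothur.

```python
def escape_dashes(filename: str) -> str:
    """Put a backslash before each dash so it is read as part of the name.

    Assumption: mothur treats a backslash-escaped dash as a literal dash.
    """
    return filename.replace("-", "\\-")


def align_seqs_line(fasta: str, reference: str) -> str:
    """Build one batch-file line for align.seqs with dashes escaped (hypothetical helper)."""
    return (
        f"align.seqs(fasta={escape_dashes(fasta)}, "
        f"reference={escape_dashes(reference)})"
    )


if __name__ == "__main__":
    print(align_seqs_line("Field-Two.trim.contigs.good.unique.fasta", "silva.v4.fasta"))
```

File names without dashes pass through unchanged, so the helper can be applied to every file name the script writes into the batch file.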

Hi
I would like to use the venn command, but my sample names contain dashes (e.g. D1298-1, D1298-2, D1298-3, …), so I cannot run it. I also tried wrapping the names in double quotes to make each one a single word, but mothur still separates them at the dash, reads them as “D1298” and “1”, and treats them as invalid names. What should I do? Is there a trick to get past this step, or do I have to change my sample names and start from scratch? (I don’t want to use “permute=T”.)
Thanks in advance

You might try using merge.groups with a design file where the first column is the group name with the dashes and the second column is the group name without the dashes. Alternatively, you could always open the shared file in a text editor and replace all of the dashes in the group names with periods (or something else).
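For a large shared file, both suggestions can be scripted rather than done by hand. The sketch below is only an illustration: it assumes a tab-delimited shared file with one header line and the group name in the second column, and the function names design_rows and rename_groups are hypothetical.

```python
def design_rows(group_names, replacement="."):
    """Two-column rows: original group name, then the dash-free name."""
    return [f"{g}\t{g.replace('-', replacement)}" for g in group_names]


def rename_groups(shared_lines, replacement="."):
    """Return shared-file lines with dashes in the group column replaced.

    Assumes tab-delimited lines, a single header line first, and the
    group name in the second column. Count columns are left untouched.
    """
    out = []
    for i, line in enumerate(shared_lines):
        fields = line.rstrip("\n").split("\t")
        if i > 0 and len(fields) > 1:  # skip the header line
            fields[1] = fields[1].replace("-", replacement)
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    shared = [
        "label\tGroup\tnumOtus\tOtu001\tOtu002",
        "0.03\tD1298-1\t2\t5\t3",
        "0.03\tD1298-2\t2\t0\t7",
    ]
    print("\n".join(rename_groups(shared)))
    print("\n".join(design_rows(["D1298-1", "D1298-2"])))
```

Editing only the group column avoids touching anything else in the file that happens to contain a dash. It is worth checking that the renamed groups stay unique, since two names that differ only by a dash versus a period would collide.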

Thanks a lot!
It was really helpful! :slight_smile: