Consistent naming of output files

Hello,

While working my way through the SOP, I noticed that the different commands
name their output files in quite different ways.

It would be great if output files were named consistently across all
commands, so that one can later trace which files belong together or were
used as input and output; there are a lot of them in the end. Consistent naming
would also avoid overwriting existing files that one might still want to use.

Here are some examples of the differences I found:

First, for the unique.seqs command, the input files are:
GQY1XT001.shhh.trim.fasta and
GQY1XT001.shhh.trim.names
and the output files are:
GQY1XT001.shhh.trim.unique.fasta and
GQY1XT001.shhh.trim.names

Here, the output names file is named after the input fasta file (this
becomes visible when unique.seqs is applied again later in the SOP).
Even worse, the output names file overwrites the input names file!

As another example, the pre.cluster command takes the input files
GQY1XT001.shhh.trim.unique.good.filter.unique.fasta and
GQY1XT001.shhh.trim.unique.good.filter.names
and creates the output files:
GQY1XT001.shhh.trim.unique.good.filter.unique.precluster.fasta and
GQY1XT001.shhh.trim.unique.good.filter.unique.precluster.names

So apparently the names file is again named after the input fasta file,
and both output files share the same base name, which helps to identify later which files belong together.

And for the command remove.seqs, the input files are
GQY1XT001.shhh.trim.unique.good.filter.unique.precluster.fasta,
GQY1XT001.shhh.trim.unique.good.filter.unique.precluster.names and
GQY1XT001.shhh.good.groups
and the output files are:
GQY1XT001.shhh.trim.unique.good.filter.unique.precluster.pick.fasta,
GQY1XT001.shhh.trim.unique.good.filter.unique.precluster.pick.names and
GQY1XT001.shhh.good.pick.groups

Here, .pick was added to each input name to create the names of the
output files. This makes sense because one can trace the changes from
one step to the next.
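To make the convention I am asking for concrete, here is a minimal Python sketch of the rule that remove.seqs seems to follow: insert a tag in front of the final extension of each input file. This is only an illustration and not how mothur is implemented; the function name tagged_output_name is made up for this post.

```python
def tagged_output_name(input_name, tag):
    """Insert `tag` before the last extension of `input_name`.

    Hypothetical helper that illustrates the naming rule only;
    it is not part of mothur.
    """
    base, dot, extension = input_name.rpartition(".")
    if not dot or not base:
        raise ValueError(f"{input_name!r} has no extension to tag")
    return f"{base}.{tag}.{extension}"


# The groups file from the remove.seqs example above
print(tagged_output_name("GQY1XT001.shhh.good.groups", "pick"))
```

With a rule like this, every output name is derived from its own input file, so no input file is ever overwritten and each step stays visible in the file name.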


Thanks for the nice tutorial and good explanations throughout the SOP!

Hi everyone,

It would sometimes be useful to be able to customize the output names. For example, if I extract the sequences of the group “ill” with get.groups, I would like the output file to be named *.ill.fasta automatically instead of *.pick.fasta.

Has anyone dealt with the same problem?

Cheers,
Fred

You can use the system command to rename files with the command-line commands for your operating system.

For mac/linux:

system(mv myfile.pick.fasta myfile.ill.fasta)
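If there are several .pick files to rename after each get.groups run, a small script outside mothur may be easier than one mv per file. The following is a rough Python sketch under the assumption that all files to rename sit in one directory and contain ".pick." in their names; the function name rename_pick_outputs is made up for this example and is not part of mothur.

```python
from pathlib import Path


def rename_pick_outputs(directory, group):
    """Rename every '*.pick.*' file in `directory` to '*.<group>.*'.

    Hypothetical helper, not part of mothur. Only the last '.pick.'
    in each file name is replaced, and existing files are never
    overwritten.
    """
    renamed = []
    for path in sorted(Path(directory).glob("*.pick.*")):
        stem, _, extension = path.name.rpartition(".pick.")
        target = path.with_name(f"{stem}.{group}.{extension}")
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling rename_pick_outputs(".", "ill") in the working directory would then turn myfile.pick.fasta into myfile.ill.fasta, and do the same for any matching names or groups files.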