shhh.flows() in a parallel for loop

Hello,

I’m constructing a little pipeline that combines bits of QIIME and Mothur, coming from trim.flows()…

source /macqiime/configs/bash_profile.txt

# Must be defined before the loop that calls it.
function nrwait() {
    # Block until the number of background jobs drops below the limit
    # (first argument, default 2).
    local nrwait_my_arg
    if [[ -z $1 ]]; then
        nrwait_my_arg=2
    else
        nrwait_my_arg=$1
    fi

    while [[ $(jobs -p | wc -l) -ge $nrwait_my_arg ]]; do
        sleep 0.33
    done
}

THEFILES=/Users/me/Documents/sffs/*.files
for f in $THEFILES
do
    mothur "#shhh.flows(file=$f)" &
    nrwait 8 # tune the limit later
done
wait

Basically, this little script (which is called from the main script) runs up to n shhh.flows() instances in parallel (8 here).
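For what it's worth, the throttling logic can be sanity-checked without mothur by substituting sleep jobs; nothing in this standalone sketch is mothur-specific:

```shell
#!/bin/bash
# Same throttle as nrwait above, exercised with sleep jobs instead of
# mothur so the behaviour is easy to verify.
nrwait() {
    local limit=${1:-2}
    while [[ $(jobs -p | wc -l) -ge $limit ]]; do
        sleep 0.33
    done
}

max_seen=0
for i in 1 2 3 4 5 6; do
    sleep 0.5 &
    running=$(jobs -p | wc -l)
    if (( running > max_seen )); then max_seen=$running; fi
    nrwait 2   # never allow more than 2 concurrent jobs
done
wait
echo "max concurrent jobs seen: $max_seen"
```

With a limit of 2, the job count right after a spawn never exceeds 2, which is exactly the behaviour the main script relies on with a limit of 8.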


Every time it finishes processing a file, I get:
Finalizing...
[ERROR]: Could not open /Users/me/Documents/sffs//Users/me/Documents/sffs/muhfile.shhh.fasta
[ERROR]: Could not open /Users/me/Documents/sffs//Users/me/Documents/sffs/muhfile.shhh.names
Total time to process /Users/me/Documents/sffs/muhfile.muhfile.flow: 5842 5837.61

Output File Names: 
/Users/me/Documents/sffs/muhfile.muhfile.shhh.qual
/Users/me/Documents/sffs/muhfile.muhfile.shhh.fasta
/Users/me/Documents/sffs/muhfile.muhfile.shhh.names
/Users/me/Documents/sffs/muhfile.muhfile.shhh.counts
/Users/me/Documents/sffs/muhfile.muhfile.shhh.groups
/Users/me/Documents/sffs//Users/me/Documents/sffs/muhfile.shhh.fasta
/Users/me/Documents/sffs//Users/me/Documents/sffs/muhfile.shhh.names

Is it safe to ignore these error messages? That is, are muhfile.muhfile.shhh.fasta and muhfile.muhfile.shhh.names the same files as the muhfile.shhh.fasta and muhfile.shhh.names that mothur is, for whatever reason, trying (and failing) to create at the clearly wrong doubled path?

Does it look like the correct files are getting generated? Also, if you use processors=8 you'll get what you're trying to do: mothur will distribute the groups to separate processors and run through them in parallel.
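For reference, that would be a single call per .files input rather than a shell loop; the filename here is just illustrative:

```shell
# Let mothur itself fan shhh.flows out across 8 processors, by group:
mothur "#shhh.flows(file=muhfile.files, processors=8)"
```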

/Users/me/Documents/sffs/muhfile.muhfile.shhh.qual
/Users/me/Documents/sffs/muhfile.muhfile.shhh.fasta
/Users/me/Documents/sffs/muhfile.muhfile.shhh.names
/Users/me/Documents/sffs/muhfile.muhfile.shhh.counts
/Users/me/Documents/sffs/muhfile.muhfile.shhh.groups

are generated successfully; however, muhfile.shhh.fasta and muhfile.shhh.names are not. I attempted processors=8 before, but that only led to sequential single-core processing of the files (possibly because I'm calling it from a script within a script, or because this version of mothur comes from the latest MacQIIME).

Edit: anyway, my parallel implementation isn't perfect (for example, I lose the logs), so perhaps I'll attempt this with e.g. GNU Parallel and see if that works better…
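A sketch of what the GNU Parallel route might look like, assuming GNU Parallel is installed; the --joblog and --results options keep per-job logs, which would address the lost-log problem (paths as in the script above, log locations hypothetical):

```shell
# Run up to 8 mothur instances at once; --results stores each job's
# stdout/stderr under shhh_logs/, --joblog records exit codes and timings.
ls /Users/me/Documents/sffs/*.files |
  parallel -j 8 --joblog shhh.joblog --results shhh_logs \
    'mothur "#shhh.flows(file={})"'
```

The inner quoting matters: the whole mothur command is passed as one quoted string so that the leading # isn't treated as a shell comment in the spawned job.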