EDIT: I have posted a simplified version of this question here: BASH local and flock

I have written a manager that launches background processes. The aim is to:

  • start the manager and launch only nb_min (for example 1) child processes (a Perl script in this example).
  • if a child exits with return code 0, step up to nb_max (for example 10) child processes
  • otherwise, fall back to only nb_min (1) processes

So there are always between nb_min and nb_max child processes running simultaneously.

Now I would like to assign a slot id to each launch, to make sure that only one call runs with a given id_slot at any time. I don't see how to manage this!

Here is the manager.sh code:

#!/bin/bash

if [ "$USER" != "root" ]
then
        echo "error"
        echo "this script must be run as root"
        echo "sudo $0 nb_jobs_start nb_jobs_max command"
        exit 1
fi

# numeric comparison: [[ $# < 3 ]] would compare strings
if (( $# < 3 ))
then
        echo "error"
        echo "at least 3 parameters are required"
        echo "sudo $0 nb_jobs_start nb_jobs_max command"
        exit 1
fi


min_jobs=$1
shift
max_jobs=$1
shift
command=$@

#n_job=$min_job
echo $min_jobs>/dev/shm/n_job

function _job_worker()
{
        local command=$@
        local result=

        echo "launching $command"

        #{ $command ; } >>./_logme.log 2>&1
        $command >>./_logme.log 2>&1
        result=$?
        echo "command returned: $result"
        if [[ "${result}" == "0" ]]; then
                echo $max_jobs>/dev/shm/n_job
                #n_job=$max_jobs
                echo "switching to $max_jobs jobs"
        else
                echo $min_jobs>/dev/shm/n_job
                #n_job=$min_jobs
                echo "switching to $min_jobs jobs"
        fi
}

while true; do
        n_job=$(</dev/shm/n_job)
        current_jobs=$(jobs -pr)
        x=0;
        for job in $current_jobs; do
                (( x++ ))
        done;

        echo "$x jobs running, $n_job required"
        jobs_to_run=$(($n_job - $x))
        if (( jobs_to_run > 0 )); then
                echo "$jobs_to_run jobs to launch"
                for (( y = 0; y < $jobs_to_run; y++ )); do
                        _job_worker $command &
                done
        fi
        x=0
        echo "short pause"
        jobs
        sleep 1
done

Here is the Perl script used for the test. It is called with 2 parameters:

  1. pause in seconds
  2. exit code

Example: ./sleeper.pl 10 0 pauses for 10s and exits with return code 0.

#!/usr/bin/perl -w

use strict;

my $time = $ARGV[0] || 1;
my $exit = $ARGV[1] || 0;

print "hello $time $exit \n";

sleep $time;
exit $exit;

And now a sample invocation:

./manager.sh 1 10 ./sleeper.pl 10 0
  • starts with a minimum of 1 process and goes up to a maximum of 10 processes if the return code is 0
  • each process calls sleeper.pl with a 10s pause and a return code of 0

EDIT: 16:30

I tried to build something around the shared-array advice from here: bash background process modify global variable

But then I ran into the locking problem, which I tried to work around with flock. With that, I lost the local variable z between the inside of the lock block and the code after it ...

This new version tries to use the PID of each child ... but now I always get the same PID ...????....

#!/bin/bash

if [ "$USER" != "root" ]
then
        echo "error"
        echo "this script must be run as root"
        echo "sudo $0 nb_jobs_start nb_jobs_max command"
        exit 1
fi

# numeric comparison: [[ $# < 3 ]] would compare strings
if (( $# < 3 ))
then
        echo "error"
        echo "at least 3 parameters are required"
        echo "sudo $0 nb_jobs_start nb_jobs_max command"
        exit 1
fi


min_jobs=$1
shift
max_jobs=$1
shift
command=$@

#n_job=$min_job
echo $min_jobs>/dev/shm/n_job

declare -A SLOTS
for (( y = 1; y <= $min_jobs; y++ )); do
    SLOTS[$y]="vide"
done
set | grep ^SLOTS= >/dev/shm/SLOTS

declare -A PIDS

exists(){
  if [ "$2" != in ]; then
    echo "Incorrect usage."
    echo "Correct usage: exists {key} in {array}"
    return
  fi   
  eval '[ ${'$3'[$1]+muahaha} ]'  
}


function _job_worker()
{
    local command=$@
    local z=1
    local result=

    # I need to find a free slot
    echo "before the lock my pid is $$"
    (
        # Wait for lock on /var/lock/.manager.exclusivelock (fd 200)
        flock -x -w 10 200 || return

        echo "after the lock my pid is $$"

        declare -A SLOTS;
        . /dev/shm/SLOTS

        echo "DEBUG slot state BEFORE processing"
        dumpSLOTS

        # "vide" (French for "empty") marks a free slot
        for i in "${!SLOTS[@]}"
        do
            if [[ "${SLOTS[$i]}" != "vide" ]];
            then
                z=$(( $z + 1 ))
                echo "z is $z"
                if [[ $z -gt $n_job ]]; then
                    echo "ERROR: all slots are full ...."
                    exit 9
                fi
            fi
        done

        echo "filling slot $z with pid $$"
        SLOTS[$z]=$$
        PIDS[$$]=$z
        # save
        set | grep ^SLOTS= >/dev/shm/SLOTS
    ) 200>/var/lock/.manager.exclusivelock

    echo "my pid is $$, looking for the matching slot"
    local my_slot=${PIDS[$$]}
    echo "launching $command on slot $my_slot"
    $command $my_slot >>./_logme.log 2>&1
    result=$?
    # reload the slot list
    . /dev/shm/SLOTS

    echo "command returned: $result"
    if [[ "${result}" == "0" ]]; then
            echo $max_jobs>/dev/shm/n_job
            #n_job=$max_jobs
            echo "switching to $max_jobs jobs"
            for (( y = 1; y <= $max_jobs; y++ )); do
                #if ! exists $y in $SLOTS; then SLOTS[$y]="vide"; fi
                if [ ! ${SLOTS[$y]+exists} ] ;
                then
                    SLOTS[$y]="vide";
                fi
            done
    else
            echo $min_jobs>/dev/shm/n_job
            #n_job=$min_jobs
            echo "switching to $min_jobs jobs"
    fi

    # free the slot
    SLOTS[$my_slot]="vide"

    # save
    (
        # Wait for lock on /var/lock/.manager.exclusivelock (fd 200)
        flock -x -w 10 200 || return

        set | grep ^SLOTS= >/dev/shm/SLOTS
    ) 200>/var/lock/.manager.exclusivelock

    echo "slot state AFTER processing"
    dumpSLOTS

    sleep .3;
}

function dumpSLOTS() {
    declare -A SLOTS;
    . /dev/shm/SLOTS
    printf "%12s" ${!SLOTS[@]}
    echo
    printf "%12s" ${SLOTS[@]}
    echo
}


while true; do

    echo "the parent process pid is $$"

    n_job=$(</dev/shm/n_job)
    current_jobs=$(jobs -pr)
    x=0;
    for job in $current_jobs; do
            (( x++ ))
    done;

    echo "$x jobs running, $n_job required"
    jobs_to_run=$(($n_job - $x))
    if (( jobs_to_run > 0 )); then
            echo "$jobs_to_run jobs to launch"
            for (( y = 0; y < $jobs_to_run; y++ )); do
                    echo " launching"
                    _job_worker $command &
            done
    fi
    x=0
    echo "job state"
    jobs
    echo "slot state"
    dumpSLOTS
    echo "PIDS state"
    printf "%12s" ${!PIDS[@]}
    echo
    printf "%12s" ${PIDS[@]}
    echo
    sleep 1
done
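
A likely reason for the repeated PID described above is that $$ always expands to the PID of the original shell, even inside a ( ... ) subshell or a function started with &. In bash 4 and later, BASHPID holds the PID of the process that is actually running. The sketch below only illustrates that difference; worker, tmp and the other variable names are made up for the demonstration and are not part of manager.sh.

```shell
#!/bin/bash
# Sketch: compare $$ and $BASHPID in the parent, in a subshell,
# and in a function started in the background.

parent_pid=$$

# Command substitution runs in a subshell: $$ stays the same, BASHPID changes.
read -r sub_dollar sub_bashpid <<< "$( echo "$$ $BASHPID" )"

# Hypothetical worker, launched with & like _job_worker in the question.
worker() { echo "$$ $BASHPID"; }

tmp=$(mktemp)
worker > "$tmp" &
bg_pid=$!            # PID of the background job, as seen by the parent
wait "$bg_pid"
read -r bg_dollar bg_bashpid < "$tmp"
rm -f "$tmp"

echo "parent:     \$\$=$parent_pid"
echo "subshell:   \$\$=$sub_dollar BASHPID=$sub_bashpid"
echo "background: \$\$=$bg_dollar BASHPID=$bg_bashpid (\$!=$bg_pid)"
```

If that is indeed the cause, using $BASHPID instead of $$ inside _job_worker should give each child a distinct key.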
1 Answer

Here is a suggestion based on http://wiki.bash-hackers.org/howto/mutex to handle the potential concurrency issue.

My proposal is a little ugly but should work (the easy way):

  • initialize the script by creating the folder /dev/shm/my_jobs. This folder will contain one entry per busy slot, named 1 through $max_jobs
  • before running $command >>./_logme.log 2>&1, do:
    • start with i=0 and try each value from 1 to $max_jobs until you find an $i for which /dev/shm/my_jobs/$i does NOT exist AND mkdir /dev/shm/my_jobs/$i succeeds (if mkdir fails, another _job_worker has concurrently created the folder)
  • after running $command >>./_logme.log 2>&1, do:
    • delete /dev/shm/my_jobs/$i
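
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: acquire_slot and release_slot are hypothetical helper names, and the demo uses a throwaway directory from mktemp rather than /dev/shm/my_jobs. It relies on mkdir either creating the directory or failing, so the existence test and the creation collapse into the single mkdir call.

```shell
#!/bin/bash
# Sketch: slot allocation with one directory per busy slot.

# Try slots 1..max in order and print the first one this caller could create.
acquire_slot() {
    local dir=$1 max=$2 i
    for (( i = 1; i <= max; i++ )); do
        # mkdir fails if the directory already exists, i.e. the slot is busy
        if mkdir "$dir/$i" 2>/dev/null; then
            echo "$i"
            return 0
        fi
    done
    return 1    # every slot is taken
}

# Free a slot by removing its directory.
release_slot() {
    rmdir "$1/$2"
}

# Demo with 2 slots in a temporary directory.
slot_dir=$(mktemp -d)
first=$(acquire_slot "$slot_dir" 2)
second=$(acquire_slot "$slot_dir" 2)
if third=$(acquire_slot "$slot_dir" 2); then full=no; else full=yes; fi
release_slot "$slot_dir" "$first"
reused=$(acquire_slot "$slot_dir" 2)
echo "first=$first second=$second full=$full reused=$reused"
rm -rf "$slot_dir"
```

In _job_worker, the slot would be acquired just before running $command and released right after it returns.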

The catch is that finding $i and creating /dev/shm/my_jobs/$i must happen as a single step. Without a lock around those 2 instructions, you may run into concurrency issues.
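
If an explicit lock around the two steps is preferred, one possible sketch uses flock, as the question already does, but on a { ... } group instead of a ( ... ) subshell. A brace group runs in the current shell, so a variable assigned under the lock is still set after the closing brace. The names lock_file, state_file and claim_slot are hypothetical, the temporary files stand in for the paths used in the question, and the flock utility from util-linux is assumed to be installed.

```shell
#!/bin/bash
# Sketch: read-modify-write of a shared counter under an exclusive lock.

lock_file=$(mktemp)
state_file=$(mktemp)
echo 0 > "$state_file"

claim_slot() {
    local slot
    {
        # Wait up to 10s for the exclusive lock on fd 200.
        flock -x -w 10 200 || return 1
        slot=$(( $(<"$state_file") + 1 ))
        echo "$slot" > "$state_file"
    } 200>"$lock_file"    # fd 200 is closed here, which releases the lock
    # $slot is still set: the group did not run in a subshell.
    echo "$slot"
}

# Demo: 5 concurrent workers must each get a different number.
out_dir=$(mktemp -d)
for n in 1 2 3 4 5; do
    claim_slot > "$out_dir/$n" &
done
wait
claimed=$(sort -n "$out_dir"/* | paste -sd' ' -)
echo "claimed: $claimed"
rm -rf "$out_dir" "$lock_file" "$state_file"
```

The same group could wrap the slot search from the question, so that the chosen slot number remains available to the code that runs the command afterwards.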

Veovis
  • I am currently going this way, but with an array as described here: http://stackoverflow.com/questions/13207292/bash-background-process-modify-global-variable. I do think that is the solution ;) – Stéphane MERLE Aug 14 '14 at 10:17
  • I am still stuck on this and have posted another question about local and flock: http://stackoverflow.com/q/25311505/2346396 – Stéphane MERLE Aug 14 '14 at 15:10