Is there any binary or builtin for a semaphore-like structure in bash? E.g. for running a fixed number of (background) sub-processes as I loop through a directory of files. (I say "sub-process" here rather than "thread", since I'm appending & to my bash commands to do the "multithreading" — though I'd be open to any more convenient suggestions.)
My actual use case is using a binary called bcp on CentOS 7 to write a (variable-sized) set of TSV files to a remote MSSQL Server DB, and I have observed that the program seems to have problems when too many copies run at once. E.g. something like
for filename in $DATAFILES/$TARGET_GLOB; do
    if [ ! -f "$filename" ]; then
        echo -e "\nFile $filename not found!\nExiting..."
        exit 255
    else
        echo -e "\nImporting $filename data to $DB/$TABLE"
    fi

    echo -e "\nStarting BCP export threads for $filename"
    /opt/mssql-tools/bin/bcp "$TABLE" in "$filename" \
        $TO_SERVER_ODBCDSN \
        -U "$USER" -P "$PASSWORD" \
        -d "$DB" \
        $RECOMMEDED_IMPORT_MODE \
        -t "\t" \
        -e $(unknown).bcperror.log &
done
# collect all subprocesses at the end
wait
that starts a new sub-process for every file all at once, in an unrestricted way, appears to crash the sub-processes. I would like to see whether adding a semaphore-like structure into the loop, to cap the number of sub-processes that get spun up, would help. E.g. something like (using some non-bash-like pseudo-code here)
sem = Semaphore(locks=5)

for filename in $DATAFILES/$TARGET_GLOB; do
    if [ ! -f "$filename" ]; then
        echo -e "\nFile $filename not found!\nExiting..."
        exit 255
    else
        echo -e "\nImporting $filename data to $DB/$TABLE"
    fi

    sem.lock()
    <same backgrounded bcp command from the original loop>
    sem.unlock()
done
# collect all subprocesses at the end
wait
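One approach I've been experimenting with (a minimal sketch with a placeholder command, not my real bcp loop) is to use the shell's own job table as the counting semaphore: before launching each background job, block until the number of running jobs drops below a cap.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: cap concurrent background jobs at MAX_JOBS
# by polling the job table before each launch.
MAX_JOBS=5

throttle() {
    # Block while MAX_JOBS or more background jobs are still running.
    # (On bash 4.3+ the sleep loop could be replaced with `wait -n`,
    # which blocks until any one job finishes; CentOS 7 ships bash 4.2.)
    while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
        sleep 0.2
    done
}

for i in 1 2 3 4 5 6 7 8; do
    throttle
    sleep 0.3 &   # stand-in for the backgrounded bcp command
done

# collect all subprocesses at the end
wait
```

`jobs -rp` prints the PIDs of currently running background jobs, so its line count is the number of live workers at that moment.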
If anything like this is possible, or if this is a common problem with an existing best-practice solution (I'm pretty new to bash programming), advice would be appreciated.
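For reference, one alternative I've seen mentioned is to drop the explicit loop and let xargs manage the worker pool: its -P flag caps the number of concurrent processes. A hypothetical sketch with placeholder filenames and an echo standing in for the bcp invocation:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: xargs -P 5 runs at most 5 commands at a time.
# The real command would be the bcp invocation from the loop above.
files=(one.tsv two.tsv three.tsv)   # placeholder file list

printf '%s\0' "${files[@]}" |
    xargs -0 -n1 -P 5 sh -c 'echo "would import $1"' _
```

The NUL-delimited printf/xargs -0 pairing keeps filenames with spaces intact, and -n1 hands each file to its own worker as $1.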