
I am observing some very curious behavior on one of the production environments I administer. I have never seen anything like it; maybe one of you has an idea.

Basically, I have two shell scripts; let's call them the master and the slave script. The master script is the one I start from the bash command line, and the slave script is called from within the master script.

Here is the relevant part of the master script that calls the slave script (called 'import_CSV2UNISERV_L.sh'):

SCRIPT_EXEC_IDS=$($SCRIPTDIR/import_CSV2UNISERV_L.sh init NL_INTER kub-b.puc.ov.otto.de 3>&1 1>&4)
typeset -i RETCODE=$?

Subsequently, the slave script will at some point end execution with an "exit 0", at which point I would expect the master script to continue with the "typeset" command, but that never happens.

When I execute the scripts with "bash -x", I can see that the "exit 0" is still processed, but after that the execution just stops and hangs forever. This is the last output one can observe:

+ echo '15.10.2014 07:40:55 AM - Ende import_CSV2UNISERV_L.sh'
15.10.2014 07:40:55 AM - Ende import_CSV2UNISERV_L.sh
+ exit 0

I have no clue what is causing this or even how to debug it any further. I am absolutely lost :-(

I did find out that when the "exit 0" is the first command in the slave script, everything works fine. So it seems to be related to something I do in the slave script, but that script is hundreds of lines long, so it would be close to impossible to find out by trial and error which line causes this.

The other peculiar thing about all of this is that those scripts have been unmodified in the production environment since 2010 (I checked Subversion!) and have run every single day since then without a hitch. Only since last weekend, when we had a software release that should have been unrelated to these scripts, have the problems occurred. So there seems to be a connection, but I don't see where.

I guess I am looking more for a way to debug this further and figure out what blocks the execution than for a full-fledged solution (which would be expecting a bit much :-) ).

Any ideas on how I could move forward with this situation would be very much appreciated. Thank you in advance for taking the time!

Best regards

Mario

iqstatic
  • Given recent events (shellshock), I'm assuming you have a new version of `bash`. I would check for differences in the old/new versions of bash, and see if there were any incompatible changes (e.g., how arrays work differs across versions). See also http://stackoverflow.com/questions/9080431/how-execute-bash-script-line-by-line – michael Oct 16 '14 at 06:23
  • A coworker of mine was able to figure out the problem. With the software release last weekend, additional output streams were introduced (I'm not sure I'm using the right terminology here; I didn't completely understand what he was explaining to me). My script was starting background processes with nohup, which inherited those additional output streams, and therefore the calling shell was blocked until those background processes terminated. The fix was to add "3>&- 4>&- " to the end of the nohup command line, which apparently closes those additional streams. – Mario Köhler Oct 16 '14 at 11:57
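
For anyone who runs into the same hang, the mechanism described in the last comment can probably be reproduced with a minimal sketch like the one below. It is not the original script: the two command strings are hypothetical stand-ins for the slave script, "id-123" is a made-up execution ID, "sleep 3" stands in for the real background process, and fd 4 is assumed to be a copy of the master's stdout. The idea is that $(...) reads from a pipe until every process holding the write end has closed it, so a background job that inherits fd 3 keeps the master waiting even after the slave has run "exit 0".

```shell
#!/usr/bin/env bash
# Minimal reproduction sketch. The two command strings below are hypothetical
# stand-ins for the slave script; "id-123" is a made-up execution ID.

# Leaky: the background job inherits fd 3, the pipe that $(...) reads from.
SLAVE_LEAKY='nohup sleep 3 >/dev/null 2>&1 & echo id-123 >&3'

# Fixed: fds 3 and 4 are closed for the background job only.
SLAVE_FIXED='nohup sleep 3 >/dev/null 2>&1 3>&- 4>&- & echo id-123 >&3'

# Mimics the master's call; stores the captured IDs and the wait time
# in the globals IDS and ELAPSED.
run_master() {
    local start=$SECONDS
    exec 4>&1                        # assumption: fd 4 is a copy of the master's stdout
    IDS=$(bash -c "$1" 3>&1 1>&4)    # same redirections as in the question
    exec 4>&-
    ELAPSED=$((SECONDS - start))
}

run_master "$SLAVE_LEAKY"; LEAKY_IDS=$IDS; LEAKY_ELAPSED=$ELAPSED
echo "leaky: ids=$LEAKY_IDS, returned after ${LEAKY_ELAPSED}s"

run_master "$SLAVE_FIXED"; FIXED_IDS=$IDS; FIXED_ELAPSED=$ELAPSED
echo "fixed: ids=$FIXED_IDS, returned after ${FIXED_ELAPSED}s"
```

The leaky call should only return once the background sleep has finished, while the fixed call should return almost immediately. To see which descriptors a lingering background process still holds on Linux, "ls -l /proc/<PID>/fd" (or "lsof -p <PID>") lists them.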
