
I have a program where I test different data sets and configurations, and a script that runs all of them.

Imagine my code as:

```
start = omp_get_wtime()
function()
end = omp_get_wtime()
print(end - start)
```

and the bash script as:

```bash
for a in "${first_option[@]}"
do
  for b in "${second_option[@]}"
  do
    for c in "${third_option[@]}"
    do
       printf("$a $b $c \n")
       ./exe $a $b $c >> logs.out
    done
  done
done
```

Now, when I execute the exact same configurations by hand, I get varying results from 10 seconds down to 0.05 seconds. When I execute the script, I get the same results on the high end, but for some reason I can't get any timings lower than 1 second: all the configurations that take less than a second when run manually get written to the file as 1.001, 1.102, 0.999, etc.

Any ideas of what is going wrong?
Thanks

Daniel Bar
  • Are the arrays huge? Keeping them in memory is probably a problem. – tripleee Jan 16 '23 at 11:13
  • You go through all this trouble to quote the arrays ... and then botch it by not quoting the final variables inside the loop! [When to wrap quotes around a shell variable](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Jan 16 '23 at 11:13
  • That `printf` is a syntax error. The proper syntax would be `printf "%s %s %s\n" "$a" "$b" "$c"`, where we also take care to avoid putting data in the format string. (You could use a different format specifier than `%s` if the values are always numbers, for example.) – tripleee Jan 16 '23 at 11:15
  • I've got about 10 arrays with 5 elements each, so I don't think that could saturate my memory. And I don't see how a slow bash script would change the time measured inside a different process. – Daniel Bar Jan 16 '23 at 11:21
  • Thanks for the rest of the information. I'll change the code to fix that, but my problem doesn't come from this, since I know it executes correctly. – Daniel Bar Jan 16 '23 at 11:23
  • Indeed, 30 array elements should barely be noticeable. The problem is probably not with your Bash script. – tripleee Jan 16 '23 at 11:43
  • Appending to the output file is also a bottleneck, especially if the log is large. You can avoid that by moving `>> logs.out` to after the last `done` (and probably then replace the `>>` with `>` to replace the previous log?) – tripleee Jan 16 '23 at 11:44
  • The log is also quite small, usually 10 lines with 1 timer per line, and moving it to a `>` at the end would make me change everything about the production code, especially when the code doesn't seem to have problems when it is executed manually... – Daniel Bar Jan 16 '23 at 12:04

1 Answer


My suggestion would be to remove the `>> logs.out` to see what happens to the speed.

From there you can try several options:

  • Replace `>> logs.out` with `| tee -a logs.out`.
  • Investigate `stdbuf`, and if your code is Python, look at the `PYTHONUNBUFFERED=1` environment variable. See also: How to disable stdout buffer when running shell.
  • Redirect the bash `printf` with `>&2` (write to stderr) and move `>> logs.out` or `| tee -a logs.out` behind the last `done`.
  • You can probably see what is causing the delay by using:
      strace -f -t bash -c "<your bash script>" | tee /tmp/strace.log
    
    With a little luck you will see which system call is causing the delay at the bottom of the screen, but it is a lot of information to process. Alternatively, look for the name of your `./exe` in `/tmp/strace.log` after tracing is done, and then look for the system calls after invocation (process start of `./exe`) that consume the most time. It could just be many calls... Don't spend too much time on this if you don't have the stomach for it.
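The third suggestion could look like the sketch below. The array contents are made-up placeholders, and `echo` stands in for the asker's `./exe`; progress lines go to stderr so only the program's output lands in the log, with a single redirect after the last `done` as the comments suggested (also fixing the quoting and the `printf` format string):

```shell
#!/bin/bash
# Sketch: placeholder arrays; echo stands in for ./exe "$a" "$b" "$c".
first_option=(a1 a2)
second_option=(b1 b2)
third_option=(c1 c2)

for a in "${first_option[@]}"; do
  for b in "${second_option[@]}"; do
    for c in "${third_option[@]}"; do
      # Progress to stderr: visible on the terminal, kept out of the log
      printf '%s %s %s\n' "$a" "$b" "$c" >&2
      echo "$a $b $c"      # stand-in for ./exe "$a" "$b" "$c"
    done
  done
done > logs.out            # one redirect after the last done, not per run
```

Note the quoted variables and the data-free format string, per the comments under the question.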
dekerser
  • The code is in Fortran. I have removed the `>>` to just print to the terminal, and I still get 1-second outputs. – Daniel Bar Jan 16 '23 at 13:43
  • So it is probably output buffering. Try the `>&2` of the third suggestion, while keeping `>> logs.out` removed. Or try setting the `GFORTRAN_UNBUFFERED_ALL` environment variable to 'y', 'Y' or 1. – dekerser Jan 16 '23 at 14:08
  • Didn't work either; it still reports 1 second of execution time for a configuration that normally takes around 0.05 s... – Daniel Bar Jan 17 '23 at 14:15
  • There is also the `GFORTRAN_UNBUFFERED_PRECONNECTED` environment variable to try, but I'm not optimistic, as `_ALL` would logically include it. That one is specific to stderr and stdout, which would be the problem area here. To check that your `./exe` has the options: `strings ./exe | grep BUFFER` – dekerser Jan 18 '23 at 16:56
  • If they aren't there, you need to reconsider your compile options... It could also be that they are slightly different if you are not using GNU Fortran. Another option could be to run your script in the foreground. To do so, first remove the `exit` command at the end of the script, as it will exit your shell after running. To run in the foreground: `. ./` Note the '.' at the beginning; this is absolutely essential. – dekerser Jan 18 '23 at 17:08
  • The problem is fixed; it came from my environment. I execute using OpenMP parallelisation, and I was playing around with chunk sizes. On my smaller data sets I had a chunk too big for too few iterations, basically slowing the entire program down to 1 second. – Daniel Bar Jan 19 '23 at 17:21
  • At least you found what it was :-) – dekerser Jan 19 '23 at 17:48
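For reference, the fix the asker describes can be handled from the wrapper script rather than hard-coded: if the parallel loop in the Fortran code uses `schedule(runtime)`, the chunk size can be scaled to the data-set size via the standard `OMP_SCHEDULE` variable. A sketch, where the iteration count and thread count are made-up values and the heuristic of "about four chunks per thread" is an assumption, not the asker's actual fix:

```shell
# Sketch: pick a chunk size proportional to the problem size, so a small
# data set is not serialized by one oversized chunk.
# Assumes the loop uses schedule(runtime); n and threads are made up.
n=100        # iterations in the small data set (assumed)
threads=8    # matching OMP_NUM_THREADS (assumed)
chunk=$(( n / (threads * 4) ))   # aim for roughly 4 chunks per thread
[ "$chunk" -lt 1 ] && chunk=1    # never drop below one iteration per chunk
export OMP_SCHEDULE="dynamic,$chunk"
echo "$OMP_SCHEDULE"
```

This way the same binary can be timed across data sets of very different sizes without one oversized chunk dominating the small runs.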