
From this answer on the Unix time command, I get the basic idea of how time works: it forks a new process and executes the command in that new process. However, I have encountered behavior that I am not able to understand.

I am trying to profile lusearch, a benchmark of the DaCapo benchmark suite. I want to launch it with different configurations (number of threads and number of iterations), disregard the benchmark output and use time to record real, user and system time. With the vast majority of the configurations, my script works just fine, launches the benchmark and records the time.

With one particular configuration (large dataset, two threads and ten iterations), the benchmark does not always terminate (it fails to in about 80% of almost 100 attempts). This is the command I am using to launch it:

(time -p java -jar DaCapo.jar lusearch -s large -t 2 -i 10 >/dev/null 2>/dev/null) 2>&1 | awk '{print $2 $4 $6}' > timed &

However, if I don't prepend time, the benchmark terminates every time (also in about 100 attempts):

(java -jar DaCapo.jar lusearch -s large -t 2 -i 10 >/dev/null 2>/dev/null)

This behavior happens only with this benchmark and with this configuration. If I profile some other benchmark, or use a different number of threads or a different number of iterations, I don't see the same thing happening. My guess is that something time does interferes with the benchmark.

I don't see how a fork+exec could change the benchmark's behavior. Is there anything specific that can cause this? For example, is time using some resource that the benchmark wants to use as well? Am I doing something wrong while launching the benchmark?

  • Note there's a shell builtin 'time' and an executable /usr/bin/time. Do you know which one you're using? (likely the former) – Brian Agnew Dec 13 '12 at 12:10
  • You are right, I am using the former. – Martina Maggio Dec 13 '12 at 12:12
  • Since time(s) uses one process slot and some other resources, it *could be* that the testrig hits some resource limits (number of processes, threads, file descriptors, memory?) Heisenberg, omelet <--> eggs, anything goes... – wildplasser Dec 13 '12 at 23:35
  • @wildplasser In Bash, `time` is technically a modifier that's part of the pipeline syntax. Grammar-wise, it's exactly the same as the negation operator (`!`). In that respect it's unlike all other builtin commands and keywords. There's no extra process or fd or significant memory involved that I can think of to cause a side-effect. – ormaaj Dec 13 '12 at 23:46
  • Well, it could be a hidden bash feature... I did not check the code, but in that case apparently bash does a standard fork+exec+wait+getrusage. It would consume a process slot, but that is what a "straight" fork+exec+wait() would cost also. I rest my case. – wildplasser Dec 13 '12 at 23:52
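
To settle which `time` a given shell actually picks, a quick check along these lines may help. This is a minimal sketch that assumes Bash; the variable name `kind` is only illustrative, and whether an external /usr/bin/time exists depends on the system.

```shell
#!/usr/bin/env bash
# Ask Bash how it resolves the word "time".
kind=$(type -t time)
echo "time resolves to: $kind"

# List every match in lookup order; an external binary such as
# /usr/bin/time is listed only if one is installed on this system.
type -a time

# To bypass the keyword and run the external binary (if installed),
# call it through `command`, e.g.: command time -p true
```

In Bash, `type -t time` reports `keyword`, which matches the point above that `time` is part of the pipeline syntax rather than an ordinary builtin.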

1 Answer


"Doesn't reach termination" is pretty vague. You're going to miss any error messages since you're sending everything to /dev/null. It's impossible to give a definite answer without knowing what this program does and without any error messages or a backtrace.
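
One way to keep the error messages while still recording the timings is to send the benchmark's stderr to a log file instead of /dev/null. The following is only a sketch, assuming Bash: `bench` is a hypothetical stand-in for the real `java -jar DaCapo.jar ...` command, and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: java -jar DaCapo.jar lusearch -s large -t 2 -i 10
bench() {
    echo "benchmark output"            # stdout, discarded below
    echo "simulated stack trace" >&2   # stderr, kept in a log below
}

workdir=$(mktemp -d)

# The inner redirections apply to the benchmark only; the outer 2>
# receives the report that the time keyword writes to the shell's stderr.
{ time -p bench >/dev/null 2>"$workdir/bench.err"; } 2>"$workdir/timed.raw"

# time -p prints three lines (real, user, sys); the value is field 2 of each.
awk '{print $2}' "$workdir/timed.raw" > "$workdir/timed"

# Whatever the benchmark wrote to stderr is now available for inspection.
cat "$workdir/bench.err"
```

Keeping the two streams in separate files means the timing report is never mixed with a stack trace, so the awk step still sees exactly three lines.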

The only actual difference I can think of is that you're not backgrounding it in the second case. With certain combinations of pipes and redirects, Bash doesn't implicitly redirect stdin from /dev/null upon backgrounding. It's possible that some part of the program is sensitive to that. See my testcases for some examples where explicit redirection is required. Bash doesn't completely follow POSIX here and differs from all other shells with certain combinations of asynchronous lists, pipes, and redirects.
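
To rule this out, stdin can be redirected explicitly so the backgrounded job does not depend on what the shell does implicitly. A minimal sketch, assuming Bash: `bench` is a hypothetical stand-in for the Java command that only reports what it finds on stdin, and the file names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the benchmark: reports whether stdin is at EOF.
bench() {
    if read -r line; then
        echo "stdin had data"
    else
        echo "stdin at EOF"
    fi
}

workdir=$(mktemp -d)

# Same shape as the command in the question, plus an explicit </dev/null
# on the timed command. Its stdout goes to a file here only so the
# stand-in's report can be inspected afterwards.
( time -p bench </dev/null >"$workdir/bench.out" 2>/dev/null ) 2>&1 \
    | awk '{print $2}' > "$workdir/timed" &

# Wait for the background pipeline before reading the results.
wait
cat "$workdir/bench.out"
```

If the real benchmark behaves differently with the explicit `</dev/null`, that points at stdin handling rather than at time itself.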

I seriously doubt it has anything to do with time. If you don't get a better answer here, maybe ask the help-bash list, but they won't be able to do better without more information.

ormaaj
  • Thanks for the answer. I tried sending the second command to the background and got the same results. If I don't redirect stderr to /dev/null, I get an exception stack trace saying that the file system resource in use has already been closed (not deterministic: not always in the same place). However, I don't really see how this can happen "only" when I use "time" (it never happened in other circumstances). – Martina Maggio Dec 14 '12 at 09:53
  • Heh, that's interesting. Can you reproduce it in other shells? All that syntax should work in any POSIX sh. If not, it might be worth reporting a bug (of course, test with the most recent Bash version and patchset first). Using a system call tracer (strace) might help reveal the problem too. Figures a benchmark doing weird low-level manipulations might trigger something like this. – ormaaj Dec 14 '12 at 15:03