0

I'm using a modified auditd package, and linked gprof with it.

But from the output of gprof XX | gprof2dot | dot -Tpng output.png, all functions are 100% usage, so I cannot tell who's consuming most of the CPU resources.

(Right click on the image to have a full view)

enter image description here

Any ideas? Is that normal, and most importantly, how should I know which function consumes most CPU?

P.S: original gprof output http://pastie.org/8546568

daisy
  • 22,498
  • 29
  • 129
  • 265
  • What does straight `gprof` (without `gprof2dot`) output? – Joe Z Dec 12 '13 at 06:55
  • How long did this program run for? It seems like it collected no samples. – Joe Z Dec 12 '13 at 07:05
  • @JoeZ for 30 seconds, I have it printed logs, so I know it works, then I send a SIGINT to let it terminate – daisy Dec 12 '13 at 07:10
  • If you run the command with `time`, how much `user` time does it use? If it spends most of its time in system calls, network I/O or disk I/O, it may not collect any samples. – Joe Z Dec 12 '13 at 07:10
  • Also, when you send the `SIGINT`, is there a `SIGINT` handler that terminates the program in an orderly manner, or does it exit without flushing files? It could be that the `gprof` instrumentation didn't flush its collected statistics. – Joe Z Dec 12 '13 at 07:18
  • @JoeZ it's terminated, I started auditd in foreground, I can see it exists peacefully – daisy Dec 12 '13 at 07:46
  • I still wonder if the `SIGINT` is the issue. Is there a way you can tweak `auditd` to exit on its own after 30s, rather than you sending `SIGINT`? – Joe Z Dec 12 '13 at 13:29
  • I wouldn't touch `gprof` (and those rats-nest graphs). First, you're asking the wrong question. Don't ask "Which function consumes most CPU?", which is ambiguous and irrelevant. Ask "*Why* is the time being spent?" As long as you're doing SIGINT, use `gdb` and [*this technique*](http://stackoverflow.com/a/378024/23771). If you're got a performance problem, you'll see exactly what it is. – Mike Dunlavey Dec 12 '13 at 16:42
  • @JoeZ I use SIGALRM plus a handler to make auditd exit, same result, time field of all functions are zero – daisy Dec 15 '13 at 13:38
  • @warl0ck : I guess I wasn't clear enough. Can you make `auditd` exit via `exit()` or returning from `main()`, without involving a signal? – Joe Z Dec 15 '13 at 17:34

1 Answers1

0

At the top of the flat profile (in the pastie.org link) is the immediate answer:

Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

Since everything took exactly no time at all, everything took the same amount of time as everything else.

The more interesting question is, why did everything take no time at all? The problem here is that auditd is setuid, and (at least in general) you cannot profile setuid programs.

(To get a profile of a program that would be setuid, make a non-suid version of it, and profile that. Sometimes you need to tweak the internals a bit to make it behave well now that it's not setuid.)

After poking around some more, I don't know why it fails. Linux uses a libc compatibility routine to fake the profil() system call for gprof. This needs the SIGPROF-based ITIMER_PROF interval timer but should generally work.

An alternative is to use OProfile, which is more general-purpose (it can profile system calls) and probably better-supported these days.

torek
  • 448,244
  • 59
  • 642
  • 775
  • Hmm I started audited as root, and not forked to background. It's not suided .. how can you tell that ;-P – daisy Dec 12 '13 at 07:45
  • `-rwxr-xr-x 1 root root 296K Dec 12 XX:YY auditd*` – daisy Dec 12 '13 at 07:46
  • Huh, maybe I'm wrong. (I don't have a Linux box handy to check.) But it definitely did not take any profile samples. Now I'm not sure *why* though. Note that the flat profile has the call counts, so the counting part works, it's just the CPU sampling that doesn't. – torek Dec 12 '13 at 07:48
  • Another possibility: [this man page](http://man7.org/linux/man-pages/man3/profil.3.html) implies that Linux `profil` is a libc routine that uses `setitimer(ITIMER_PROF)`, rather than a system call. Maybe auditd is clearing out the timer? – torek Dec 12 '13 at 07:59