How to adapt random-sampling profiling technique to a program that waits

Question

I'm quite used to randomly pausing a running program in gdb to get an idea of where it's spending its time, as described in "How can I profile C++ code running in Linux?". It seems like this technique is most appropriate for batch processes, as opposed to interactive/real-time systems.

For a program that I'm currently working on, the vast majority of samples are in epoll_wait(). That's obviously not a candidate for a speedup, but I'd like to know what the other performance bottlenecks are.

Ideally what I'd like would be a way to generate several stack traces (together with arguments and maybe the environment) from those time intervals when the thread is not blocked by epoll_wait().

Does anyone know a good method of doing this, or should I bite the bullet and switch to using a profiler?

score 0 · Answer 1 · answered Mar 15 '17 at 11:34

For a program that I'm currently working on, the vast majority of samples are in epoll_wait(). That's obviously not a candidate for a speedup, but I'd like to know what the other performance bottlenecks are.

I think you can use the poor man's profiler.

In its output you should see collapsed stack traces, with the epoll_wait stack on the top line. You already know that it is not a bottleneck in your code, so skip it and look at the next line of output for a more appropriate candidate for optimization.
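To make the "skip the epoll_wait line" step concrete, here is a minimal Python sketch of the same idea. It is not the poor man's profiler script itself, just an illustration: it assumes gdb is installed and allowed to attach to the process, and the names grab_backtrace, split_stacks, busy_profile and IDLE_FUNCTIONS are made up for this example. It collapses each sampled stack into one comma-separated line (innermost frame first) and sets aside every stack whose innermost frame is epoll_wait.

```python
import re
import subprocess
from collections import Counter

# Matches gdb frame lines such as
#   "#0  0x00007f3c in epoll_wait () from /lib/libc.so.6"
#   "#2  main () at server.c:140"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(")

# Assumption: a thread whose innermost frame is listed here is only waiting.
IDLE_FUNCTIONS = frozenset({"epoll_wait"})


def grab_backtrace(pid):
    """Take one sample: attach gdb, dump every thread's stack, detach."""
    cmd = ["gdb", "-p", str(pid), "-batch",
           "-ex", "set pagination off",
           "-ex", "thread apply all bt"]
    return subprocess.run(cmd, capture_output=True, text=True).stdout


def split_stacks(gdb_output):
    """Split gdb text into stacks; each stack lists functions, innermost first."""
    stacks, current = [], None
    for line in gdb_output.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # thread headers, blank lines, gdb chatter
        if match.group(1) == "0":  # frame #0 starts a new stack
            current = []
            stacks.append(current)
        if current is not None:
            current.append(match.group(2))
    return stacks


def busy_profile(samples, idle=IDLE_FUNCTIONS):
    """Count collapsed stacks, ignoring those that are parked in an idle call.

    Returns (Counter of collapsed stacks, number of idle stacks skipped).
    """
    counts, skipped = Counter(), 0
    for output in samples:
        for stack in split_stacks(output):
            if stack[0] in idle:
                skipped += 1
                continue
            counts[",".join(stack)] += 1
    return counts, skipped
```

To use it, call grab_backtrace(pid) a few dozen times with a short sleep in between, pass the list of results to busy_profile, and print counts.most_common(). The skipped count tells you how many samples were pure waiting, so you can judge whether you collected enough busy ones.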

ks1322


I should have mentioned, I'm okay with it spending a large amount of time waiting for network events: that's part of the essential nature of the program. – user3445329 Mar 15 '17 at 23:43
Thanks for the link, that does look useful. – user3445329 Mar 15 '17 at 23:43

Mike Dunlavey · Answer 2 · 2017-03-15T15:55:38.843

I use random pausing in UI programs. I run the whole thing under a debugger, and only pause it when it's taking time, i.e. when I'm waiting for it.

If you see it landing in epoll_wait, the call stack should say why it's waiting. Is it possible that it's doing I/O you might avoid? If you can't avoid the I/O you simply have an I/O-bound program.

Here is the kind of thing I've seen: application startup seems to take a long time, and the stack sample often shows it is 30+ levels deep in loading plugins. If I read the stack, I see that as part of loading the plugin it extracts a string resource from the DLL in order to get the plugin's name, which it then translates according to the country it's in. The reason it does that is so it can paint a string saying something like "Loading Plugin FooBar", so the user will know why it's taking so long.

It can be increasing the startup time by a factor of 2 just so it can tell the user why it's slow!

Needless to say, it's not too hard to fix that...

Other issues are similar. Note: You don't get any of this insight just by knowing how much inclusive time functions take, even if it's wall-clock time. That's the difference between pausing and any other profiling method. It tells you the why, without which you can't tell if the activity is unnecessary.

My program is mostly handling network events: something like a web server. It typically only takes a few milliseconds to service each request, but when a lot of requests come in, the program can fall behind on servicing them. I'm trying to improve the performance in those peak times. — user3445329, Mar 15 '17 at 23:40
@user3445329: You want to raise the number of requests per second it can handle? Then what I would do is artificially generate that high traffic, and then use sampling to see what is actually holding it up, because something is. I'm not saying it's necessarily easy to do this (we're not paid to solve easy problems). I'm saying there is no profiler that will give you better information. — Mike Dunlavey, Mar 16 '17 at 12:36
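For the "artificially generate that high traffic" part, a rough sketch of a load generator in Python might look like the following. It assumes the server speaks HTTP, which the thread does not actually say, so the request function is pluggable; send_http_request and generate_load are names invented for this example. The point is only to keep the server busy long enough to pause it several times.

```python
import http.client
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def send_http_request(url, timeout=5.0):
    """Issue one GET; return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
        ok = True
    except (OSError, http.client.HTTPException):
        # URLError/HTTPError are OSError subclasses; treat all as failures.
        ok = False
    return ok, time.perf_counter() - start


def generate_load(target, total_requests=1000, concurrency=50,
                  send=send_http_request):
    """Fire total_requests at target using `concurrency` worker threads.

    `send` is any callable taking the target and returning
    (succeeded, elapsed_seconds), so a non-HTTP protocol can be swapped in.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(send, [target] * total_requests))
    latencies = sorted(elapsed for ok, elapsed in results if ok)
    return {
        "ok": len(latencies),
        "failed": len(results) - len(latencies),
        "median_s": latencies[len(latencies) // 2] if latencies else None,
    }
```

Run it in one terminal against a test instance of the server, and take the stack samples from another terminal while it is running. Raise total_requests and concurrency until the server visibly falls behind, since that is the condition you are trying to diagnose.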
