
I am currently investigating a performance issue in a large Erlang application. The application exhibits a larger-than-expected CPU load. To get a first idea of which parts of the system are responsible for the load, I'd like to perform call-stack sampling as described in this answer.

Is there a better way to do this than calling `erlang:process_info(Pid, backtrace)` repeatedly and grepping the functions from that output?
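For illustration, here is a rough sketch of what I mean (the helper name `sample_backtraces/0` is just made up here):

```erlang
%% Rough sketch of the current approach: one sampling pass dumps the
%% textual backtrace of every live process; the output is then grepped
%% for function names. sample_backtraces/0 is just a made-up name.
sample_backtraces() ->
    lists:foreach(
      fun(Pid) ->
              case erlang:process_info(Pid, backtrace) of
                  {backtrace, Bin} -> io:format("~p:~n~s~n", [Pid, Bin]);
                  undefined -> ok  % process exited in the meantime
              end
      end,
      erlang:processes()).
```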

Note that the system is too large to use fprof, and that etop did not point me in the right direction either. Using fprof on only parts of the system is also not possible right now, as I first need to pinpoint the general location of the performance issue.

evnu
  • I don't know Erlang, but I know that technique. Can you run your program under a debugger, manually pause it, and examine the call stack? I wouldn't worry about first finding the general location of the performance issue, because if it takes a big percentage of time, that's the probability you will see it, and if it takes a small percentage, chances are something else you didn't suspect takes a big percentage. – Mike Dunlavey Feb 15 '17 at 16:18
  • @MikeDunlavey thanks for the suggestion (I just learned how to suspend and resume processes in Erlang because of it). Although Erlang allows suspending processes, I do not see a way yet to simplify callstack sampling. With `erlang:process_info(Pid, backtrace)`, I can retrieve stack traces at runtime and do not need to suspend processes myself. But then I am left with parsing that output. – evnu Feb 15 '17 at 21:16
  • It's important *not* to put the sampling into the code. It has to happen at a time that's unpredictable from the viewpoint of the program. The other point is that you don't need a lot of samples (contrary to widespread assumption). You only need to see a problem twice to know it's worth fixing, and the average number of samples needed for that is 2 / size_of_problem. So if the problem costs 30% of time, the average number of samples needed to see it twice is 2 / 0.30 = 6.67 samples. – Mike Dunlavey Feb 16 '17 at 02:08

1 Answer


A simple way to get the actual size of the stack is `process_info(Pid, stack_size)`. While this only returns the size of the stack in words, it is a very simple and efficient way of seeing which processes have large stacks.
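For example, a minimal sketch that lists the N processes with the largest stacks (the name `top_stacks/1` is made up here):

```erlang
%% List the N processes with the largest stacks (sizes are in words).
%% top_stacks/1 is just a made-up name for illustration.
top_stacks(N) ->
    Sizes = [{Size, Pid}
             || Pid <- erlang:processes(),
                {stack_size, Size} <- [erlang:process_info(Pid, stack_size)]],
    lists:sublist(lists:reverse(lists:sort(Sizes)), N).
```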

rvirding
  • Wouldn't `reductions` be the better measure for determining the origin of the load? I would expect the processes putting a lot of load on the system to show a consistently large number of `reductions` (see the sketch after these comments). Preliminary tests with `stack_size` were inconclusive (or I misinterpreted the numbers). – evnu Feb 20 '17 at 16:34
  • Yes, `reductions` would be a better measure, but this answer addressed the original question, which asked about an alternative to `backtrace`. – rvirding May 10 '17 at 11:06
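A minimal sketch of the `reductions`-based sampling mentioned in the comments above, assuming two snapshots and a per-process delta (`reduction_deltas/1` and `reductions_by_pid/0` are made-up names):

```erlang
%% Sample reductions twice and report per-process deltas, largest first.
%% reduction_deltas/1 and reductions_by_pid/0 are made-up names.
reduction_deltas(IntervalMs) ->
    First = maps:from_list(reductions_by_pid()),
    timer:sleep(IntervalMs),
    Deltas = [{R2 - R1, Pid}
              || {Pid, R2} <- reductions_by_pid(),
                 {ok, R1} <- [maps:find(Pid, First)]],
    lists:reverse(lists:sort(Deltas)).

reductions_by_pid() ->
    [{Pid, R}
     || Pid <- erlang:processes(),
        {reductions, R} <- [erlang:process_info(Pid, reductions)]].
```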