I can use gperftools to produce a call graph, as shown, for instance, in this question.
Now I would like to get a call graph for bind_rows() in the dplyr R package in order to track down this bug.
I compiled both R and dplyr using CPP/CXXFLAGS=-g -fvar-tracking-assignments and LDFLAGS=-lprofiler -lunwind.
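For reference, one way such flags can be passed to a package build is through a user Makevars file. The following is only a sketch of that approach, not necessarily how the build above was done: the helper name write_profiling_makevars, the file location, and the CXX11FLAGS line (which assumes the package is built with R's C++11 compiler settings) are all assumptions.

```shell
# Hypothetical helper: write the profiling flags to a Makevars file that R
# reads when it builds packages from source. The target path is a placeholder.
write_profiling_makevars() {
    makevars=$1
    mkdir -p "$(dirname "$makevars")"
    cat > "$makevars" <<'EOF'
CXXFLAGS = -g -fvar-tracking-assignments
CXX11FLAGS = -g -fvar-tracking-assignments
LDFLAGS = -lprofiler -lunwind
EOF
}

# Usage (paths and tarball name are assumptions):
#   write_profiling_makevars "$HOME/.R/Makevars.profiling"
#   R_MAKEVARS_USER="$HOME/.R/Makevars.profiling" R CMD INSTALL dplyr_*.tar.gz
```

Pointing R_MAKEVARS_USER at a separate file keeps the profiling flags out of the default ~/.R/Makevars, so ordinary installs are not affected.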
When I run the following:
CPUPROFILE="samples.log" R --vanilla <<< "library(dplyr)
ll = lapply(1:1e5, function(x) as.list(setNames(runif(5), letters[1:5])))
print(system.time(bind_rows(ll)))"
pprof --gif /usr/lib/R/bin/exec/R samples.log > out.gif
All I get is:
How can I get the call hierarchy so I know which call in dplyr's bind rows file is the bottleneck?
edit: It seems that the --focus option is what I need here. But how do I connect this to RecursiveRelease?
pprof --focus=rbind__impl --gif /usr/lib/R/bin/exec/R samples.log > out.gif
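The same filter can also be applied to a plain-text report, which is sometimes easier to read than the graph. This is a minimal sketch, assuming the gperftools pprof script is on the PATH; the wrapper name focus_report is hypothetical, and --lines is added on the assumption that line-level granularity is wanted.

```shell
# Hypothetical wrapper: print a text report restricted to stacks that pass
# through functions matching the given pattern, at source-line granularity.
focus_report() {
    binary=$1
    profile=$2
    pattern=$3
    pprof --text --lines --focus="$pattern" "$binary" "$profile"
}

# Usage, with the same binary and profile as above:
#   focus_report /usr/lib/R/bin/exec/R samples.log rbind__impl
```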
edit: After recompiling Rcpp with -g and linking with -lprofiler, I could get the following: flame.svg, where 8% of the profile gets a good stack trace but most of it still does not. Could this be because some library is loaded without -lprofiler support?
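One way to check that hypothesis is to look at which shared objects actually link against libprofiler. The sketch below assumes ldd is available; the helper name links_profiler and the library paths in the usage comment are placeholders. It only tests linkage and says nothing about whether a library carries debug symbols.

```shell
# Hypothetical helper: succeed if the given binary or shared object lists
# libprofiler among its dynamic dependencies.
links_profiler() {
    ldd "$1" | grep -q 'libprofiler'
}

# Usage (the .so paths are placeholders):
#   for so in /usr/lib/R/bin/exec/R /path/to/dplyr.so /path/to/Rcpp.so; do
#       if links_profiler "$so"; then
#           echo "$so: linked against libprofiler"
#       else
#           echo "$so: not linked against libprofiler"
#       fi
#   done
```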