27

I am interested in profiling some Rcpp code under OS X (Mountain Lion 10.8.2), but the profiler crashes when being run.

Toy example, using inline, just designed to take enough time for a profiler to notice.

library(Rcpp)
library(inline)

src.cpp <- "
  RNGScope scope;
  int n = as<int>(n_);
  double x = 0.0;
  for ( int i = 0; i < n; i++ )
    x += (unif_rand()-.5);
  return wrap(x);"

src.c <- "
  int i, n = INTEGER(n_)[0];
  double x = 0.0;
  GetRNGstate();
  for ( i = 0; i < n; i++ )
    x += (unif_rand()-.5);
  PutRNGstate();
  return ScalarReal(x);"

f.cpp <- cxxfunction(signature(n_="integer"), src.cpp, plugin="Rcpp")
f.c <- cfunction(signature(n_="integer"), src.c)

If I use either the GUI Instruments (in Xcode, version 4.5 (4523)) or the command line sample, both crash: Instruments crashes straight away, while sample completes processing samples before crashing:

# (in R)
set.seed(1)
f.cpp(200000000L)

# (in a separate terminal window)
~ » sample R  # this invokes the profiler
Sampling process 81337 for 10 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
[1]    81654 segmentation fault  sample 81337

If I do the same process but with the C version (i.e., f.c(200000000L)) both Instruments and sample work fine, and produce output like

Call graph:
1832 Thread_6890779   DispatchQueue_1: com.apple.main-thread  (serial)
  1832 start  (in R) + 52  [0x100000e74]
    1832 main  (in R) + 27  [0x100000eeb]
      1832 run_Rmainloop  (in libR.dylib) + 80  [0x1000e4020]
        1832 R_ReplConsole  (in libR.dylib) + 161  [0x1000e3b11]
          1832 Rf_ReplIteration  (in libR.dylib) + 514  [0x1000e3822]
            1832 Rf_eval  (in libR.dylib) + 1010  [0x1000aa402]
              1832 Rf_applyClosure  (in libR.dylib) + 849  [0x1000af5d1]
                1832 Rf_eval  (in libR.dylib) + 1672  [0x1000aa698]
                  1832 do_dotcall  (in libR.dylib) + 16315  [0x10007af3b]
                    1382 file1412f6e212474  (in file1412f6e212474.so) + 53  [0x1007fded5]  file1412f6e212474.cpp:16
                    + 862 unif_rand  (in libR.dylib) + 1127,1099,...  [0x10000b057,0x10000b03b,...]
                    + 520 fixup  (in libR.dylib) + 39,67,...  [0x10000aab7,0x10000aad3,...]
                    356 file1412f6e212474  (in file1412f6e212474.so) + 70,61,...  [0x1007fdee6,0x1007fdedd,...]  file1412f6e212474.cpp:16
                    56 unif_rand  (in libR.dylib) + 1133  [0x10000b05d]
                    38 DYLD-STUB$$unif_rand  (in file1412f6e212474.so) + 0  [0x1007fdf1c]

I would really appreciate some advice into if there is anything I'm doing wrong, if there is some other preferred way, or if this is just not possible. Given that one of the main uses of Rcpp seems to be in speeding up R code, I'm surprised not to find more information on this, but perhaps I'm looking in the wrong place.

This is on OS X 10.8.2 with R 2.15.1 (x86_64-apple-darwin9.8.0), Rcpp 0.9.15, and g++ --version reports "i686-apple-darwin11-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)".

A solution

Thanks to Dirk's answer below, and his talk here http://dirk.eddelbuettel.com/papers/ismNov2009introHPCwithR.pdf, I have at least a partial solution using Google perftools. First, install from here http://code.google.com/p/gperftools/, and add -lprofiler to PKG_LIBS when compiling the C++ code. Then either

(a) Run R as CPUPROFILE=samples.log R, run all code and quit (or use Rscript)

(b) Use two small utility functions to turn on/off profiling:

RcppExport SEXP start_profiler(SEXP str) {
  ProfilerStart(as<const char*>(str));
  return R_NilValue;
}

RcppExport SEXP stop_profiler() {
  ProfilerStop();
  return R_NilValue;
}

Then, within R you can do

.Call("start_profiler", "samples.log")
# code that calls C++ code to be profiled
.Call("stop_profiler")

either way, the file samples.log will contain profiling information. This can be looked at with

pprof --text /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R samples.log

which produces output like

Using local file /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R.
Using local file samples.log.
Removing __sigtramp from all stack traces.
Total: 112 samples
  64  57.1%  57.1%       64  57.1% _unif_rand
  30  26.8%  83.9%       30  26.8% _process_system_Renviron
  14  12.5%  96.4%      101  90.2% _for_profile
   3   2.7%  99.1%        3   2.7% Rcpp::internal::expr_eval_methods
   1   0.9% 100.0%        1   0.9% _Rf_PrintValueRec
   0   0.0% 100.0%        1   0.9% 0x0000000102bbc1ff
   0   0.0% 100.0%       15  13.4% 0x00007fff5fbfe06f
   0   0.0% 100.0%        1   0.9% _Rf_InitFunctionHashing
   0   0.0% 100.0%        1   0.9% _Rf_PrintValueEnv
   0   0.0% 100.0%      112 100.0% _Rf_ReplIteration

which would probably be more informative on a real example.

Rich
  • 273
  • 3
  • 6
  • 2
    +1 -- perftools, the gcc profiler or valgrind can all help. The fact that R is interactive does throw a spanner in here as it adds more code layers. – Dirk Eddelbuettel Nov 05 '12 at 04:21
  • I ran into this myself when I upgraded my XCode, but I thought it might just be me. The thing is that Shark worked perfectly for profiling Rcpp before, but it is now defunct. I guess we can call this an Apple bug. – Ian Fellows Nov 06 '12 at 04:48

1 Answers1

4

I'm confused, your example is incomplete:

  • you don't show the (trivial) invocation of cfunction() and cxxfunction()

  • you don't show how you invoke the profiler

  • you aren't profiling the C or C++ code (!!)

Can you maybe edit the question and make it clearer?

Also, when I run this, the two example do give identical speed results as they are essentially identical. [ Rcpp would let you do this in call with sugars random number functions. ]

R> library(Rcpp)
R> library(inline)
R> 
R> src.cpp <- "
+   RNGScope scope;
+   int n = as<int>(n_);
+   double x = 0.0;
+   for ( int i = 0; i < n; i++ )
+     x += (unif_rand()-.5);
+   return wrap(x);"
R> 
R> src.c <- "
+   int i, n = INTEGER(n_)[0];
+   double x = 0.0;
+   GetRNGstate();
+   for ( i = 0; i < n; i++ )
+     x += (unif_rand()-.5);
+   PutRNGstate();
+   return Rf_ScalarReal(x);"
R> 
R> fc   <- cfunction(signature(n_="int"), body=src.c)
R> fcpp <- cxxfunction(signature(n_="int"), body=src.c, plugin="Rcpp")
R> 
R> library(rbenchmark)
R> 
R> print(benchmark(fc(10000L), fcpp(10000L)))
         test replications elapsed relative user.self sys.self user.child sys.child
1   fc(10000)          100   0.013        1     0.012        0          0         0
2 fcpp(10000)          100   0.013        1     0.012        0          0         0
R> 
Dirk Eddelbuettel
  • 360,940
  • 56
  • 644
  • 725
  • 1
    Hi Dirk, and thanks for helping! I've edited the example above for clarity, I hope. The profiler is invoked with the 'sample' command, and I've included output from the C version to show that it is C that is being profiled. I've used this general approach successfully with R+C code before, but not with C++. The examples should give identical results, and aren't representative of the problem I'm actually trying to profile. Thanks again, Rich – Rich Nov 05 '12 at 00:38
  • 1
    I see now -- I know nothing about the OS X aspect of this. For straight up `gcc` et al, I put some notes about profiling into the 'Intro to HPC with R' slides on my website / from previous talks. You will need to see with r-sig-mac whether `sample` works on the R toolchain. – Dirk Eddelbuettel Nov 05 '12 at 02:08