Recently I've been experimenting with Rcpp (inline) to generate DLLs that perform various tasks on supplied R inputs. I'd like to be able to debug the code in these DLLs line by line, given a specific set of R inputs. (I'm working under Windows.)

To illustrate, let's consider a specific example that anybody should be able to run...

The code below is a simple cxxfunction that doubles the input vector. Note, however, that an additional variable myvar changes value a few times without affecting the output; it is there so that we can tell when the debugging process is working correctly.

library(inline)
library(Rcpp)

f0 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
    Rcpp::NumericVector xa(a);
    int myvar = 19;
    int na = xa.size();
    myvar = 27;
    Rcpp::NumericVector out1(na);
    for(int i=0; i < na; i++) {
        out1[i] = 2*xa[i];
        myvar++;
    }
    myvar = 101;
    return(Rcpp::List::create( _["out1"] = out1));
')

After we run the above, typing the command

getLoadedDLLs()

brings up a list of DLLs in the R session. The last one listed should be the DLL created by the above process - it has a random temporary name, which in my case is

file7e61645c

The "Filename" column shows that cxxfunction has put this DLL in the location tempdir(), which for me is currently

C:/Users/TimP/AppData/Local/Temp/RtmpXuxtpa/file7e61645c.dll
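
The same lookup can be done programmatically instead of reading the printed table. This is only a sketch: it assumes, as observed above, that the most recently loaded DLL is listed last, and the names `last_dll`, `dll_name` and `dll_path` are illustrative.

```r
# List every DLL currently loaded in this R session.
dlls <- getLoadedDLLs()

# Assumption: the most recently loaded DLL appears last in the list.
last_dll <- dlls[[length(dlls)]]

# Use [[ ]] here: `$` on a DLLInfo object looks up native symbols instead.
dll_name <- last_dll[["name"]]
dll_path <- last_dll[["path"]]

cat("name:", dll_name, "\n")
cat("path:", dll_path, "\n")
```

Run straight after the cxxfunction call, this should report the temporary name and the location under tempdir() described above.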

Now, the obvious way to call the DLL is via f0, as follows

> f0(c(-7,0.7,77))

$out1
[1] -14.0   1.4 154.0

But we can of course also call the DLL directly by name using the .Call command:

> .Call("file7e61645c",c(-7,0.7,77))

$out1
[1] -14.0   1.4 154.0

So I've reached the point where I'm calling a standalone DLL directly with R input (here, the vector c(-7,0.7,77)), and having it return the answer correctly to R.

What I really need, though, is a facility for line-by-line debugging (using gdb, I presume) that will allow me to observe the value of myvar being set to 19, 27, 28, 29, 30, and finally 101 as the code progresses. The example above is deliberately set up so that calling the DLL tells us nothing about myvar.

To clarify, the "win condition" here is being able to observe myvar changing (seeing the value myvar=19 would be the first step!) without adding anything else to the body of the code. This obviously may require changes to the way in which the code is compiled (are there debugging mode settings to turn on?), or the way R is called - but I don't know where to begin. As noted above, all of this is Windows-based.
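
As a checklist for a debugging session, the values that myvar should pass through can be read directly off the C++ body of f0. The helper below is hypothetical (it is not part of Rcpp or inline) and only reproduces that arithmetic in plain R.

```r
# Hypothetical helper: the sequence of values myvar takes in f0
# for an input vector of length n.
expected_myvar_trace <- function(n) {
  c(19,               # int myvar = 19;
    27,               # myvar = 27;
    27 + seq_len(n),  # myvar++ once per loop iteration
    101)              # myvar = 101;
}

# The example input c(-7, 0.7, 77) has three elements.
trace <- expected_myvar_trace(length(c(-7, 0.7, 77)))
print(trace)
```

For the three-element input used here, this gives 19, 27, 28, 29, 30, 101, which is what a working debugger session should show.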

Final note: In my experiments, I actually made some minor modifications to a copy of cxxfunction so that the output DLL - and the code within it - receives a user-defined name and sits in a user-defined directory, rather than a temporary name and location. But this doesn't affect the essence of the question. I mention this just to emphasise that it should be fairly easy to alter the compilation settings if someone gives me a nudge :)

For completeness, setting verbose=TRUE in the original cxxfunction call above shows the compilation argument to be of the following form:

C:/R/R-2.13.2/bin/i386/R CMD SHLIB file7e61645c.cpp 2> file7e61645c.cpp.err.txt 
g++ -I"C:/R/R-213~1.2/include"    -I"C:/R/R-2.13.2/library/Rcpp/include"      -O2 -Wall  -c file7e61645c.cpp -o file7e61645c.o
g++ -shared -s -static-libgcc -o file7e61645c.dll tmp.def file7e61645c.o C:/R/R-2.13.2/library/Rcpp/lib/i386/libRcpp.a -LC:/R/R-213~1.2/bin/i386 -lR

My adapted version has a compilation argument identical to the above, except that the string "file7e61645c" is replaced everywhere by the user's choice of name (e.g. "testdll") and the relevant files copied over to a more permanent location.

Thanks in advance for your help guys :)

Tim P
  • I can't help directly, but I know Dirk etc are always helpful. However they generally do business on the [Rcpp email list](http://lists.r-forge.r-project.org/mailman/listinfo/rcpp-devel) – Jase_ Jul 05 '12 at 15:15
  • Have made some mild progress on this, so a brief update. Playing around with inline:::compileCode, which gets called within cxxfunction, I found that adding `--debug` at the end of `R CMD SHLIB` allowed me to examine what's going on inside the DLL through combining gdb and R. HOWEVER, this isn't a full solution as some variables were inaccessible (such as `i`, during the loop over `i`); the message came up that they had been "optimized out". I think I therefore need to replace the "-O2" with "-O0" in the compilation argument... but I've no notion of how to make this happen... – Tim P Jul 05 '12 at 16:18
  • I've gathered a good bit of evidence that all I need to do is change the R CMD SHLIB compiler flags from -O2 to something like -g -O0 ...e.g. see the post at https://stat.ethz.ch/pipermail/r-devel/2008-November/051390.html - but I'm lacking a precise statement of what needs specifying and how. Some online sources mention creating a file at `/.R/Makevars.win` but they don't describe the file, and this isn't a valid location in Windows due to the dot before the letter R... – Tim P Jul 05 '12 at 17:22
  • @Jase: Thanks, but actually my last post to that list received no replies at all! I posted here as it's obviously a question of general interest (seeing as the fix for the bug seems to be related to changing the compiler flags). – Tim P Jul 06 '12 at 09:40
  • Of course `~/.R/Makevars` is valid in Windoze -- below my $HOME I have a ton of files starting with a dot, as well as directories (frequently created by Unix-y programs ported to Windoze). And yes, you need `-g` if you want debugging output. – Dirk Eddelbuettel Jul 06 '12 at 13:19

1 Answer

I am a little stunned by the obsession some Rcpp users have with the inline package and its cxxfunction(). Yes, it is indeed very helpful and it has surely driven adoption of Rcpp further as it makes quick experimentation so much easier. Yes, it allowed us to use 700+ unit tests in the sources. Yes, I use it all the time to demonstrate examples here, on the rcpp-devel list or even live in presentations.

But does that mean we should use it for each and every task? Does it mean that it does not have "costs" such as randomized filenames in a temporary directory etc pp? Romain and I argued otherwise in our documentation.

Lastly, debugging of dynamically loaded R modules is difficult as it stands. There is an entire section in the (mandatory) Writing R Extensions about it, and Doug Bates once or twice posted a tutorial about how to do this via ESS and Emacs (though I always forget where he posted it; once was IIRC on the rcpp-devel list).

Edit 2012-Jul-07:

Here is your step by step:

  • (Preamble: I've used gcc and g++ for many years, and even when I add -g I don't always turn -O2 into -O0. I am really not sure you need that, but as you ask for it...)

  • Set your environment variable CXXFLAGS to "-g -O0 -Wall". There are numerous ways to do it; some are platform-dependent (e.g. the Windows control panel) and therefore less universal and less interesting. I use ~/.R/Makevars on Windows and Unix. You could use that, or you could override R's system-wide $RHOME/etc/Makeconf, or you could use Makeconf.site, or ... See the full docs, but as I said, ~/.R/Makevars is my preferred way as it does NOT interfere with compilation outside of R.

  • Now every compilation R does via R CMD SHLIB, R CMD COMPILE, R CMD INSTALL, ... will use these flags, so it no longer matters whether you use inline or a local package. Continuing with inline...

  • For the rest, we mostly follow 'Section 4.4.1 Finding entry points in dynamically loaded code' of "Writing R Extensions":

  • Start another R session with R -d gdb.

  • Compile your code. For

fun <- cxxfunction(signature(), plugin="Rcpp", verbose=TRUE, body='
   int theAnswer = 42;
   return wrap(theAnswer);
')

I get

[...]
Compilation argument:
 /usr/lib/R/bin/R CMD SHLIB file11673f928501.cpp 2> file11673f928501.cpp.err.txt 
 ccache g++-4.6 -I/usr/share/R/include -DNDEBUG   -I"/usr/local/lib/R/site-library/Rcpp/include"   -fpic  -g -O0 -Wall -c file11673f928501.cpp -o file11673f928501.o
g++-4.6 -shared -o file11673f928501.so file11673f928501.o -L/usr/local/lib/R/site-library/Rcpp/lib -lRcpp -Wl,-rpath,/usr/local/lib/R/site-library/Rcpp/lib -L/usr/lib/R/lib -lR
  • Invoke eg tempdir() to see the temporary directory, cd to this temporary directory used above and dyn.load() the file built above:
 dyn.load("file11673f928501.so")
  • Now suspend R by sending a break signal (in Emacs, a simple choice from a drop-down).

  • In gdb, set a breakpoint. The single assignment above became line 32 for me, so

break file11673f928501.cpp:32
cont
  • Back in R, call the function:

    fun()

  • Presto, in the debugger at the break point we wanted:

R> fun()

Breakpoint 1, file11673f928501 () at file11673f928501.cpp:32
32      int theAnswer = 42;
(gdb) 
  • Now it is "just" up to you to get gdb to work its magic.

Now, as I said in my first attempt, all this would be easier (in my eyes) via a simple package which Rcpp.package.skeleton() can write for you as you don't have to deal with randomized directories and filenames. But each to their own...
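
The CXXFLAGS step can also be scripted from R itself. The following is a minimal sketch, not an official mechanism: `write_debug_makevars` is a hypothetical helper, and the example writes to a scratch directory under tempdir() so that nothing in the home directory is touched. It then points the session at that file through the `R_MAKEVARS_USER` environment variable, on the assumption that the `R CMD SHLIB` processes started from this session inherit it.

```r
# Hypothetical helper (not part of R, Rcpp or inline): write a Makevars file
# asking for debug symbols and no optimisation when compiling C++.
write_debug_makevars <- function(dir = path.expand("~/.R"),
                                 flags = "-g -O0 -Wall",
                                 overwrite = FALSE) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  makevars <- file.path(dir, "Makevars")
  # Refuse to clobber an existing file unless explicitly asked to.
  if (file.exists(makevars) && !overwrite) {
    stop("Makevars already exists: ", makevars)
  }
  writeLines(paste0("CXXFLAGS=", flags), makevars)
  # Return the path with forward slashes, also on Windows.
  invisible(normalizePath(makevars, winslash = "/"))
}

# Write to a scratch directory first and inspect the result.
scratch <- file.path(tempdir(), "dotR")
makevars_path <- write_debug_makevars(scratch)
print(readLines(makevars_path))

# Point this session, and child processes it starts, at the scratch file.
Sys.setenv(R_MAKEVARS_USER = makevars_path)
```

Calling `write_debug_makevars()` with no arguments would target ~/.R/Makevars instead. Rerunning cxxfunction() with verbose=TRUE afterwards is a simple way to check whether -g -O0 actually shows up in the compile line.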

Dirk Eddelbuettel
  • if one was using a package with Rcpp inside, then it is just a matter of changing CXXFLAGS, starting R with R -d gdb, breaking R, setting a breakpoint in the cpp file with gdb and resuming? Isn't it possible to attach to the R process and set the break, without having to break R? Any guides on debugging packages with Rcpp? Many many thanks for Rcpp BTW! – Juancentro Oct 30 '13 at 19:34
  • With RInside, your C++ program starts R for you. You could try attaching to the R process from gdb -- or just use gdb on your outer C++ program. – Dirk Eddelbuettel Oct 30 '13 at 19:59
  • Sorry if I wasn't clear, I'm not talking about RInside. I'm talking about an R package that depends on Rcpp because it has C++ code in it. – Juancentro Oct 30 '13 at 20:38
  • This answer covers that. If you have a better method, please share it on rcpp-devel or via a post for the Rcpp Gallery. – Dirk Eddelbuettel Oct 30 '13 at 20:43
  • Thanks. I couldn't find ~/.R/ in my Windows installation. I had to resort to declaring an environment variable R_MAKEVARS_USER and pointing it to a Makevars file I created. This worked, and might be useful to anyone else – Juancentro Oct 30 '13 at 22:25
  • @Juancentro Use `path.expand("~/R")` to get the absolute path for your `Makevars` file. Or just use `usethis::edit_r_makevars()`. I know, six years too late, but it may help others :-) – R Yoda Dec 08 '19 at 14:26
  • You meant `path.expand("~/.R")`. The dot matters. – Dirk Eddelbuettel Dec 08 '19 at 15:04