Can an R extension safely allocate memory when it comes to exceptional conditions?

Question

I am about to write an extension package for R in C++ and wonder how dynamic memory management is intended to be used without risk of memory leaks. I have read

and immediately get to three questions:

1. Does R gracefully unwind the C++ stack frame in case of R-exceptions, e.g. when R_alloc runs out of memory or Rf_error is called due to some other condition? Otherwise, how am I supposed to clean up already R_alloc'ed and PROTECTed or simply Calloc'ed memory? For example, will
```
#include<R.h>
// […]
void someMethod () {
  char *buffer1 = NULL;
  char *buffer2 = NULL;
  try {
    ClassA a;
    buffer1 = R_Calloc( 10000, char );
    buffer2 = R_Calloc( 10000, char );
    // […]
  } finally {
    try {
      if ( NULL != buffer1 ) {
        R_Free( buffer1 );
      }
    } finally {
      if ( NULL != buffer2 ) {
        R_Free( buffer2 );
      }
    }
  }
}
```
guarantee to call the destructor ~ClassA for a and R_Free for buffer1 and buffer2? And if not, what would be the R textbook way to guarantee that?
2. Could standard C++ (nowadays deprecated) std::auto_ptr or modern std::unique_ptr be employed to simplify the memory allocation idiom?
3. Is there a proven C++ idiom/best practice to use R's memory allocation in the C++ standard template library, e.g. some suitable allocator template, so that STL classes allocate their memory from the R heap?

The issue probably also concerns `Rf_warning`, see https://stackoverflow.com/questions/24557711/how-to-generate-an-r-warning-safely-in-rcpp — Bernhard Bodenstorfer, Jun 22 '15 at 05:29
One solution I see is to write my own garbage collector, e.g. in the form of a wrapper around `R_alloc` and its friends which immediately registers the pointers to allocated memory with some global memory management object, which could at least release the leaked memory upon package unload. But I have some hope that there is a better practice than this available. — Bernhard Bodenstorfer, Jun 22 '15 at 05:42
Oops, I just realise that I borrowed some syntax from Java: `finally` blocks guaranteed to be executed after the respective `try` blocks are not available in C++, which relies on “resource acquisition is initialization” (RAII) for this purpose. Anyway, I think the idea of the question remains unaffected. — Bernhard Bodenstorfer, Jun 30 '18 at 06:44

score 0 · Answer 1 · answered Jul 06 '15 at 11:11

Since Rf_error does indeed skip the C++ stack frames and thus bypasses destructor calls, I found it necessary to do more research in the documentation. A look into the RODBC package, together with experiments monitoring memory use to confirm the findings, led me to the following answers:

1: Immediately store the pointer in an R external pointer and register a finaliser for it.

The idiom is illustrated in the following somewhat simplistic example:

```
#define STRICT_R_HEADERS    true

#include <new>              // placement new
#include <string>
#include <R.h>
#include <Rinternals.h>     // defines SEXP

using namespace std;

class A {
    string name;
    public:
    A ( const char * const name ) : name( name ) { Rprintf( "Construct %s\n", name ); }
    ~A () { Rprintf( "Destruct %s\n", name.c_str() ); }
    const char* whoami () const { return name.c_str(); }
};

extern "C" {
    // Finaliser: destroys the A object and frees its memory at most once.
    void finaliseAhandle ( SEXP handle ) {
        A* pointer = static_cast<A*>( R_ExternalPtrAddr( handle ) );
        if ( NULL != pointer ) {
            pointer->~A();
            R_Free( pointer );
            R_ClearExternalPtr( handle );
        }
    }

    SEXP createAhandle ( const SEXP name ) {
        A* pointer = R_Calloc( 1, A );
        SEXP handle = PROTECT( R_MakeExternalPtr(
            pointer,
            R_NilValue, // for this simple example no use of tag and prot
            R_NilValue
        ) );
        bool failed = false;
        try {
            new(pointer) A( CHAR( STRING_ELT( name, 0 ) ) );
            R_RegisterCFinalizerEx( handle, finaliseAhandle, TRUE );
        } catch (...) {
            failed = true;
        }
        if ( failed ) {
            // Raise the R error outside the catch block so that Rf_error
            // does not jump out of an active C++ exception handler.
            R_Free( pointer );
            R_ClearExternalPtr( handle );
            Rf_error( "construction of A(\"%s\") failed", CHAR( STRING_ELT( name, 0 ) ) );
        }
        // … more code may follow here, including calls to Rf_error.
        UNPROTECT(1);
        return handle;
    }

    SEXP nameAhandle ( const SEXP handle ) {
        A* pointer = static_cast<A*>( R_ExternalPtrAddr( handle ) );
        if( NULL != pointer ) {
            // Return a character vector of length 1; a bare CHARSXP from
            // mkChar is not a valid value to hand back to R code.
            return Rf_mkString( pointer->whoami() );
        }
        return R_NilValue;
    }

    SEXP destroyAhandle ( const SEXP handle ) {
        if( NULL != R_ExternalPtrAddr( handle ) ) {
            finaliseAhandle( handle );
        }
        return R_NilValue;
    }
}
```

The assignment of NULL to the pointer in R_ClearExternalPtr( handle ) prevents R_Free( pointer ) from being called twice.

Mind that the suggested idiom still relies on an assumption in order to work safely: the constructor must not fail in the sense of R, i.e. by calling Rf_error. If this cannot be avoided, my advice would be to postpone the constructor invocation until after the finaliser registration, so that the finaliser will in any case be able to R_Free the memory. However, logic must then be included so that the destructor ~A is not called unless the A object has been validly constructed. In easy cases, e.g. when A comprises only primitive fields, this may not be an issue, but in more complicated cases I suggest wrapping A in a struct which remembers whether the A constructor completed successfully, and allocating memory for that struct instead. Of course, we must still rely on the A constructor to fail gracefully, freeing all memory it had allocated, regardless of whether this was done by R_Calloc, malloc or the like. (Experimentation showed that memory from R_alloc is automatically freed in case of Rf_error.)
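The wrapper struct suggested above could look roughly like the following sketch. It is plain C++ so that it compiles without the R headers: std::calloc and std::free stand in for R_Calloc and R_Free, and the names Payload, PayloadBox, allocateBox, constructInBox and releaseBox are hypothetical, chosen only for this illustration. The "has the constructor completed" memory is kept as a pointer that stays NULL until placement new has returned.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>

// Counts live Payload objects so that destructor calls can be observed.
inline int& liveCount() { static int n = 0; return n; }

// Hypothetical payload class standing in for A from the answer.
struct Payload {
    std::string name;
    explicit Payload(const char* n) : name(n) { ++liveCount(); }
    ~Payload() { --liveCount(); }
};

// Hypothetical wrapper: raw storage for a Payload plus a pointer that is
// only set once the Payload constructor has returned normally.
struct PayloadBox {
    Payload* object;  // nullptr until construction has completed
    alignas(Payload) unsigned char storage[sizeof(Payload)];
};

// Step 1: allocate the box. In a package this would be R_Calloc, followed
// immediately by creating the external pointer and registering the finaliser.
inline PayloadBox* allocateBox() {
    void* memory = std::calloc(1, sizeof(PayloadBox));
    return memory ? new (memory) PayloadBox() : nullptr;
}

// Step 2: construct the payload in place. If the constructor does not
// return, box->object stays nullptr.
inline void constructInBox(PayloadBox* box, const char* name) {
    box->object = new (box->storage) Payload(name);
}

// What the finaliser would do: run the destructor only for a completely
// constructed payload, then release the memory (R_Free in a package).
inline void releaseBox(PayloadBox* box) {
    if (box == nullptr) return;
    if (box->object != nullptr) {
        box->object->~Payload();
        box->object = nullptr;
    }
    std::free(box);
}
```

With this layout, releasing a box whose payload was never constructed only frees the memory, so the finaliser can be registered before the constructor is invoked.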

2: No.

Neither std::auto_ptr nor std::unique_ptr has anything to do with registering R external pointer finalisers, and their destructors are bypassed by Rf_error just like any others.

3: Yes.

As far as I have seen, it is considered best practice to cleanly separate the realms of C++ and R. Rcpp encourages the use of wrappers (https://stat.ethz.ch/pipermail/r-devel/2010-May/057387.html, cxxfunction in http://dirk.eddelbuettel.com/code/rcpp.html) so that C++ exceptions will not hit the R engine.
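A minimal sketch of such a wrapper, written by hand rather than taken from Rcpp, might look as follows. The names runGuarded, Tracker and entryPoint are hypothetical. The sketch is plain C++ without the R headers, so the place where a package would call Rf_error is only marked by a comment and the function returns a status code instead. It covers only the direction from C++ to R: an Rf_error raised inside the guarded body would still skip destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `body` inside a try block and, on failure,
// copies the message into a plain char buffer instead of letting the
// exception propagate to the caller.
template <typename Body>
bool runGuarded(Body body, char* message, std::size_t capacity) {
    try {
        body();
        return true;
    } catch (const std::exception& e) {
        std::snprintf(message, capacity, "%s", e.what());
    } catch (...) {
        std::snprintf(message, capacity, "%s", "unknown C++ exception");
    }
    return false;
}

// Hypothetical resource whose destructor must not be skipped.
struct Tracker {
    int& counter;
    explicit Tracker(int& c) : counter(c) { ++counter; }
    ~Tracker() { --counter; }
};

// Sketch of an entry point. Returns 0 on success and -1 on failure; in a
// package the failure branch would call Rf_error("%s", message) instead.
inline int entryPoint(bool fail, int& liveObjects,
                      char* message, std::size_t capacity) {
    const bool ok = runGuarded([&] {
        Tracker tracker(liveObjects);      // destroyed on every exit path
        std::string scratch(1000, 'x');    // ordinary C++ heap allocation
        if (fail) throw std::runtime_error("something went wrong");
    }, message, capacity);
    if (!ok) {
        // All C++ locals of the body have been destroyed by now. Only the
        // plain char buffer is still in use, so a longjmp from this point
        // would not skip any destructor.
        return -1;
    }
    return 0;
}
```

The design choice is that everything with a destructor lives inside the guarded body, while the code that talks to R afterwards handles only plain data.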

In my opinion, an allocator could be programmed to use R_Calloc and R_Free. However, to counter the effects of potential Rf_error during such calls, the allocator would require some interface to garbage collection. I imagine locally tying the allocator to a PROTECTed SEXP of type externalptr which has a finaliser registered by R_RegisterCFinalizerEx and points to a local memory manager which can free memory in case of Rf_error.
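To make the idea more concrete, here is a rough sketch of an allocator that delegates to a local memory manager. It is an illustration under assumptions, not a tested R idiom: std::malloc and std::free stand in for R_Calloc and R_Free so that the code compiles without the R headers, the names BlockRegistry and RegistryAllocator are hypothetical, and the external pointer with its finaliser is not shown. In a package, the finaliser would call releaseAll on the registry. Note that the registry's own bookkeeping (a std::set) still uses the ordinary C++ heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <set>
#include <vector>

// Hypothetical memory manager: remembers every block it hands out so that
// whatever is still outstanding can be released in one sweep.
class BlockRegistry {
    std::set<void*> blocks;
public:
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes ? bytes : 1);  // R_Calloc in a package
        if (p == nullptr) throw std::bad_alloc();
        try { blocks.insert(p); } catch (...) { std::free(p); throw; }
        return p;
    }
    // Frees a block only if it is still registered, so a late deallocate
    // after releaseAll() is harmless.
    void deallocate(void* p) {
        if (blocks.erase(p) > 0) std::free(p);     // R_Free in a package
    }
    std::size_t outstanding() const { return blocks.size(); }
    // What a finaliser would call after an R error cut the C++ code short.
    void releaseAll() {
        for (void* p : blocks) std::free(p);
        blocks.clear();
    }
    ~BlockRegistry() { releaseAll(); }
};

// Minimal C++11 allocator that forwards to a BlockRegistry.
template <typename T>
struct RegistryAllocator {
    using value_type = T;
    BlockRegistry* registry;
    explicit RegistryAllocator(BlockRegistry* r) noexcept : registry(r) {}
    template <typename U>
    RegistryAllocator(const RegistryAllocator<U>& other) noexcept
        : registry(other.registry) {}
    T* allocate(std::size_t n) {
        return static_cast<T*>(registry->allocate(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { registry->deallocate(p); }
};

template <typename T, typename U>
bool operator==(const RegistryAllocator<T>& a, const RegistryAllocator<U>& b) {
    return a.registry == b.registry;
}
template <typename T, typename U>
bool operator!=(const RegistryAllocator<T>& a, const RegistryAllocator<U>& b) {
    return !(a == b);
}
```

A container would then be declared as, for example, std::vector<int, RegistryAllocator<int>> and constructed with an allocator that points to the registry. During normal operation the container returns its memory through deallocate; releaseAll only matters for blocks that were orphaned because destructors did not run.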
