I would like to use my own memory allocation function for certain data structures (real-valued vectors and arrays) in R. The reason is that I need my data to be 64-bit aligned, and I would like to use the numa library to control which memory node is used (I'm working on compute nodes with four 12-core AMD Opteron 6174 CPUs).

Now I have two functions for allocating and freeing memory: numa_alloc_onnode and numa_free (courtesy of this thread). I'm using R version 3.1.1, so I have access to the function allocVector3 (src/main/memory.c), which seems to be the intended way of adding a custom memory allocator. I also found the struct R_allocator in src/include/R_ext

However it is not clear to me how to put these pieces together. Let's say, in R, I want the result res of an evaluation such as

res <- Y - mean(Y)

to be saved in a memory area allocated with my own function. How would I do this? Can I integrate allocVector3 directly at the R level? I assume I have to go through the R-C interface. As far as I know, I cannot just return a pointer to the allocated area, but have to pass the result as an argument. So in R I call something like

n <- length(Y)
res <- numeric(length=1)
.Call("R_allocate_using_myalloc", n, res)
res <- Y - mean(Y)

and in C

#include <R.h>
#include <Rinternals.h>
#include <numa.h>

SEXP R_allocate_using_myalloc(SEXP R_n, SEXP R_res){
  PROTECT(R_n = coerceVector(R_n, INTSXP));
  PROTECT(R_res = coerceVector(R_res, REALSXP));
  int *restrict n = INTEGER(R_n);

  R_allocator_t myAllocator;
  myAllocator.mem_alloc = numa_alloc_onnode;
  myAllocator.mem_free = numa_free;
  myAllocator.res = NULL;
  myAllocator.data = ???;

  R_res = allocVector3(REALSXP, n, myAllocator);

  UNPROTECT(2);
}

Unfortunately I cannot get beyond the compilation error "variable has incomplete type 'R_allocator_t'" (I had to remove the .data line since I have no clue what I should put there). Does any of the above code make sense? Is there an easier way of achieving what I want? It seems a bit odd to have to allocate a small vector in R and then change its location in C just to be able to both control the memory allocation and have the vector available in R...

I'm trying to avoid using Rcpp, as I'm modifying a fairly large package and do not want to convert all C calls and thought that mixing different C interfaces could perform sub-optimally.

Any help is greatly appreciated.

nbenn
  • That is a conjecture: _"I'm trying to avoid using Rcpp, as I'm modifying a fairly large package and do not want to convert all C calls and thought that mixing different C interfaces could perform sub-optimally."_ Please demonstrate empirically that Rcpp makes your code slower. – Dirk Eddelbuettel Oct 22 '14 at 15:13
  • I'm sorry, I didn't want to offend anyone, nor did I want to imply that using Rcpp in this case is in any way a bad idea. If anyone has an idea of how to solve my problem using Rcpp, I'll gladly try it out. Perhaps it would have been better to phrase the last section as: "I haven't looked at Rcpp because I'm modifying a fairly large package which doesn't use Rcpp." – nbenn Oct 22 '14 at 15:23
  • Change is incremental. You _could_ just add a single (new) function without requiring _any_ change to the rest of your package. – Dirk Eddelbuettel Oct 22 '14 at 15:28
  • I tried solving my problem with Rcpp, but got a `long vectors not supported yet` error. I'm using `Rcpp_0.11.3`. Am I doing something wrong or are long vectors actually not yet supported? – nbenn Nov 07 '14 at 16:12

2 Answers

I made some progress in solving my problem and I would like to share in case anyone else encounters a similar situation. Thanks to Kevin for his comment. I was missing the include statement he mentions. Unfortunately this was only one among many problems.

dyn.load("myAlloc.so")

size <- 3e9
myBigmat <- .Call("myAllocC", size)
print(object.size(myBigmat), units = "auto")

rm(myBigmat)

and in C

#include <stdlib.h>             /* malloc */
#include <R.h>
#include <Rinternals.h>
#include <R_ext/Rallocators.h>  /* defines the R_allocator_t struct */
#include <numa.h>

typedef struct allocator_data {
  size_t size;
} allocator_data;

void* my_alloc(R_allocator_t *allocator, size_t size) {
  ((allocator_data*)allocator->data)->size = size;
  return (void*) numa_alloc_local(size);
}

void my_free(R_allocator_t *allocator, void * addr) {
  size_t size = ((allocator_data*)allocator->data)->size;
  numa_free(addr, size);
}

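SEXP myAllocC(SEXP a) {
  /* Bookkeeping so that my_free knows the size to pass to numa_free.
     Note that nothing in this example frees either malloc'ed struct. */
  allocator_data* my_allocator_data = malloc(sizeof(allocator_data));
  my_allocator_data->size = 0;

  R_allocator_t* my_allocator = malloc(sizeof(R_allocator_t));
  my_allocator->mem_alloc = &my_alloc;
  my_allocator->mem_free = &my_free;
  my_allocator->res = NULL;
  my_allocator->data = my_allocator_data;

  /* The length is read as a double so that long-vector sizes survive. */
  R_xlen_t n = (R_xlen_t) asReal(a);
  SEXP result = PROTECT(allocVector3(REALSXP, n, my_allocator));
  UNPROTECT(1);
  return result;
}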

To compile the C code, I use R CMD SHLIB -std=c99 -L/usr/lib64 -lnuma myAlloc.c. As far as I can tell, this works fine. If anyone has improvements or corrections to offer, I'd be happy to include them.
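One possible variation on the size bookkeeping, offered only as a sketch: since numa_free needs the size of the block, the size can be stored in a small prefix in front of the block itself instead of in a separately malloc'ed struct. The names block_prefix, sized_alloc, sized_block_total and sized_free are hypothetical, and backend_alloc/backend_free are malloc-based stand-ins for numa_alloc_local/numa_free so that the example builds without libnuma. Whether this slots cleanly into the mem_alloc/mem_free callbacks is an assumption that has not been tested against R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Prefix stored in front of every block. The extra union members only pad
   the prefix so that the payload behind it stays suitably aligned. */
typedef union {
    size_t total;      /* full size of the underlying block, prefix included */
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} block_prefix;

/* Stand-ins (assumptions) for numa_alloc_local and numa_free. The sized
   free mirrors the shape of numa_free(addr, size). */
static void *backend_alloc(size_t size) { return malloc(size); }
static void backend_free(void *addr, size_t size) { (void) size; free(addr); }

/* Allocate `size` usable bytes and remember the total size in the prefix. */
static void *sized_alloc(size_t size) {
    size_t total = sizeof(block_prefix) + size;
    block_prefix *p = (block_prefix *) backend_alloc(total);
    if (p == NULL) return NULL;
    p->total = total;
    return p + 1;      /* caller sees the memory just past the prefix */
}

/* Recover the total block size from a pointer returned by sized_alloc. */
static size_t sized_block_total(const void *addr) {
    assert(addr != NULL);
    return ((const block_prefix *) addr - 1)->total;
}

/* Free a block from sized_alloc without the caller supplying its size. */
static void sized_free(void *addr) {
    if (addr == NULL) return;
    block_prefix *p = (block_prefix *) addr - 1;
    backend_free(p, p->total);
}
```

With this layout there is no separate bookkeeping struct left to leak, at the cost of a few extra bytes per block.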

One requirement from the original question that remains unresolved is the alignment issue. The block of memory returned by numa_alloc_local is correctly aligned, but other fields of the new VECTOR_SEXPREC (e.g. the sxpinfo_struct header) push back the start of the data array. Is it somehow possible to align this starting point (the address returned by REAL())?
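The pointer arithmetic behind this question can be sketched in plain C. The idea is to over-allocate and shift the start of the block so that the address just past a header of known size lands on the desired boundary. This is only an illustration: shifted_block, alloc_with_aligned_payload and free_shifted_block are hypothetical names, malloc stands in for the NUMA allocator, and hdr_size is an assumed parameter standing in for whatever R places in front of the data. Whether such a shift can be applied through R_allocator_t depends on R internals not covered here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: the raw block and the shifted start address. */
typedef struct {
    void *raw;   /* pointer returned by malloc, needed again for free */
    void *base;  /* address where the header would start */
} shifted_block;

/* Return a block such that (base + hdr_size) is a multiple of `align`.
   `align` must be a power of two. On failure both pointers are NULL. */
static shifted_block alloc_with_aligned_payload(size_t hdr_size,
                                                size_t payload_size,
                                                size_t align) {
    shifted_block b = { NULL, NULL };
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Over-allocate by align - 1 bytes so that the shift always fits. */
    void *raw = malloc(hdr_size + payload_size + align - 1);
    if (raw == NULL) return b;

    /* Round the unshifted payload address up to the next boundary. */
    uintptr_t payload = (uintptr_t) raw + hdr_size;
    uintptr_t aligned = (payload + align - 1) & ~(uintptr_t) (align - 1);

    b.raw = raw;
    b.base = (void *) (aligned - hdr_size);
    return b;
}

/* Release the block using the original, unshifted pointer. */
static void free_shifted_block(shifted_block b) {
    free(b.raw);
}
```

The caller has to keep the raw pointer around, because the shifted address is not the one the underlying allocator handed out.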

nbenn

R has, in memory.c:

main/memory.c
84:#include <R_ext/Rallocators.h> /* for R_allocator_t structure */

so I think you need to include that header as well to get the custom allocator (Rinternals.h merely declares it, without defining the struct or including that header).
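The difference between declaring and defining a struct, which is what produces an "incomplete type" error, can be shown in a few lines of plain C. The names demo_allocator, uses_pointer_only and uses_definition are hypothetical and only mimic the situation; this is not code from R.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration only: the type is incomplete at this point, much like
   R_allocator_t when only the declaring header has been included. */
typedef struct demo_allocator demo_allocator_t;

/* Pointers to an incomplete type are fine, so prototypes that take a
   demo_allocator_t* compile without the definition. */
static int uses_pointer_only(const demo_allocator_t *a) {
    return a != NULL;
}

/* Writing `demo_allocator_t x;` or `x.data` here would fail with an
   "incomplete type" error, because size and fields are still unknown. */

/* Definition: in the real case the extra header supplies this part. */
struct demo_allocator {
    void *data;
};

/* After the definition, objects of the type and field access compile. */
static int uses_definition(void) {
    demo_allocator_t a;
    a.data = NULL;
    return uses_pointer_only(&a) && a.data == NULL;
}
```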

Kevin Ushey