6

Below is part of a C code I wrote. The function foo is to be called in R. The code keeps causing R to crash, and I narrowed down the problem to this outer() function, which is used to compute outer sum or difference. Note the part that is commented out: If I do not comment it out, the function will lead R to crash if each of the arrays contains, say, over 1000 data points. If I comment it out, I can compute outer sum/difference for significantly longer arrays with no problem (e.g, over 100000 data points per array). I wonder what the problem is... Thank you!

#include <R.h>
#include <Rmath.h>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

void outer(double *x1, double *x2, int *n, int operation, double *output){
int i, j;
if(operation==1){
    for(i=0; i<*n; i++){
        for(j=0; j<*n; j++){
            output[(*n)*i+j]=x1[j]+x2[i];
        }
    }
} else if(operation==2){
    for(i=0; i<*n; i++){
        for(j=0; j<*n; j++){
            output[(*n)*i+j]=x1[j]-x2[i];
            //Rprintf("%d ", (*n)*i+j); //<-----------HERE
        }
    }
} 
}


void foo(double *x, double *y, int *npred, int *nsamp){
int oper=2;
double xouter[*nsamp], youter[*nsamp];
double outer_temp_x[(*nsamp)*(*nsamp)], outer_temp_y[(*nsamp)*(*nsamp)];

outer(x, x, nsamp, oper, &outer_temp_x[0]);
outer(y, y, nsamp, oper, &outer_temp_y[0]);

}

//After compiling the code, I use the code below in R to call the function:

dyn.load("foo.so")
x=as.matrix(rnorm(10000))
y=rlnorm(10000)

invisible(.C("foo", 
             x=as.double(as.vector(x)), 
             y=as.double(y), 
             npred=as.integer(ncol(x)), 
             nsamp=as.integer(length(y))
          )
Alex
  • 4,030
  • 8
  • 40
  • 62
  • This crashes R for me, with the `Rprintf` commented out. – Matthew Lundberg May 07 '13 at 05:03
  • Uh. That's really weird. I tried it many times, and it did not crash R when Rprintf was commented out. Let me try it again.. – Alex May 07 '13 at 05:04
  • Just tried it again. It worked with no problem. Really weird. – Alex May 07 '13 at 05:06
  • @MatthewLundberg: what was the size of your arrays when it crashed R? – Alex May 07 '13 at 05:08
  • 10000, as in your example. – Matthew Lundberg May 07 '13 at 05:08
  • @jdigital: Apparently Matthew's R crashed even with `Rprintf` commented out.And I tried `"%d \n", R still crashed. – Alex May 07 '13 at 05:11
  • It crashes for me (with the `Rprintf` commented out) if I give it length 724 or above. That will allocate 8M for those arrays. It does not crash (at length 10000) with the arrays allocated on the heap. This is Linux 64-bit. I suspect that you have a 16M stack size and you're overrunning it, causing "undefined behavior." – Matthew Lundberg May 07 '13 at 05:14
  • Is there any way to debug it? I have a Mac (2011 MacBook Air). The crash report is esoteric at best... – Alex May 07 '13 at 05:21
  • There's always a way to debug it. Build `foo.so` without optimization and with debug info, attach to R with the debugger, then call the function. The fault should break into the debugger instead of crashing. – Matthew Lundberg May 07 '13 at 05:25

1 Answers1

7

I think it is overunning the stack and causing trouble.

Try this:

void foo(double *x, double *y, int *npred, int *nsamp){
  int oper=2;
  double xouter[*nsamp], youter[*nsamp];

  // The prior code allocated on the stack.  Here, we make a pair of calls
  // to 'malloc' to allocate memory for the arrays.  This gets memory from
  // the heap.  The stack is fairly limited, but the heap is huge.
  // 'malloc' returns a pointer to the allocated memory.

  double* outer_temp_x=malloc(sizeof(double)*(*nsamp)*(*nsamp));
  double* outer_temp_y=malloc(sizeof(double)*(*nsamp)*(*nsamp));

  outer(x, x, nsamp, oper, &outer_temp_x[0]);
  outer(y, y, nsamp, oper, &outer_temp_y[0]);

  // The downside of allocating on the heap, is that you must release the
  // memory at some point.  Otherwise you have what's called a "memory leak."
  // 'free' is the function to free the memory, and it is called on the
  // pointer value returned by 'malloc'.

  free(outer_temp_x);
  free(outer_temp_y);
}
Matthew Lundberg
  • 42,009
  • 6
  • 90
  • 112
  • Did you only add those lines, or also add the calls to `malloc` ? – Matthew Lundberg May 07 '13 at 05:28
  • Could you explain `double* outer_temp_x=malloc(sizeof(double)*(*nsamp)*(*nsamp))` and `double* outer_temp_y=malloc(sizeof(double)*(*nsamp)*(*nsamp))`? Your code worked. Thanks!! – Alex May 07 '13 at 05:30
  • 1
    [Writing R Extensions](http://cran.r-project.org/doc/manuals/R-exts.html#Memory-allocation) points to use of `R_alloc` for dynamic memory allocation (automatically retrieved on return to R, no need to `free`) and to `Calloc` / `Free` for consistency across platforms. – Martin Morgan May 07 '13 at 13:40
  • Thank you, @MartinMorgan. Here these are not being returned (those are clearly temporaries) so it isn't an issue. – Matthew Lundberg May 07 '13 at 13:42