2

I recently converted some slow python code into a C extension. It works beautifully, except that it generates a segfault at the 162nd call into it, right at the return statement.

Here's how it works. Once, prior to all of the calls to the function I want to compute, I load the data into memory (remembering to INCREF the parent object):

static PyObject *grm_loadDosage(PyObject *self, PyObject *args) {
 /***
 Load the dosage matrix into global memory
 Global variables: DOSAGES_PYMAT - will be a pointer to the PyArrayObject of the dosage array (must incref this)
                   DOSAGES - will be a pointer to the double array underlying DOSAGES_PYMAT
 ***/
 PyArrayObject *dosageMatrixArr;
 if ( ! PyArg_ParseTuple(args,"O",&DOSAGES_PYMAT_OBJ) ) return NULL;
 if ( NULL == DOSAGES_PYMAT_OBJ ) return NULL;

 Py_INCREF(DOSAGES_PYMAT_OBJ);

 /* from PyObject to Python Array */
 dosageMatrixArr = pymatrix(DOSAGES_PYMAT_OBJ);
 /* get the row and col sizes */
 N_VARIANTS = dosageMatrixArr->dimensions[0];
 N_SAMPLES = dosageMatrixArr->dimensions[1];
 DOSAGES = pymatrix_to_Carray(dosageMatrixArr);
 Py_RETURN_TRUE;
}

(see http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays for the C-array method). I then reference the loaded double[][], DOSAGES in the function I will call from python:

static PyObject *grm_calcdistance(PyObject *self, PyObject *args) {
 /** Given column indeces (samples) of DOSAGES, and an array of row indeces (the variants missing for one or both),
     calculate the distance **/
 int samI,samJ,nMissing, *missing;
 PyObject *missingObj;
 PyArrayObject *missingArr;
 printf("debug1\n");
 if ( ! PyArg_ParseTuple(args,"iiOi",&samI,&samJ,&missingObj,&nMissing) ) return NULL;
 if ( NULL == missingObj ) return NULL;
 missingArr = pyvector(missingObj);
 missing = pyvector_to_Carray(missingArr);
 double replaced1[nMissing];
 double replaced2[nMissing];
 printf("debug2\n");

 int missingVectorIdx;
 int missingVariantIdx;
 // for each sample, store the dosage at the missing site (as it could be missing
 // in the OTHER sample), and replace it with 0.0 in the dosage matrix
 for ( missingVectorIdx = 0; missingVectorIdx < nMissing; missingVectorIdx++ ) {
  printf("debugA: %d < %d\n",missingVectorIdx,nMissing);
  missingVariantIdx = missing[missingVectorIdx];
  replaced1[missingVariantIdx] = DOSAGES[missingVariantIdx][samI];
  replaced2[missingVariantIdx] = DOSAGES[missingVariantIdx][samJ];
  printf("debugB\n");
  DOSAGES[missingVariantIdx][samI]=0.0;
  DOSAGES[missingVariantIdx][samJ]=0.0;
 }

 // calculate the distance (this uses DOSAGES which we just modified)
 double distance = _calcDistance(samI,samJ);

 printf("debug3\n");
 // put the values that we replaced with 0.0 back into the matrix
 for ( missingVectorIdx = 0; missingVectorIdx < nMissing; missingVectorIdx++ ) {
  missingVariantIdx = missing[missingVectorIdx];
  DOSAGES[missingVariantIdx][samI] = replaced1[missingVariantIdx];
  DOSAGES[missingVariantIdx][samJ] = replaced2[missingVariantIdx];
 }
 printf("debug4: %f\n",distance);
 // grab the python object wrapper and return it
 PyObject * distPy = PyFloat_FromDouble((double)distance);
 printf("debug5\n");
 if ( NULL == distPy )
  printf("and is NULL\n");
 return distPy;

}

With tons of debug statements (that you see), I've localized the segfault to the return statement. That is, after the instantiation of the python float object, but somewhere between calling return from C, and the next executed line of python (which is, you guessed it, a print("debugReturned"). What I see in stdout is:

debug4: -0.025160
debug5
Segmentation fault

So the double is not a strange value, the python object is properly created, it's not NULL, but there's some error between returning from C, and continuing in python. Sources online suggest that this might be a INCREF/DECREF problem, but also state that PyFloat_FromDouble() and Py_BuildValue("f",double) generate new references, so don't need to be INCREF'd. Both choices generate this same result. While I'm reasonably certain that I need to INCREF the PyObject holding my matrix during my grm_loadDosage function, I've tried both with and without that INCREF, with the same behavior.

Any ideas what's going on?

Thanks

Also, a stacktrace:

#0  0x0000000000000000 in ?? ()
#1  0x000000000045aa5c in PyEval_EvalFrameEx (f=0x2aaae1ae3f60, throwflag=<value optimized out>) at Python/ceval.c:2515
#2  0x000000000045ecb4 in call_function (f=0x3fb7494970227c55, throwflag=<value optimized out>) at Python/ceval.c:4009
#3  PyEval_EvalFrameEx (f=0x3fb7494970227c55, throwflag=<value optimized out>) at Python/ceval.c:2692
#4  0x000000000045ecb4 in call_function (f=0x95c880, throwflag=<value optimized out>) at Python/ceval.c:4009
#5  PyEval_EvalFrameEx (f=0x95c880, throwflag=<value optimized out>) at Python/ceval.c:2692
#6  0x000000000045f626 in PyEval_EvalCodeEx (_co=0x98abe0, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0,
    closure=0x0) at Python/ceval.c:3350
#7  0x000000000045f74b in PyEval_EvalCode (co=0x146b098, globals=0x71, locals=0xc7) at Python/ceval.c:767
#8  0x0000000000482fab in run_mod (fp=0x881b80, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", start=<value optimized out>,
    globals=0x81e340, locals=0x81e340, closeit=1, flags=0x7fffffffbfd0) at Python/pythonrun.c:1783
#9  PyRun_FileExFlags (fp=0x881b80, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", start=<value optimized out>, globals=0x81e340,
    locals=0x81e340, closeit=1, flags=0x7fffffffbfd0) at Python/pythonrun.c:1740
#10 0x0000000000483268 in PyRun_SimpleFileExFlags (fp=<value optimized out>, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", closeit=1,
    flags=0x7fffffffbfd0) at Python/pythonrun.c:1265
#11 0x00000000004964d7 in run_file (argc=<value optimized out>, argv=0x7df010) at Modules/main.c:297
#12 Py_Main (argc=<value optimized out>, argv=0x7df010) at Modules/main.c:692
#13 0x000000000041563e in main (argc=11, argv=0x7fffffffc148) at ./Modules/python.c:59
  • 1
    I don't see anything obviously wrong. You should help us reproduce the problem locally by posting the complete C source and a Python example. – Armin Rigo Oct 26 '12 at 08:21

1 Answers1

1

I would recommend to try and run valgrind against your code, see How can I use valgrind with Python C++ extensions? for how to do that with python, I am not sure how useful that exeoption list is for python 3, though. Anyways, ignore all outputs that come from parts without any of your files part of it.

If you are using Windows I would recommend one of these Is there a good Valgrind substitute for Windows?

The Debugging that you did tells me that the error occurs after your function returns. I don't see why the return statement itself should be problematic. The source of the error according to your stacktrace lies in some code of python-3 itself. I Assume that there is no bug in python-3 itself. You could try to install a different version of python-3 to exclude this even further. That is why I assume that you corrpted the stack or heap, that is what valgrind comes in handy for.

Community
  • 1
  • 1
H. Brandsmeier
  • 957
  • 5
  • 19