Summary: I am getting a segfault when calling an embedded Python script, which somehow only happens when the script is called the second time around.

Hello there! I am quite a beginner, and this is my first time posting a question on Stack Overflow, so please do correct me if I make mistakes.

I am trying to call a Python script from C++ that returns a tuple of data. My C++ program is currently able to parse the tuple into its components (two integers, "ultra_dist" and "gas_density"). I can call getUltraNGasData() once, and there is no problem in collecting the data.

The problem seems to arise when I attempt to call the function the second time from within the loop in the main() function. The program crashes and I get an exception: "unhandled Signal 11, program terminating".

Here is the C++ program (abridged, not full). This should compile:

#include <cstdio>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <pthread.h>
#include <Python.h>
#include <stdlib.h>

void getUltraNGasData(int* ultra_dist, int* gas_density){
    // Initialize the Python interpreter.
    Py_Initialize();
    // Create some Python objects that will later be assigned values.
    PyObject *pName, *pModule, *pDict, *pFunc;
    // Set Python path
    PyRun_SimpleString("import sys");
    PyRun_SimpleString("sys.path.append('/home/pi/Desktop')");

    // Convert the file name to a Python string.
    pName = PyString_FromString("GetData_Ultra+Gas");
    if(pName == NULL){
        printf("Misspelled script name?\n");
        exit(-1);
    }

    // Import the file as a Python module.
    pModule = PyImport_Import(pName);
    if(pModule == NULL){
        printf("No Python script found!\n");
        exit(-1);
    }

    // Create a dictionary for the contents of the module.
    pDict = PyModule_GetDict(pModule);
    // Get the main function from the dictionary.
    pFunc = PyDict_GetItemString(pDict, "main");

    Py_INCREF(pName);
    Py_INCREF(pModule);
    Py_INCREF(pDict);
    Py_INCREF(pFunc);

    // Call the function with the arguments.
    PyObject* pResult = PyObject_CallObject(pFunc, NULL);
    // Print a message if calling the method failed.
    if(pResult == NULL)   printf("Method failed.\n");

    //Parse values
    PyObject* tupleItem1 = PyTuple_GetItem(pResult,0);
    if(tupleItem1 == NULL){
        printf("No ultrasonic distance return value from Python script.\n");
        exit(-2);
    }else{
        *ultra_dist=(int) PyInt_AsLong(tupleItem1);
    }
    PyObject* tupleItem2 = PyTuple_GetItem(pResult,1);
    if(tupleItem2 == NULL){
        printf("No gas sensor measurement return value from Python script.\n");
        exit(-2);
    }else{
        *gas_density=(int) PyInt_AsLong(tupleItem2);
    }


    // Release the Python objects, then destroy the Python interpreter.
    Py_CLEAR(pModule);
    Py_CLEAR(pName);
    Py_CLEAR(pDict);
    Py_CLEAR(pFunc);
    Py_CLEAR(pResult);
    Py_CLEAR(tupleItem1);
    Py_CLEAR(tupleItem2);
    Py_Finalize();

    printf("Ultra_dist = %d, Gas_density = %d\n",*ultra_dist,*gas_density);
}


int main(){
    int ultra_dist=0;       //Ultrasonic measured range (in cm)
    int gas_density=0;      //Estimate of gas density (%)

    while(1){
        getUltraNGasData(&ultra_dist,&gas_density); //Get measurements from gas and ultrasonic sensors by calling Python script
    }
}

And here is the full Python script being called, which reads data from an ADC using bitbanging of SPI. I would abridge this to just returning two made-up values, but it seems that the error disappears if I do so - meaning it is somehow related to the whole of this Python script:

#!/usr/bin/env python2.7
# adapted from script by Alex Eames http://RasPi.tv
import time
import os
import subprocess
import smtplib
import string
import RPi.GPIO as GPIO
from time import gmtime, strftime

GPIO.setmode(GPIO.BCM)

#DEFINITIONS
#ADC channels
ULTRA_ADC_CH = 0
GAS_ADC_CH = 1

#Other definitions
adc_ch = [ULTRA_ADC_CH,GAS_ADC_CH] # Particular ADC channel to read
reps = 10 # how many times to take each measurement for averaging
time_between_readings = 1 # seconds between clusters of readings
V_REF = 5.0 #reference voltage
ultra_conv_factor=1./(0.0098/2.54)      #Ultrasonic factor: cm/mV


# Define Pins/Ports
SPICLK = 22           # FOUR SPI ports on the ADC 
SPIMISO = 5
SPIMOSI = 6
SPICS = 26
SPICS2 = 16


# read SPI data from MCP3002 chip, 2 possible adc channels (0 & 1)
# this uses a bitbang method rather than Pi hardware spi
def readadc(adc_ch, clockpin, mosipin, misopin, cspin):
    if ((adc_ch > 1) or (adc_ch < 0)):
        return -1
    if (adc_ch == 0):
        commandout = 0x6
    else:
        commandout = 0x7

    GPIO.output(cspin, True)

    GPIO.output(clockpin, False)  # start clock low
    GPIO.output(cspin, False)     # bring CS low

    commandout <<= 5    # we only need to send 3 bits here
    for i in range(3):
        if (commandout & 0x80):
            GPIO.output(mosipin, True)
        else:   
            GPIO.output(mosipin, False)
        commandout <<= 1
        GPIO.output(clockpin, True)
        GPIO.output(clockpin, False)

    adcout = 0
    # read in one empty bit, one null bit and 10 ADC bits
    for i in range(12):
        GPIO.output(clockpin, True)
        GPIO.output(clockpin, False)
        adcout <<= 1
        if (GPIO.input(misopin)):
            adcout |= 0x1

    GPIO.output(cspin, True)

    adcout /= 2       # first bit is 'null' so drop it
    return adcout



#MAIN FUNCTION: to get ultrasonic and gas sensor data

def main():

    reps=10
    #Set up ports
    GPIO.setup(SPIMOSI, GPIO.OUT)       # set up the SPI interface pins
    GPIO.setup(SPIMISO, GPIO.IN)
    GPIO.setup(SPICLK, GPIO.OUT)
    GPIO.setup(SPICS, GPIO.OUT)

    try:
        for adc_channel in adc_ch:      #adc_ch is the channel number
            adctot = 0
            # read the analog value
            for i in range(reps):       #Read same ADC repeatedly for # REPS
                read_adc = readadc(adc_channel, SPICLK, SPIMOSI, SPIMISO, SPICS)
                adctot += read_adc
                #time.sleep(0.01)            #Minimum 11.5us limit for acquisition & conversion
            read_adc = adctot / reps / 1.0 # Take average value
            # print read_adc

            # convert analog reading to Volts = ADC * ( V_REF / 1024 )
            volts = read_adc * ( V_REF / 1024.0)
            # convert voltage to measurement
            if (adc_channel==ULTRA_ADC_CH):
                ultra_dist = volts * ultra_conv_factor
                if ultra_dist < 50:         # Filtering to reduce effect of noise below 50cm
                    reps = 100
                else:
                    reps = 10
                #print "\nUltrasonic distance: %d" %ultra_dist
            elif (adc_channel==GAS_ADC_CH):
                gas_val_percent = volts / V_REF *100
                #print "Gas density: %d\n" %gas_val_percent

        #time.sleep(time_between_readings)

    except KeyboardInterrupt:             # trap a CTRL+C keyboard interrupt
        GPIO.cleanup()
    GPIO.cleanup()

    return (ultra_dist, gas_val_percent)

# To call main function
if __name__ == '__main__':
    main()

Sorry for the very long code block! (You can ignore the functionality of the SPI, though - it is working and is not relevant to the issue.)

From what I've found, Signal 11 is just a segfault, so it doesn't tell me much about the problem.

I originally guessed that the issue has to do with memory, which is why I added the Py_INCREF and Py_CLEAR lines when using the Python C API. When I comment out the Py_CLEAR calls, the segfault shifts from the PyImport_Import() line to the PyObject_CallObject() line. (I know this from setting breakpoints before and after, since backtracing in gdb didn't seem to be very informative.) However, I have no idea what the root cause may be; I only have scattered clues, such as the length of the called script and this reference counting.
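
To make the reference-count bookkeeping described above easier to audit, here is a minimal sketch in plain Python (it does not call the C API). For each pointer in getUltraNGasData() it tallies the reference handed over by the API call, plus the explicit Py_INCREF, minus the Py_CLEAR. The new/borrowed classification follows the Python 2 C API documentation; the table layout and the helper name `balance` are illustrative assumptions, and whether the imbalances it reports are the actual cause of the segfault is not established here.

```python
# Bookkeeping sketch only: it models ownership, it does not call the C API.
# 1 = the call returns a new reference (the caller owns it),
# 0 = the call returns a borrowed reference (the caller owns nothing).
OWNED_FROM_API = {
    "pName": 1,       # PyString_FromString   -> new reference
    "pModule": 1,     # PyImport_Import       -> new reference
    "pDict": 0,       # PyModule_GetDict      -> borrowed reference
    "pFunc": 0,       # PyDict_GetItemString  -> borrowed reference
    "pResult": 1,     # PyObject_CallObject   -> new reference
    "tupleItem1": 0,  # PyTuple_GetItem       -> borrowed reference
    "tupleItem2": 0,  # PyTuple_GetItem       -> borrowed reference
}
INCREFS = {"pName", "pModule", "pDict", "pFunc"}  # explicit Py_INCREF calls
CLEARS = set(OWNED_FROM_API)                      # every pointer gets Py_CLEAR


def balance(name):
    """References still owned when the function returns (0 means balanced)."""
    return OWNED_FROM_API[name] + (name in INCREFS) - (name in CLEARS)


for name in OWNED_FROM_API:
    b = balance(name)
    verdict = "balanced" if b == 0 else ("leaked" if b > 0 else "over-released")
    print(f"{name:<11} {b:+d}  {verdict}")
```

Under this tally, pName and pModule each keep one extra reference, while tupleItem1 and tupleItem2 are released without ever having been owned (and after pResult, the tuple that holds them, has already been cleared).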

I have looked through SO for similar segfault cases (e.g. https://stackoverflow.com/questions/16207457/pyobject-segfault-on-function-call#=), but the causes seem to be specific to each program. Plus, the API documentation says little about debugging. This problem has been bothering me for several hours now, so I would appreciate any help I can get.

Thank you!

(FYI: I am running this on a Raspberry Pi using a modified OS image (Erle "Frambuesa"). However, I can certainly check whether I get the same issue on Raspbian, in case it makes any difference.)

alokd
  • The right tool to solve such problems is your debugger. You should step through your code line-by-line *before* asking on Stack Overflow. For more help, please read [How to debug small programs (by Eric Lippert)](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/). At a minimum, you should [edit] your question to include a [Minimal, Complete, and Verifiable](http://stackoverflow.com/help/mcve) example that reproduces your problem, along with the observations you made in the debugger. – πάντα ῥεῖ Mar 13 '17 at 08:14
  • Please learn how to use a debugger, and how to run your program from inside one. If you run a debug-build of your program inside a debugger, the debugger will catch the crash "in action" stopping at the location of the crash. From there you will be able to locate where in your code it happens, and also be able to examine variables and their values to verify that they are okay. – Some programmer dude Mar 13 '17 at 08:15
  • After `if(pResult==NULL)` you still construct `PyObject* tupleItem1 = PyTuple_GetItem(pResult,0);` so maybe it has something to do with your segmentation fault. But clearly the best way would be to use a debugger to find out why. – Venom Mar 13 '17 at 08:20
  • Your library is static, not a service or app. Call the source as a library (ctypes); of course, that requires converting it to a `.so` file. – dsgdfg Mar 13 '17 at 08:26
  • a segfault is pretty much the same as NullPointerException, whenever those occurs I suggest looking at every line with the format `someobj.somefieldormethod` or in C++ `somepointer->somefieldormethod` and think "could someobj be null here ? why ? " – niceman Mar 13 '17 at 08:32
  • @πάνταῥεῖ OP did use the debugger :) – niceman Mar 13 '17 at 08:36
  • hmmm I'm not much expert in python but `Py_Finalize()` should be called in the end of main not inside the loop I guess – niceman Mar 13 '17 at 08:43
  • and `Py_Initialize` should be the first statement in `main` instead of calling it inside the loop – niceman Mar 13 '17 at 08:44
  • Can you please try running the program after replacing your python script and tell us if you still get the segfault: `#!/usr/bin/env python2.7 def main(): return (1, 2)` – cfromme Mar 13 '17 at 08:50
  • Thank you, all, for the comments! I apologize that I was not clear earlier, but I actually had tried with the debugger first, which is how I originally found the location of the segfault. My problem was that I was unable to interpret the results of the backtrace, and that after attempting to view the contents of variables at the point of the segfault (info locals) from a core dump, I kept getting an error of there being no symbol table available. – alokd Mar 13 '17 at 09:47
  • As well, sorry for having posted the full python script instead of just the basic return (as cfromme says). It just so happened that the problem was only reproducible with the full script, as I had mentioned. @niceman: I tried moving the Initialize and Finalize statements out of the loop and found that the segmentation fault became a NullPointerException. Accordingly it was caught by the program (pResult=NULL). Most helpful, thank you! I think I may be able to take it from here, but will let you know if there are any notable updates/issues remaining. – alokd Mar 13 '17 at 09:48
  • @cfromme: I restarted my machine and then tried what you suggested again. This time the program runs for ~30 iterations of the while loop until the following is returned. "Fatal Python error: non-string found in code slot". I have yet to look into this as I am still testing with the original script, but can try afterwards. – alokd Mar 13 '17 at 09:48
  • @alokd I believe the error is caused by something in the python script, because the code runs fine with above script here. – cfromme Mar 13 '17 at 09:59
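
One possible reading of the `pResult=NULL` result reported in the comments, sketched in Python with a stand-in for the GPIO module so that it runs without hardware: once the interpreter is initialized only once, the script is imported only once, so the module-level `GPIO.setmode(GPIO.BCM)` runs a single time, while `main()` calls `GPIO.cleanup()` on every call. The `FakeGPIO` class and `main_fixed` are hypothetical, and the assumption that `cleanup()` also resets the numbering mode (so that a later `setup()` raises) should be checked against the installed RPi.GPIO version.

```python
class FakeGPIO:
    """Hypothetical stand-in for RPi.GPIO; it models only the numbering mode."""
    BCM, OUT = "BCM", "OUT"

    def __init__(self):
        self.mode = None

    def setmode(self, mode):
        self.mode = mode

    def setup(self, pin, direction):
        if self.mode is None:
            raise RuntimeError("numbering mode not set")

    def cleanup(self):
        # Assumption: cleanup() also forgets the numbering mode.
        self.mode = None


GPIO = FakeGPIO()
GPIO.setmode(GPIO.BCM)  # module level: runs once per import


def main():
    """Same call order as the script: setup, (read), cleanup."""
    GPIO.setup(6, GPIO.OUT)
    GPIO.cleanup()
    return (1, 2)  # made-up readings


def main_fixed():
    """Variant that sets the numbering mode on every call."""
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(6, GPIO.OUT)
    GPIO.cleanup()
    return (1, 2)  # made-up readings


first = main()  # works: the mode was set at import time
try:
    main()      # fails in this model: cleanup() reset the mode
    second_call = "ok"
except RuntimeError as exc:
    second_call = "RuntimeError: %s" % exc
print(first, second_call)
```

In an embedding program, an uncaught Python exception like this makes PyObject_CallObject() return NULL; calling PyErr_Print() at that point prints the traceback and shows which line of the script raised.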

0 Answers