Context and formulation of a question

Hello everyone. I have a C++ program that calls a Python script multiple times in a loop. Everything works fine unless I add Py_Finalize() at the end of the function. In that case, on the second iteration of the loop, the program tries to extract a value from the PyObject* result variable, which for some reason is nullptr in the debugger at that point, and then terminates.

Py_REFCNT() gives consistent results for all objects across multiple executions of the function in the loop. By that I mean that the reference count of every PyObject stays the same in each execution. You can observe this by uncommenting the std::cout lines in get_play_mrl(). However, if you uncomment Py_Finalize(), then on the second iteration PyUnicode_AsWideCharString() tries to access result, which at that moment has no references and is nullptr, and the program crashes.

Question: why does Py_Finalize() cause this? How does result become nullptr if it is declared in the same function?

Side question 1: Can something bad happen if I call Py_Initialize() in a loop without matching each with Py_Finalize()?

Side question 2: I've noticed that some objects, after being declared in code only once, have more than a single reference. Why does this happen? Is Py_SET_REFCNT() the best way to dereference objects in such cases?

Python script that is being called from C++ code

#!/usr/bin/python3

import pafy
import youtube_dl

def get_play_url(url):
    song = pafy.new(url)
    duration = song.length
    audiostreams = song.audiostreams
    best = song.getbest()
    play_url = best.url

    return play_url

C++ code

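// all commented lines besides one are used for debugging purposes

#define PY_SSIZE_T_CLEAN
#include <Python.h>

#include <iostream>
#include <string>
#include <vector>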

std::string get_play_mrl (
    const std::string mrl
    )
{
    Py_Initialize();
    
    PyObject* py_module_name = PyUnicode_FromString((char*)"get_str");     
    PyObject* py_module      = PyImport_Import( py_module_name );  

    if (!py_module)
    {
        std::cout << "Error importing module.\n";
        return "Error";
    }

    PyObject* function = PyObject_GetAttrString(py_module, (char*)"get_play_url");  
    
    const char* mrl_c = mrl.c_str();                                                
    PyObject* args = PyTuple_Pack(1, PyUnicode_FromString(mrl_c));       
    
    PyObject* result = PyObject_CallObject(function, args);                         
    
    //std::cout << "Module: " << Py_REFCNT(py_module) << "\nFunction: " << Py_REFCNT(function) << "\nResult: " << Py_REFCNT(result) << "\n"; 
    Py_ssize_t* size = nullptr; 
    wchar_t* play_mrl_wchar = PyUnicode_AsWideCharString(result, size);

    std::wstring ws( play_mrl_wchar );
    std::string play_mrl_str( ws.begin(), ws.end() );
    delete [] play_mrl_wchar; // btw am I freeing memory in correct way here? 
    
    Py_XDECREF(py_module_name);
    Py_XDECREF(py_module);
    Py_XDECREF(function);
    Py_XDECREF(args);
    Py_XDECREF(result);

    // uncommenting this will result in program crash at the second iteration of the loop
    //Py_Finalize(); 
    
    //std::cout << "Module: " << Py_REFCNT(py_module) << "\nFunction: " << Py_REFCNT(function) << "\nResult: " << Py_REFCNT(result) << "\n";
    
    return play_mrl_str;
}

int main()
{
    std::vector<std::string> url_vec = {/* Some links to short YouTube videos */};    

    for (auto a : url_vec)
    {
        std::string mrl = get_play_mrl(a);
        // doing something with mrl...
    }
    
    return 0;
}

CMake file for building a project in case someone needs it

cmake_minimum_required(VERSION 3.18)

project(
    py_to_cxx
        LANGUAGES CXX C 
        )

set(PYTHON_HEADERS /usr/include/python3.9 )
set(PYTHON_STATIC  /usr/lib64             )

add_library(python_embeded SHARED IMPORTED GLOBAL)
set_target_properties(python_embeded PROPERTIES
    IMPORTED_LOCATION ${PYTHON_STATIC}/libpython3.9.so
    INTERFACE_INCLUDE_DIRECTORIES ${PYTHON_HEADERS}
    LINKER_LANGUAGE C
    )

add_executable(py_to_cxx main.cpp)
target_link_libraries(py_to_cxx PRIVATE python_embeded)
trofchik
  • 1
    Re: "am I freeing memory in correct way here?". Answer is no. See the [docs](https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_AsWideCharString) -- use [`PyMem_Free`](https://docs.python.org/3/c-api/memory.html#c.PyMem_Free) instead of `delete[]`. Could be why Py_Finalize is causing problems. – Dunes May 30 '21 at 14:25
  • 1
    Are you experimenting with using a Python interpreter in C++, or do you just want to call a Python function from C++? If the latter, it might be easier to create a child process with args something like `python3 -c 'import get_str; print(get_str.get_play_url("%s"))'` and then just check the return value and capture stdout. – Dunes May 30 '21 at 14:32
  • 1
    Lots of extension modules don't work right if `Py_Finalize` is called more than once (Numpy is one of them) - this is in [the documentation for `Py_Finalize`](https://docs.python.org/3.5/c-api/init.html#c.Py_Finalize). I can't find a good duplicate right now but it's a fairly common issue and the solution is to call it only once – DavidW May 31 '21 at 19:16
  • 1
    https://stackoverflow.com/questions/10206833/embedding-python-and-running-for-multiple-times https://stackoverflow.com/questions/7676314/py-initialize-py-finalize-not-working-twice-with-numpy for example – DavidW May 31 '21 at 19:18
  • 1
    For "Side question 1" - calling `Py_Initialize` multiple times is a no-op and is fine (it's in the documentation). For "side question 2" multiple places can and will refer to the same object. You should not call `Py_SET_REFCNT` because you will end up freeing an object that someone else is using. Just `Py_DECREF` them as normal and they will be cleaned up when no references remain. – DavidW May 31 '21 at 19:22
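
The child-process route that Dunes mentions can be sketched roughly as follows. This is only an illustration under a few assumptions: it is POSIX-only (it relies on popen()/pclose()), it expects `python3` and the `get_str` module to be reachable from the working directory, and the helper names `shell_quote`, `run_and_capture` and `get_play_mrl_subprocess` are made up for this example. Unlike the command in the comment, the URL is passed through `sys.argv` and shell-quoted instead of being formatted into the Python source, so that an unusual URL cannot break out of the command.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Wraps s in single quotes for /bin/sh, escaping embedded single quotes.
std::string shell_quote(const std::string& s)
{
    std::string quoted = "'";
    for (char c : s) {
        if (c == '\'') quoted += "'\\''";
        else           quoted += c;
    }
    quoted += "'";
    return quoted;
}

// Runs a shell command and returns everything it wrote to stdout.
// Throws if the command cannot be started or exits with a non-zero status.
std::string run_and_capture(const std::string& command)
{
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) throw std::runtime_error("popen failed");

    std::string output;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe) != nullptr)
        output += buffer;

    if (pclose(pipe) != 0)
        throw std::runtime_error("command failed: " + command);
    return output;
}

// Calls get_str.get_play_url(url) in a separate python3 process.
std::string get_play_mrl_subprocess(const std::string& url)
{
    const std::string command =
        "python3 -c 'import sys, get_str; "
        "print(get_str.get_play_url(sys.argv[1]))' " + shell_quote(url);

    std::string output = run_and_capture(command);

    // print() appends a newline; strip trailing line endings.
    while (!output.empty() && (output.back() == '\n' || output.back() == '\r'))
        output.pop_back();
    return output;
}
```

Each call pays the cost of starting a new interpreter, but the C++ program then needs neither the Python headers nor any reference counting.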

0 Answers