
I started off today wondering whether it is possible to save a Python object for use in a C program, a proposition which, after many hours of reading, looks naive. Here is a possible workaround:

1. Create a complex object that depends on many Python libraries and holds data I need to preserve.
2. Pickle the complex object and place it where it will be accessible.
3. Define compileme.py:

import pickle
# thing is an object with a method query(), which takes a numpy array as input
with open('thing.pkl', 'rb') as f:  # binary mode is required for pickle in Python 3
    thing = pickle.load(f)

4. Run cython --embed -o compileme.c compileme.py to generate a .c version of the script.
5. Define main.c:

#include <stdio.h>
#include//(A) something from compileme

int main(void) {
    input = //(B) query takes a numpy array in python. Define something palatable.
    double result = thing.query(input);
    printf("%f\n", result); /* %f, since result is a double */
    return 0;
}

6. Compile main.c properly, with all the right linkages.

It is not clear to me that this basic strategy is sound, and I have a number of concerns:

  1. The thing is an instance of a class from a library not even mentioned here, so its query() method depends on that external Python code. How can I ensure the relevant parts are also compiled and linked?
  2. How should I include compileme in my main.c so the thing will be accessible there? (location (A) in the code)
  3. How can I appropriately define an input to thing's method here? Do I need to use one of the many types defined in compileme.c? (location (B) in the code)
  4. How do I compile main.c with the proper linkages?
  5. In doing all this, it appears I have to include references to the python header files from the python-dev package. Just to be clear, I am not actually including the interpreter by doing this, correct?

Here are some resources I found during my search that show it is possible to compile a simple Python script to an executable C program: "Compile main Python program using Cython" and http://masnun.rocks/2016/10/01/creating-an-executable-file-using-cython/

Here is some relevant cython documentation: http://cython.readthedocs.io/en/latest/src/reference/compilation.html

Pavel Komarov
  • Possible duplicate of [Call python code from c via cython](https://stackoverflow.com/questions/22589868/call-python-code-from-c-via-cython) – Eli Korvigo Jan 22 '18 at 01:59
  • Generally, saving objects externally in this manner is done to persist the data; you generally can't serialize the logic out of the object. You'd import the data into the other process and deserialize it back into a class/object whose methods are defined in that program. – Joe Jan 22 '18 at 02:10
  • @EliKorvigo, Doesn't that Py_Initialize() stuff actually invoke a python runtime environment? I would like to avoid that because my embedded application will not have python. – Pavel Komarov Jan 22 '18 at 02:42
  • @Joe Yes, pickle is for saving from python and reading back to python, usually. The idea here is I would have something compiled which encapsulates the logic while the pickle encapsulates the object settings, and part of the logic is to read the settings at startup. – Pavel Komarov Jan 22 '18 at 02:47
  • You aren't going to be able to do this without linking to the Python interpreter. – DavidW Jan 22 '18 at 07:12
  • @DavidW So even cython's resulting C files depend on the interpreter? What about those examples I linked where they translate a hello world program to pure C? Could you fill out my example so I can at least have working minimal code? – Pavel Komarov Jan 22 '18 at 12:39
  • "Cython works by producing a standard Python module. However, the behavior differs from standard Python in that the module code, originally written in Python, is translated into C. While the resulting code is fast, it makes many calls into the CPython interpreter and CPython standard libraries to perform actual work. Choosing this arrangement saved considerably on Cython's development time, but modules have a dependency on the Python interpreter and standard library." – Pavel Komarov Jan 29 '18 at 18:02

1 Answer

I'm afraid this answer just explains why I don't think what you want is realistic, rather than offering solutions. It's worth looking at the code that Cython generates for a slightly modified compileme.pyx:

cdef public get_unpickled():
    import pickle
    return pickle.load(open('thing.pkl', 'rb'))

This creates a function that you can happily call from C (the signature is generated in compileme.h and is __PYX_EXTERN_C PyObject *get_unpickled(void);). The generated '.c' file containing the implementation is quite long, but the relevant section looks like:

__pyx_t_1 = __Pyx_Import(__pyx_n_s_pickle, 0, -1);
__pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_v_pickle, __pyx_n_s_load);
__pyx_t_3 = __Pyx_PyObject_Call(__pyx_builtin_open, __pyx_tuple_, NULL);
__pyx_t_1 = __Pyx_PyObject_CallOneArg(__pyx_t_2, __pyx_t_3);

I've cut this down quite a bit for clarity (mostly removing reference counting and some checks), but you can see that it uses the Python import mechanism to load the pickle module from the Python standard library, then does a getattr to get the function load. It then calls the Python builtin open, and finally it calls pickle.load. All of these operations need libpython.
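As a rough illustration, the four generated calls correspond to the Python-level steps below. This is only a sketch of the equivalence, not what Cython emits: the temporary file and its contents are made up so the snippet runs on its own, and the mapping in the comments is approximate.

```python
import importlib
import os
import pickle
import tempfile

# Write a small pickle so the example is self-contained (hypothetical data).
pkl_path = os.path.join(tempfile.mkdtemp(), "thing.pkl")
with open(pkl_path, "wb") as f:
    pickle.dump({"weights": [1.0, 2.0, 3.0]}, f)

# 1. __Pyx_Import: run Python's import machinery for the module "pickle".
pickle_mod = importlib.import_module("pickle")

# 2. __Pyx_PyObject_GetAttrStr: look up the attribute "load" on the module object.
load_func = getattr(pickle_mod, "load")

# 3. __Pyx_PyObject_Call on the builtin open: create a Python file object.
file_obj = open(pkl_path, "rb")

# 4. __Pyx_PyObject_CallOneArg: call pickle.load with that single argument.
thing = load_func(file_obj)
file_obj.close()
```

Every step here manipulates Python objects (a module, a function, a file object), which is why none of them can run without the Python runtime.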

Then we consider what pickle does: it basically finds the .py file your class came from, imports it, creates a new instance of your class, and then populates the instance dictionary with data from the file (possibly calling some special methods if present). Again, this is entirely dependent on Python.
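A small sketch can make this concrete. It pickles an argparse.Namespace, used here purely as a convenient stand-in for your class because it is a plain standard-library class with an instance dictionary; the attribute name scale is made up. The payload carries the module name, the class name and the attribute data, but none of the class's code.

```python
import argparse
import pickle

# Stand-in object: a plain class instance whose state lives in its __dict__.
original = argparse.Namespace(scale=2.5)
payload = pickle.dumps(original)

# The payload names where the class lives and stores the instance data...
has_module_name = b"argparse" in payload
has_class_name = b"Namespace" in payload
has_attr_name = b"scale" in payload

# ...but the class's methods are not in it; _get_kwargs is one of Namespace's methods.
has_method_name = b"_get_kwargs" in payload

# Loading imports argparse, builds a Namespace and fills in its __dict__.
restored = pickle.loads(payload)
```

So the logic of query() would never be in thing.pkl; unpickling has to import the defining library to get it back.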

Finally let's consider what you can do with the result of get_unpickled. You have a PyObject*, a fairly opaque C structure. Most of its information is probably stored in its internal Python dictionary, which you can access through the Python C API PyObject_GetAttrString and related functions. However this data is still stored as other PyObjects which you will need to access using the Python C API. (If it's a Cython class the data may be stored in more accessible C struct fields which require less use of libpython, but probably not none).


In summary, Cython is largely implemented using the Python C API, which requires access to the libpython library for anything but the most trivial programs. Using Python standard library functions such as pickle also requires that the Python standard library be installed. Therefore you can't really achieve this without bundling Python with your C program. The examples that you linked fall into this category: they are C programs, but they depend on Python being present.


A better solution might be to look at common serialization formats that both Python and C support, such as JSON, XML, or HDF5 to allow you to save the data in one language and retrieve it in the other with as little effort as possible.
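For instance, the Python side of a JSON-based approach might look like the sketch below. The settings dictionary and the file name thing.json are hypothetical; in practice you would pull whatever plain numbers, strings and lists your object needs out of it by hand.

```python
import json
import os
import tempfile

# Hypothetical settings extracted from the Python object; real field names will differ.
settings = {"name": "thing", "scale": 2.5, "offsets": [0.0, 1.5, -3.0]}

# Save the data in a language-neutral format.
json_path = os.path.join(tempfile.mkdtemp(), "thing.json")
with open(json_path, "w") as f:
    json.dump(settings, f, indent=2, sort_keys=True)

# Read it back to confirm the round trip preserves the data.
with open(json_path) as f:
    loaded = json.load(f)
```

On the C side you would then read the file with a JSON parser (cJSON and Jansson are two that exist) and reimplement the logic of query() against those values; that half is not shown here.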

DavidW
  • Even a definitive explanation of why not is useful so I can know to quit trying. – Pavel Komarov Jan 22 '18 at 21:00
  • How bulky is it to bundle python? `libpython` is not a full-fledged interpreter, so these compiled versions should still be lighter and faster, right? Just not as light and fast as actually reimplementing things in C. – Pavel Komarov Jan 22 '18 at 21:07
  • On my PC it's about 3MB (the Python executable is actually only about 10kb - almost all the detail is actually in `libpython`). You also need some (but probably not all) of the Python standard library. That's around 30MB in total (but you might be able to select the parts you need). – DavidW Jan 22 '18 at 21:48
  • I think you could use something like PyInstaller (or similar) to select only the libraries you use, but I've never used it myself so can't be of real help there – DavidW Jan 22 '18 at 21:49