I am using a Python library that wraps LibTCC called PyTCC.

I am experimenting with ways to JIT compile code in Python. The problem is that functions returning ordinary C data types work correctly, but calling any function that returns a PyObject * raises an "Access Violation" error.

I have made sure that code can execute from PyTCC as my code example shows. This also means that the code example is compiling successfully.

import ctypes, pytcc

program = b"""
#include "Python.h"

/* Cannot return 3 due to access violation */
PyObject * pop(PyObject * self, PyObject * args, PyObject * kwargs) {
    // Cannot return *any* Python object
    return PyLong_FromLong(3);
}

int foobar() { return 3; }  // Returns 3 just fine

// Needed to appease TCC:
int main() { }
"""

jit_code = pytcc.TCCState()
jit_code.add_include_path('C:/Python37/include')
jit_code.add_library_path('C:/Python37')
jit_code.add_library('python37')
jit_code.compile_string(program)
jit_code.relocate()

foobar_proto = ctypes.CFUNCTYPE(ctypes.c_int)
foobar = foobar_proto(jit_code.get_symbol('foobar'))

print(f'It works: {foobar()}')

pop_proto = ctypes.CFUNCTYPE(ctypes.c_voidp)
pop = pop_proto(jit_code.get_symbol('pop'))

print('But this does not for some reason:')
print(pop())
print('Never gets here due to access violation :(')

The output of the program should be:

It works: 3
But this does not for some reason:
3
Never gets here due to access violation :(

But instead, I am getting this exact error:

It works: 3
But this does not for some reason:
Traceback (most recent call last):
  File "fails.py", line 40, in <module>
    print(pop())
OSError: exception: access violation writing 0x00000000FFC000E9

2 Answers


Most likely it's because you don't hold the GIL when creating the object. You also have an issue with the return type: ctypes.c_voidp tells Python to treat the result as an int instead of a PyObject, so even without the access violation, all you would see is the pointer value itself, not the object it points to.

Try:

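PyObject * pop() {
    PyGILState_STATE gstate;
    // Acquire the GIL before touching any Python object
    gstate = PyGILState_Ensure();
    PyObject* obj = PyLong_FromLong(10);
    // Restore the previous GIL state before returning
    PyGILState_Release(gstate);
    return obj;
}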

and switch
pop_proto = ctypes.CFUNCTYPE(ctypes.c_voidp)
to
pop_proto = ctypes.CFUNCTYPE(ctypes.py_object)

Output from my run (I changed the value from 3 to 10 in the PyObject just to show that it made it through):

It works: 3
But this does not for some reason:
10
Never gets here due to access violation :(
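
The return-type point can be checked independently of TCC. The following is only a sketch, not part of the run above: it binds CPython's own `PyLong_FromLong` through `ctypes.pythonapi` twice, once with `c_void_p` and once with `py_object` as the return type. The variable names are made up for the illustration.

```python
import ctypes

# Bind the same C API function with two different return types.
# PYFUNCTYPE keeps the GIL held during the call, which the C API requires.
addr_proto = ctypes.PYFUNCTYPE(ctypes.c_void_p, ctypes.c_long)
obj_proto = ctypes.PYFUNCTYPE(ctypes.py_object, ctypes.c_long)

long_as_address = addr_proto(("PyLong_FromLong", ctypes.pythonapi))
long_as_object = obj_proto(("PyLong_FromLong", ctypes.pythonapi))

as_address = long_as_address(10)  # plain int holding the raw PyObject * value
as_object = long_as_object(10)    # the Python object itself

print(type(as_address).__name__, hex(as_address))
print(type(as_object).__name__, as_object)

# The raw address can still be turned back into the object with a cast.
recovered = ctypes.cast(as_address, ctypes.py_object).value
print(recovered)
```

With `c_void_p` the caller only gets an address and has to cast it back by hand, and the reference returned by the C function is never released; `py_object` lets ctypes do the conversion and the reference bookkeeping.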
estabroo
  • Another possibility would be to use `pop_proto = ctypes.PYFUNCTYPE(ctypes.py_object)`, since that should tell ctypes not to release the GIL to begin with, instead of its normal behavior. *Note*: I didn't test that. – estabroo Apr 09 '19 at 20:56
  • Thank you for reply! I have made those changes and am getting the same error. I have also tried the proto with: `pop_proto = ctypes.PYFUNCTYPE(ctypes.py_object)` and also `pop_proto = ctypes.CFUNCTYPE(ctypes.py_object)` to no avail. In addition, I also copy-and-pasted your GIL code example and it did not work. Since you have successfully compiled and run, what platforms (OS/Python) did you test this on? – Samuel Wilder Apr 10 '19 at 12:48
  • Debian Linux (sid), Python 3.6, the pytcc that you linked, using tcc version 0.9.27-8 (I had to compile it manually to get the shared library, since Debian ships the static version in its package). It had a few patches from baseline -- `turn off stack-protector in runtime`, `dead code removal on case statements x86 specific`, and `adding pthread library name for linux threaded builds`. The only changes I made to your Python script were the ones posted, plus making the load paths work for Linux: `jit_code.add_include_path('/usr/include/python3.6m') ... ` – estabroo Apr 10 '19 at 17:33
  • I'm banging my head against the wall on this one. After only making the changes you suggested, I am still getting the error. Do you happen to have access to a Windows computer that you could test? – Samuel Wilder Apr 11 '19 at 17:09
  • I don't unfortunately. I wonder if it is the stack protector patch on mine that is allowing it to work. Can you recompile your tcc and add -fno-stack-protector to the CFLAGS (or whatever the equivalent for the windows compiler) ? – estabroo Apr 11 '19 at 20:42
  • Just tried that. I recompiled from source and confirmed that the `GS-` [flag](https://learn.microsoft.com/en-us/cpp/build/reference/gs-buffer-security-check?view=vs-2019) was passed to disable the buffer security check. Unfortunately, I got the same error. I'm using PyTCC because I don't know of another library that can compile C code on the fly from Python. Is there another option I could try that you know of? – Samuel Wilder Apr 12 '19 at 15:53
  • Would something like [numba](http://numba.pydata.org) do what you want? It jits python code instead of inlining C code – estabroo Apr 12 '19 at 16:43
  • That's a great idea but I specifically need to compile C code dynamically and quickly (which would make LibTCC a perfect candidate). – Samuel Wilder Apr 15 '19 at 12:56
  • I wonder if it's an OS config thing - try turning off the data execution protection and run it that way. https://www.online-tech-tips.com/windows-xp/disable-turn-off-dep-windows/ (has instructions for newer versions of windows as well) – estabroo Apr 15 '19 at 19:57
  • Ok. I tried looking into DEP and it is already disabled for all but Windows-specific executables. I have some additional info on the problem though: this function: ``` char duh() { malloc(sizeof(int)); return 'c'; } ``` Will execute just fine without the call to `malloc` and will return the 'c'. However, when the `malloc` is added, it causes the access violation. So what I have determined is that the C code cannot allocate memory. This seems like it has something to do with DEP and non-executable memory pages but I'm not experienced enough to go and actually view it. – Samuel Wilder Apr 30 '19 at 13:46
  • From the Application Compatibility section of [ms dep info](https://learn.microsoft.com/en-us/windows/desktop/memory/data-execution-prevention) it looks like IMAGE_SCN_MEM_EXECUTE needs to be set for the code to be able to execute. tcc defines it but doesn't seem to use it, so it's probably not something pytcc can set without being modified. You might have to disable DEP altogether if you don't want to modify tcc and pytcc, since it seems to be affecting things it's not explicitly turned on for, though that puts your machine at extra risk. It would be an interesting test (imho). – estabroo Apr 30 '19 at 14:54
  • Thank you for the info :). However, I have found another solution to the issue. I found another library of the same name: [PyTCC](https://github.com/thgcode/pytcc) that initially gave me the *exact* same error. I opened an issue on the repo and [thgcode](https://github.com/thgcode) got back to me the same day with a fix. I have gotten the same example using the new API and your changes to work successfully! Thank you very much for your time and I highly recommend taking a look at [PyTCC](https://github.com/thgcode/pytcc). – Samuel Wilder May 01 '19 at 12:44

I haven't worked with PyTCC, but there is something wrong in the code.

According to [Python 3]: class ctypes.PyDLL(name, mode=DEFAULT_MODE, handle=None) (emphasis is mine):

Instances of this class behave like CDLL instances, except that the Python GIL is not released during the function call, and after the function execution the Python error flag is checked. If the error flag is set, a Python exception is raised.

Thus, this is only useful to call Python C api functions directly.

Note: CFUNCTYPE is to CDLL what PYFUNCTYPE is to PyDLL.

As a consequence, in pop_proto, you should replace ctypes.CFUNCTYPE with ctypes.PYFUNCTYPE (note also that the documented spelling is c_void_p; c_voidp only exists as a legacy alias).
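
The difference between the two prototype factories can be observed directly. The sketch below is an illustration that was not part of the original answer: it calls the C API function `PyGILState_Check` through both factories, using `ctypes.pythonapi` as the library. The variable names are invented for the example.

```python
import ctypes

# int PyGILState_Check(void): reports whether the calling thread
# currently holds the GIL (1) or not (0).
via_cfunctype = ctypes.CFUNCTYPE(ctypes.c_int)(
    ("PyGILState_Check", ctypes.pythonapi)
)
via_pyfunctype = ctypes.PYFUNCTYPE(ctypes.c_int)(
    ("PyGILState_Check", ctypes.pythonapi)
)

gil_held_cfunctype = via_cfunctype()    # ctypes releases the GIL around this call
gil_held_pyfunctype = via_pyfunctype()  # ctypes keeps the GIL held for this call

print("CFUNCTYPE :", gil_held_cfunctype)
print("PYFUNCTYPE:", gil_held_pyfunctype)
```

A function reached through CFUNCTYPE therefore runs without the GIL, so any C API call inside it (such as `PyLong_FromLong`) is unsafe unless the function reacquires the GIL itself.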

Next, same page states that for PyObject* (C), py_object should be used (Python). So:

pop_proto = ctypes.PYFUNCTYPE(ctypes.py_object)

If you want to be rigorous, you'll have to include the arguments in the prototype, which would make the code look a bit more complicated, but for this particular case (they are ignored), it's not mandatory:

pop_proto = ctypes.PYFUNCTYPE(ctypes.py_object, ctypes.py_object, ctypes.py_object, ctypes.py_object)
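
To see a prototype of this shape in action without compiling anything, here is a minimal sketch that applies it to an existing C API function with the same three-`PyObject *` signature, `PyObject_Call(callable, args, kwargs)`. It stands in for `pop` purely for illustration; the names `call_proto` and `object_call` are made up.

```python
import ctypes

# PyObject *PyObject_Call(PyObject *callable, PyObject *args, PyObject *kwargs)
call_proto = ctypes.PYFUNCTYPE(
    ctypes.py_object,  # return type
    ctypes.py_object,  # callable
    ctypes.py_object,  # args (must be a tuple)
    ctypes.py_object,  # kwargs (a dict)
)
object_call = call_proto(("PyObject_Call", ctypes.pythonapi))

# Equivalent to max(1, 5, 3), but routed through the C API via ctypes.
result = object_call(max, (1, 5, 3), {})
print(result)
```

Declaring the argument types lets ctypes pass the Python objects as `PyObject *` values directly, instead of trying to guess a C conversion for each argument.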

Here's an example for PyObject *PyBytes_Repr(PyObject *obj, int smartquotes) (calling the C function in the "old fashioned" way):

[cfati@CFATI-5510-0:C:\WINDOWS\system32]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe"
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import sys
>>> import os
>>> import ctypes
>>>
>>> python_dll_name = os.path.join(os.path.dirname(sys.executable), "python" + str(sys.version_info.major) + str(sys.version_info.minor) + ".dll")
>>> python_dll_name
'e:\\Work\\Dev\\VEnvs\\py_064_03.07.03_test0\\Scripts\\python37.dll'
>>>
>>> python_dll = ctypes.PyDLL(python_dll_name)
>>>
>>> pybytes_repr_proto = ctypes.PYFUNCTYPE(ctypes.py_object, ctypes.py_object, ctypes.c_int)
>>> pybytes_repr = pybytes_repr_proto(("PyBytes_Repr", python_dll))
>>>
>>> b = b"abcd"
>>>
>>> reprb = pybytes_repr(b, 0)
>>> reprb
"b'abcd'"

You might also check [SO]: How to cast a ctypes pointer to an instance of a Python class (@CristiFati's answer).

CristiFati
  • Thank you for the reply! I have implemented your recommendations and unfortunately it is still throwing an access violation error. I'm just confused because I can return data from a normal C function but not a CPython function, even when manipulating the GIL. – Samuel Wilder Apr 10 '19 at 15:06
  • With the current changes (actually only *py\_object* is required), it works when building the *C* code in a separate *.dll* (the original code **doesn't crash**). But for rigor's sake it's best to also keep *PYFUNCTYPE*. It seems that *libtcc* brings some extra stuff into play. – CristiFati Apr 11 '19 at 10:02