1

I am trying to interoperate between Python and C++.

This is my C++ code for a test DLL method:

extern "C" __declspec(dllexport) PEParserNamespace::PEParserBase& _cdecl test(PEParserNamespace::PEParserBase* base) {
    printf("the C++ function was called\n");
    base->bytes = 12345;
    return *base;
}

I try to use it from Python like so:

import ctypes
#DataStructures.py
class PEParserBase(ctypes.Structure):
    _fields_ = [("hFile", ctypes.c_void_p),
        ("dwFileSize", ctypes.c_ulong),
        ("bytes", ctypes.c_ulong),
        ("fileBuffer",ctypes.c_void_p)]
class PEHEADER(ctypes.Structure):
    xc = 0
#FunctionWrapper.py
def testWrapper(pEParserBase, _instanceDLL):
    _instanceDLL.test.argtypes = [ctypes.POINTER(PEParserBase)]
    _instanceDLL.test.restype = PEParserBase
    return _instanceDLL.test(ctypes.byref(pEParserBase))

pEParserBase = PEParserBase()
print("hallo welt")
_test = ctypes.CDLL('PeParserPythonWrapper.dll')

print(id(testWrapper(pEParserBase, _test)))
print(id(pEParserBase))

I expected testWrapper to return the original PEParserBase instance, but it doesn't: the reported id values are different. The C++ code doesn't create any new instances of PEParserBase or anything else, so I'm confident the problem is in the Python code.

Why does this happen, and how do I fix it?

Karl Knechtel
Paul Gigel
  • 1
    You can replace `id` by `ctypes.addressof` for comparing. If you really need identity here, you must keep a dict mapping the address to the Python object and write functions to manage this dict. – Michael Butscher Mar 30 '23 at 07:16
  • 1
    Welcome to Stack Overflow. It is fine if your English is not native, but please still try to write about **the problem, not** yourself - and try to ask a clear question, directly, and without conversation. I edited the post to fix the writing, and to meet site standards. For more information, please read [ask] and [Should 'Hi', 'thanks', taglines, and salutations be removed from posts?](https://meta.stackexchange.com/questions/2950). – Karl Knechtel Mar 30 '23 at 07:22
  • 1
    @MichaelButscher seems worth writing up as an answer (including an explanation for why it doesn't work as-is; I assume there is some kind of implicit copy involved in crossing a DLL boundary, or something like that?), if there is no applicable duplicate (definitely not my field of expertise, so I can't easily search for one). – Karl Knechtel Mar 30 '23 at 07:29
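
The address-to-object mapping suggested in the first comment could look roughly like the sketch below. The names `_registry`, `register`, and `lookup` are hypothetical, and `ctypes.pointer(base)` only stands in for the pointer a DLL function would return; no real DLL is loaded here.

```python
import ctypes


class PEParserBase(ctypes.Structure):
    _fields_ = [("hFile", ctypes.c_void_p),
                ("dwFileSize", ctypes.c_ulong),
                ("bytes", ctypes.c_ulong),
                ("fileBuffer", ctypes.c_void_p)]


# Hypothetical registry: C address -> the Python object that owns that memory.
_registry = {}


def register(obj):
    """Remember obj under the address of its underlying C structure."""
    _registry[ctypes.addressof(obj)] = obj
    return obj


def lookup(ptr):
    """Return the registered owner for a POINTER(PEParserBase), or None."""
    if not ptr:  # a NULL pointer is falsy
        return None
    return _registry.get(ctypes.addressof(ptr.contents))


base = register(PEParserBase())
returned = ctypes.pointer(base)  # assumption: stands in for the DLL's return value
print(lookup(returned) is base)  # True
```

The registry holds strong references, so entries would need to be deleted once the corresponding structures are no longer in use.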

2 Answers

1

Listing [Python.Docs]: ctypes - A foreign function library for Python.

You have Undefined Behavior (several instances of it, actually):

  • CTypes (as the name suggests) works with C. A reference is a C++-specific concept that C knows nothing about.

  • Because a reference is actually a memory address (just like a pointer, but with some key differences), your Python function prototype (restype) is incorrect: it declares a structure returned by value. Check [SO]: C function called from Python via ctypes returns incorrect value (@CristiFati's answer) for more details.
    Note: specifying a pointer instead does the trick, but it is still technically incorrect (if one wants to be rigorous).

CTypes objects are Python wrappers ([Python.Docs]: Common Object Structures - type PyObject) over the actual C ones.
[Python.Docs]: Built-in Functions - id(object) returns the address of the Python wrapper object, which differs from the address of the wrapped C object.
To compare the underlying C objects, use ctypes.addressof instead.
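
The difference between id and ctypes.addressof can be seen without any DLL. The following is a minimal sketch that reuses the structure layout from the question; dereferencing a pointer via `.contents` plays the role of getting the structure back from native code.

```python
import ctypes


class PEParserBase(ctypes.Structure):
    _fields_ = [("hFile", ctypes.c_void_p),
                ("dwFileSize", ctypes.c_ulong),
                ("bytes", ctypes.c_ulong),
                ("fileBuffer", ctypes.c_void_p)]


base = PEParserBase()
ptr = ctypes.pointer(base)

# Every access to .contents builds a new Python wrapper object.
first = ptr.contents
second = ptr.contents
print(first is base, first is second)  # False False

# All wrappers refer to the same C memory.
print(ctypes.addressof(first) == ctypes.addressof(base))  # True

# A write through one wrapper is visible through the others.
second.bytes = 12345
print(base.bytes)  # 12345
```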

I prepared a small example. I defined the C structure based on the Python one and its member names, and concluded that you're on Windows (although all the ideas in this answer are OS agnostic).

  • dll00.cpp:

    #include <stdio.h>
    
    #if defined(_WIN32)
    #  include <Windows.h>
    #  define DLL00_EXPORT_API __declspec(dllexport)
    #else
    #  define DLL00_EXPORT_API
    #endif
    
    
    struct PEParserBase {
        HANDLE hFile;
        DWORD dwFileSize;
        ULONG bytes;
        LPVOID fileBuffer;
    };
    
    typedef PEParserBase *PPEParserBase;
    
    
    #if defined(__cplusplus)
    extern "C" {
    #endif
    
    DLL00_EXPORT_API PPEParserBase func00ptr(PPEParserBase parg);
    
    #if defined(__cplusplus)
    }
    #endif
    
    
    PPEParserBase func00ptr(PPEParserBase parg)
    {
        printf("  C:\n    Address: 0x%0zX\n", reinterpret_cast<size_t>(parg));
        if (parg) {
            printf("    dwFileSize: %lu\n    bytes: %lu\n", parg->dwFileSize, parg->bytes);
            parg->dwFileSize = 123;
        }
        return parg;
    }
    
  • code00.py:

    #!/usr/bin/env python
    
    import ctypes as cts
    import ctypes.wintypes as wts
    import sys
    
    
    DLL_NAME = "./dll00.{:s}".format("dll" if sys.platform[:3].lower() == "win" else "so")
    
    
    class PEParserBase(cts.Structure):
        _fields_ = (
            ("hFile", wts.HANDLE),
            ("dwFileSize", wts.DWORD),
            ("bytes", wts.ULONG),
            ("fileBuffer", wts.LPVOID),
        )
    
        def __str__(self):
            ret = [self.__repr__()]
            for field, _ in self._fields_:
                ret.append("  {:s}: {:}".format(field, getattr(self, field)))
            return "\n".join(ret)
    
    PEParserBasePtr = cts.POINTER(PEParserBase)
    
    
    def print_ctypes_obj(obj):
        _id = id(obj)
        try:
            _addrof = cts.addressof(obj)
        except TypeError:
            _addrof = 0
        print("Id: 0x{:016X}\nAddressOf: 0x{:016X}\n{:s}\n".format(_id, _addrof, str(obj)))
    
    
    def main(*argv):
        dll = cts.CDLL(DLL_NAME)
        func00ptr = dll.func00ptr
        func00ptr.argtypes = (PEParserBasePtr,)
        func00ptr.restype = PEParserBasePtr
    
        use_ptr = 1
        if use_ptr:
            print("Test pointer export\n")
            pb0 = PEParserBase(None, 3141593, 2718282, None)
            print_ctypes_obj(pb0)
            ppb0 = cts.byref(pb0)
            print_ctypes_obj(ppb0)
            ppb1 = func00ptr(ppb0)
            print()
            print_ctypes_obj(ppb1)
            pb1 = ppb1.contents
            print_ctypes_obj(pb1)
    
    
    if __name__ == "__main__":
        print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                       64 if sys.maxsize > 0x100000000 else 32, sys.platform))
        rc = main(*sys.argv[1:])
        print("\nDone.\n")
        sys.exit(rc)
    

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackExchange\StackOverflow\q075885080]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]>
[prompt]> "c:\Install\pc032\Microsoft\VisualStudioCommunity\2019\VC\Auxiliary\Build\vcvarsall.bat" x64 > nul

[prompt]> cl /nologo /MD /DDLL dll00.cpp  /link /NOLOGO /DLL /OUT:dll00.dll
dll00.cpp
   Creating library dll00.lib and object dll00.exp

[prompt]>
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test0\Scripts\python.exe" ./code00.py
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] 064bit on win32

Test pointer export

Id: 0x0000013F2CB60740
AddressOf: 0x0000013F2C30FE30
<__main__.PEParserBase object at 0x0000013F2CB60740>
  hFile: None
  dwFileSize: 3141593
  bytes: 2718282
  fileBuffer: None

Id: 0x0000013F2CBA72B0
AddressOf: 0x0000000000000000
<cparam 'P' (0x0000013F2C30FE30)>

  C:
    Address: 0x13F2C30FE30
    dwFileSize: 3141593
    bytes: 2718282

Id: 0x0000013F2CB607C0
AddressOf: 0x0000013F2CB60808
<__main__.LP_PEParserBase object at 0x0000013F2CB607C0>

Id: 0x0000013F2CB60AC0
AddressOf: 0x0000013F2C30FE30
<__main__.PEParserBase object at 0x0000013F2CB60AC0>
  hFile: None
  dwFileSize: 123
  bytes: 2718282
  fileBuffer: None


Done.
CristiFati
0

id() in CPython returns the address of the Python ctypes wrapper object, not the address of the object wrapped. For that, use ctypes.addressof().

ctypes also only understands C plain old data (POD) types and structures. It doesn't know C++ namespaces or references, but since references are really C++ syntactic sugar for pointers, you can use a pointer in .argtypes and .restype as a replacement.

Here's a minimal example:

test.cpp

#include <stdio.h>

namespace PEParserNamespace {
struct PEParserBase {
    void* hFile;
    unsigned long dwFileSize;
    unsigned long bytes;
    void* fileBuffer;
};
}

extern "C" __declspec(dllexport)
PEParserNamespace::PEParserBase& _cdecl test(PEParserNamespace::PEParserBase* base) {
    printf("the C++ function was called, base=%p\n", static_cast<void*>(base));
    base->bytes = 12345;
    return *base;
}

test.py

import ctypes as ct

class PEParserBase(ct.Structure):
    _fields_ = (("hFile", ct.c_void_p),
                ("dwFileSize", ct.c_ulong),
                ("bytes", ct.c_ulong),
                ("fileBuffer",ct.c_void_p))

dll = ct.CDLL('./test')
dll.test.argtypes = ct.POINTER(PEParserBase),
dll.test.restype = ct.POINTER(PEParserBase)    # Use a pointer here for the reference

base = PEParserBase()
pbase = dll.test(ct.byref(base))

print(hex(ct.addressof(base)), base.bytes)
print(hex(ct.addressof(pbase.contents)), pbase.contents.bytes)  # .contents dereferences the pointer
                                                  # so we get the address of the structure
                                                  # not the address of the pointer itself.

Output:

the C++ function was called, base=00000169712ADAF0
0x169712adaf0 12345
0x169712adaf0 12345
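
The pointer-for-reference substitution can also be illustrated without a DLL. This is only a sketch, assuming the reference reaches Python as a plain address, as described above: `addr` stands in for the value the C++ function would return, and the by-value copy is included purely for contrast.

```python
import ctypes as ct


class PEParserBase(ct.Structure):
    _fields_ = (("hFile", ct.c_void_p),
                ("dwFileSize", ct.c_ulong),
                ("bytes", ct.c_ulong),
                ("fileBuffer", ct.c_void_p))


base = PEParserBase()
addr = ct.addressof(base)  # assumption: stands in for the address a reference carries

# Interpreting the address as a pointer gives access to the original structure.
ptr = ct.cast(ct.c_void_p(addr), ct.POINTER(PEParserBase))
ptr.contents.bytes = 12345
print(base.bytes)  # 12345

# A by-value copy lives in separate memory, so changes to it do not reach base.
copy = PEParserBase.from_buffer_copy(base)
copy.bytes = 1
print(base.bytes)                # 12345
print(ct.addressof(copy) == addr)  # False
```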
Mark Tolonen