I need the data pointer (i.e. the memory address of the first data byte) of a Python byte string or buffer object. I solved it with ctypes:

import ctypes

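def get_data_ofs(buf):
  """Return the address of the first data byte of buf as an integer."""
  data = ctypes.c_char_p()
  size = ctypes.c_ssize_t()  # Filled in by the call (Py_ssize_t), unused here.
  ctypes.pythonapi.PyObject_AsCharBuffer(ctypes.py_object(buf), ctypes.pointer(data),
                                         ctypes.pointer(size))
  return ctypes.cast(data, ctypes.c_void_p).value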

x = 'hello'

if type(zip()) is not list:  # Python 3.
  def buffer(s, start=0):
    return memoryview(s)[start:]
  x = bytes(x, 'utf-8')

# The next 3 lines must print the same integer.
print(ctypes.memmove(x, x, 0) + 1)
print(get_data_ofs(x) + 1)
print(get_data_ofs(buffer(x, 1)))

I've verified that my solution (get_data_ofs) works in Python 2 and 3.

Is there a solution that doesn't use ctypes, or one that is simpler?
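
For comparison, here is a sketch of the same lookup through `PyObject_GetBuffer`, the Python 3 buffer protocol entry point, instead of `PyObject_AsCharBuffer` (which the CPython docs list as deprecated old buffer protocol API). It still relies on ctypes, so it is a variant rather than a ctypes-free answer. The names `PyBuffer` and `get_data_ptr` are hypothetical, and the structure layout is an assumption that mirrors the CPython 3 `Py_buffer` C struct; it will not match Python 2.

```python
import ctypes


class PyBuffer(ctypes.Structure):
  # Assumed to mirror the CPython 3 Py_buffer struct field by field.
  _fields_ = [
      ('buf', ctypes.c_void_p),
      ('obj', ctypes.c_void_p),  # PyObject*, kept opaque on purpose.
      ('len', ctypes.c_ssize_t),
      ('itemsize', ctypes.c_ssize_t),
      ('readonly', ctypes.c_int),
      ('ndim', ctypes.c_int),
      ('format', ctypes.c_char_p),
      ('shape', ctypes.POINTER(ctypes.c_ssize_t)),
      ('strides', ctypes.POINTER(ctypes.c_ssize_t)),
      ('suboffsets', ctypes.POINTER(ctypes.c_ssize_t)),
      ('internal', ctypes.c_void_p),
  ]


PyBUF_SIMPLE = 0  # Request a plain contiguous byte buffer.

_get_buffer = ctypes.pythonapi.PyObject_GetBuffer
_get_buffer.argtypes = [ctypes.py_object, ctypes.POINTER(PyBuffer), ctypes.c_int]
_get_buffer.restype = ctypes.c_int

_release_buffer = ctypes.pythonapi.PyBuffer_Release
_release_buffer.argtypes = [ctypes.POINTER(PyBuffer)]
_release_buffer.restype = None


def get_data_ptr(buf):
  """Return the address of the first data byte of buf as an integer."""
  view = PyBuffer()
  # ctypes.pythonapi re-raises the Python exception if the call fails.
  _get_buffer(buf, ctypes.byref(view), PyBUF_SIMPLE)
  try:
    return view.buf
  finally:
    _release_buffer(ctypes.byref(view))


x = b'hello'
# Both lines should print the same integer.
print(get_data_ptr(x) + 1)
print(get_data_ptr(memoryview(x)[1:]))
```

As with `get_data_ofs`, the returned address is only meaningful while the object backing the buffer stays alive.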

pts
  • If you use `id()`, you will not have the pointer but the memory location of the object. https://stackoverflow.com/questions/15667189/what-is-the-id-function-used-for – Jona Dec 16 '19 at 21:47
  • @Jona: Indeed, `id(x)` is not a solution to my question. – pts Dec 16 '19 at 22:55
  • I doubt you could do it in plain *Python*. Those are implementation details (that might change) that the end user simply doesn't need to know about. I wonder why you need it. Also note that when constructing some *CTypes* objects (out of *Python* ones), memory is copied (not shared), so the address might be bound to that specific *CTypes* type instance. – CristiFati Dec 17 '19 at 11:25
  • @CristiFati: There is no copying/sharing issue in the code I posted; you can see it yourself by running it. Of course, the returned pointer is valid only while the backing string object (`x`) exists, but that's true for all possible solutions. I agree that most Python code doesn't need this data pointer. However, some Python code interfacing with C code needs it, e.g. when *ctypes* is used to load code written in C, the C code needs a *const char \**, the Python code only has a *buffer* (because the caller passed a *buffer* rather than a *bytes*), and copying the *buffer* to *bytes* would be too slow. – pts Dec 17 '19 at 16:04
  • Is this an [XY Problem](https://meta.stackexchange.com/q/66377)? Why do you need the address? – Mark Tolonen Dec 18 '19 at 02:50
  • @MarkTolonen: No, it's not an XY problem, I do need the data pointer. I already provided a typical use case in my previous comment. – pts Dec 18 '19 at 19:34
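
To illustrate the use case from the comments (handing a buffer's data pointer to C code without copying), here is a minimal sketch that covers only *writable* buffers such as `bytearray`. It uses `ctypes.c_char.from_buffer`, which rejects read-only objects like `bytes`, so it does not cover the read-only case the question asks about. The helper name `get_writable_data_ptr` is hypothetical, and `ctypes.string_at` stands in for the C function that would consume the pointer.

```python
import ctypes


def get_writable_data_ptr(buf):
  """Return the address of the first data byte of a writable buffer.

  Assumes buf is writable, C-contiguous and non-empty, because
  from_buffer raises an exception otherwise.
  """
  return ctypes.addressof(ctypes.c_char.from_buffer(buf))


data = bytearray(b'hello')
ptr = get_writable_data_ptr(memoryview(data)[1:])
# Read 4 bytes straight from the address, with no intermediate copy of data.
print(ctypes.string_at(ptr, 4))  # → b'ello'
```

The address stays usable only while `data` exists and is not resized.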

0 Answers