
It seems that, because accessing NumPy array data doesn't require calls into the Python interpreter, C extensions can manipulate these arrays after releasing the GIL; see, for instance, this thread.

The built-in Python type bytearray supports the buffer protocol. One member of the protocol's Py_buffer structure is

void *buf

A pointer to the start of the logical structure described by the buffer fields. [...] For contiguous arrays, the value points to the beginning of the memory block.

My question is, can a C extension manipulate this buf after releasing the GIL (Py_BEGIN_ALLOW_THREADS) since accessing it no longer requires calls to the Python C API? Or does the nature of the Python garbage collector forbid this, since the bytearray, and its buf, might be moved during execution?

jpavel
  • The `Py_buffer` struct holds a reference, and the `bytearray` can't be resized until all buffer exports have been released (i.e. while `ob_exports > 0`). – Eryk Sun Nov 06 '12 at 10:27
  • Thanks - and I take that to mean too that the buf won't be moved (as opposed to released) by the GC while the buffer is exported? Does Python even use a [moving GC](http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Moving_vs._non-moving)? – jpavel Nov 07 '12 at 07:43
  • The `bytearray` is deallocated when the reference count reaches 0. Python's GC attempts to resolve reference cycles in container types (e.g. lists, classes). – Eryk Sun Nov 07 '12 at 11:43
  • To clarify my own question: provided that the bytearray's reference count doesn't drop to 0, one knows that it won't be released. If Python had a compacting GC, though, the object might be moved by another thread (the one running the GC) while the GIL-less extension thread was still operating on the old memory address. Since Python doesn't use a compacting GC ([e.g. this discussion](http://python.6.n6.nabble.com/gc-ideas-sparse-memory-td1890016.html)), this won't be a problem, and the above technique should work fine. – jpavel Nov 08 '12 at 19:04
  • Python's GC augments (not replaces) reference counting for containers (not a `bytearray`). A tracked type is flagged as `Py_TPFLAGS_HAVE_GC` and has the method `tp_traverse` (and `tp_clear` if it's mutable) as [described here](http://docs.python.org/2/c-api/gcsupport.html). Pointers for next/prev are prepended to tracked objects to support GC linked lists. – Eryk Sun Nov 09 '12 at 03:43
  • If you're interested in Python's memory allocation strategy for objects, the comments in [obmalloc.c](http://hg.python.org/cpython/file/70274d53c1dd/Objects/obmalloc.c#l20) cover it in some detail. Allocations larger than 256 bytes are punted to `malloc`. For small objects, memory is allocated out of 4 KiB pools. Each pool has a storage class (mutable as the pool gets reused), with block sizes ranging from 8 to 256 bytes in multiples of 8. Pools are allocated from the system in 256 KiB arenas. – Eryk Sun Nov 09 '12 at 04:19
  • Those are quite helpful references, particularly pointing me to obmalloc.c! – jpavel Nov 11 '12 at 00:26
  • Also note that in [Python 3.3, obmalloc.c](http://hg.python.org/cpython/file/bd8afb90ebf2/Objects/obmalloc.c) has been updated to use anonymous `mmap` to allocate arenas in order to reduce heap fragmentation (see [issue 11849](http://bugs.python.org/issue11849)). Also the maximum block size was increased from 256 up to 512 bytes to better accommodate 64-bit systems (e.g. a dict object in a 64-bit system is 280 bytes, which before would have been allocated with `malloc`). – Eryk Sun Nov 11 '12 at 06:00
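
The two points made in the comments (a `bytearray` is not tracked by the cycle collector, and it cannot be resized while a buffer export is outstanding) can be checked from pure Python. This is only a sketch: `memoryview` stands in for the `Py_buffer` a C extension would obtain with `PyObject_GetBuffer`, and `gc.is_tracked` reports a CPython implementation detail.

```python
import gc

data = bytearray(b"0123456789")

# A bytearray holds no references to other objects, so CPython's cycle
# collector does not track it; a list, by contrast, is tracked.
bytearray_tracked = gc.is_tracked(data)
list_tracked = gc.is_tracked([])

# Export the buffer. memoryview is used here as a stand-in for the
# Py_buffer that a C extension would hold.
view = memoryview(data)

# While the export is alive, any operation that would resize (and thus
# possibly reallocate) the underlying memory is refused.
try:
    data.extend(b"more")
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

# Once the export is released, resizing works again.
view.release()
data.extend(b"more")

print(bytearray_tracked, list_tracked, resized_while_exported)
print(data)
```

Because the resize is refused rather than deferred, the address handed out in `buf` stays valid for as long as the export is held.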

1 Answer


To expand on the short answer given in the comments: you can access the data at *buf without holding the GIL, provided you are sure that the Py_buffer struct is "owned" by the thread for as long as it runs without the GIL.
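
As a rough illustration that needs no compiled extension, the same pattern can be approximated with `ctypes`. This is a sketch under assumptions, not the C code from the question: `from_buffer()` plays the role of `PyObject_GetBuffer`, and `ctypes.memset` plays the role of the work done between `Py_BEGIN_ALLOW_THREADS` and `Py_END_ALLOW_THREADS` (to my knowledge `ctypes` releases the GIL around foreign functions created with `CFUNCTYPE`, which is how `memset` is defined). The helper `buffer_address` is a hypothetical name introduced only for this example.

```python
import ctypes
import gc


def buffer_address(ba):
    # Hypothetical helper: take a fresh, short-lived buffer export and
    # return the address it reports for the start of the data.
    return ctypes.addressof((ctypes.c_char * len(ba)).from_buffer(ba))


data = bytearray(16)

# from_buffer() requests a writable buffer export, so c_buf aliases the
# same memory that *buf would point to in a C extension.
c_buf = (ctypes.c_char * len(data)).from_buffer(data)
addr_before = buffer_address(data)

# The write happens in C, outside the interpreter. The thread making the
# call is the one that "owns" the export for the duration.
ctypes.memset(c_buf, 0x41, len(data))

# A collection pass does not move the buffer: CPython's GC is non-moving.
gc.collect()
addr_after = buffer_address(data)

print(bytes(data))
print(addr_before == addr_after)
```

In a real extension the equivalent sequence would be to obtain the buffer, release the GIL, work on `buf`, reacquire the GIL, and only then call `PyBuffer_Release`.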

For the sake of completeness, I should add that this opens the door to a (very remote) risk of crashes: if the GIL-less thread reads the data at *buf while another, GIL-holding thread runs Python code that changes the same data (bytearray[index]=x), then the GIL-less thread can see the data change unexpectedly underneath it. The opposite is true too, and even more annoying (but still theoretical): if the GIL-less thread changes the data at *buf, then other GIL-holding threads running Python code might see strange results, or maybe even crash, when doing a complex read operation like bytearray.split().
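
The root of that risk is easy to see without any threads: an outstanding export blocks resizing, but it does not block in-place writes. A minimal single-threaded sketch, with `memoryview` again standing in for the extension's `Py_buffer`:

```python
data = bytearray(b"abcdef")

# Stand-in for the Py_buffer held by the GIL-less extension thread.
view = memoryview(data)
seen_before = bytes(view)

# Ordinary Python code may still overwrite bytes in place while the
# export is outstanding; only resizing is refused.
data[0] = ord("X")
seen_after = bytes(view)

view.release()

print(seen_before)
print(seen_after)
```

With real threads, the write could land at any point during the extension's read, so the export guarantees a stable address but not stable contents. Any further synchronization is up to the code that shares the bytearray.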

Armin Rigo