http://docs.python.org/2/c-api/buffer.html

int ndim

The number of dimensions the memory represents as a multi-dimensional array. If it is 0, strides and suboffsets must be NULL.

What's the real-world use for this? Is it used for scatter/gather vector buffers?

est

1 Answer

Using ndim and shape is primarily for multidimensional fixed-shape arrays. For example, if you wanted to build something like NumPy from scratch, you might build it around the buffer API. There are also variations to make things easy for NumPy, PIL, and modules that wrap typical C and Fortran array-processing libraries.
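
To see what `ndim`, `shape`, and `strides` look like from the Python side, here is a minimal sketch. It assumes Python 3.3 or later, where `memoryview.cast` accepts a `shape` argument; the Python 2 API discussed in the question does not offer this, and the variable names are only illustrative.

```python
# Twelve bytes in one flat, contiguous buffer.
flat = memoryview(bytes(range(12)))

# Reinterpret the same memory as a 3x4 array of unsigned bytes.
# No data is copied; only ndim, shape, and strides change.
grid = flat.cast("B", shape=[3, 4])

print(flat.ndim, flat.shape, flat.strides)  # 1 (12,) (1,)
print(grid.ndim, grid.shape, grid.strides)  # 2 (3, 4) (4, 1)

# Every row has the same length: the shape is fixed, not jagged.
rows = grid.tolist()
print(rows[1])  # [4, 5, 6, 7]
```

The strides `(4, 1)` say that moving one step along the first dimension skips 4 bytes and one step along the second skips 1 byte, which is the kind of bookkeeping a NumPy-like library needs.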

If you read a bit further down, the next two values both say "See complex arrays for more information." If you click that link, it gives you an example of doing something like NumPy, and describes how it works.

Also see PEP 3118 for some rationale.

It's not (primarily) for jagged-shaped arrays, like the scatter/gather use case. While you can use PIL-style suboffsets for that, it's generally simpler to just use a list or array of buffers (unless you're trying to interface with PIL, of course).
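
As a rough illustration of the list-of-buffers approach, here is a minimal sketch using `socket.sendmsg` and `socket.recvmsg_into`. It assumes Python 3.3 or later on a Unix-like platform (these methods are not available everywhere), and the function name and buffer contents are hypothetical.

```python
import socket


def gather_send_scatter_recv():
    """Send three separate buffers in one call, then receive into three others."""
    left, right = socket.socketpair()
    try:
        # Gather: the kernel writes these buffers back to back.
        parts = [b"HEAD", b"body-bytes", b"END"]
        sent = left.sendmsg(parts)

        # Scatter: incoming bytes fill each buffer in order.
        header = bytearray(4)
        body = bytearray(10)
        trailer = bytearray(3)
        nbytes, _ancdata, _flags, _addr = right.recvmsg_into(
            [header, body, trailer]
        )
        return sent, nbytes, bytes(header), bytes(body), bytes(trailer)
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(gather_send_scatter_recv())
```

Each buffer here is an ordinary one-dimensional object, and the buffers can have different lengths, so nothing multidimensional is involved.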

(The old-style buffer API did support a mode designed specifically for scatter/gather-like use, but it was dropped in Python 3.x, and deprecated in 2.6+ once the 3.x API was backported, basically because nobody ever used it.)

abarnert
  • Thanks a million! I'm new to scatter/gather; is there any recommended or related info to get started with? (I'm preparing to write network/file I/O tasks as an exercise.) – est Nov 27 '13 at 02:16
  • @est: Python 2.x doesn't provide any scatter/gather APIs. Python 3 has [`recvmsg_into`](http://docs.python.org/3.3/library/socket.html#socket.socket.recvmsg_into) and [`sendmsg`](http://docs.python.org/3.3/library/socket.html#socket.socket.sendmsg) on its sockets, but they work by just taking an iterable of 1D buffers (a `list` of `bytearray`s works fine), so there's no need for anything fancier. If you're using some third-party library, or `ctypes`-ing directly to platform-specific functions like `writev`, then… well, each one will have different requirements. – abarnert Nov 27 '13 at 02:26
  • @est: For example, on most *nix platforms with `writev` and/or `sendmsg`, you have to create an array of `struct iovec` objects, each of which has a buffer pointer and a length. No Python type has exactly that shape, so you have to build it out of a `ctypes.Structure`. – abarnert Nov 27 '13 at 02:29
  • Could you explain how to implement jagged-shaped arrays using suboffsets? The length of the second dimension is constant even if you have a PIL-style pointer list in the first dimension, isn't it? – molnarg Dec 23 '13 at 13:14
  • @molnarg: When you're creating a PIL pixel array, the length of the second dimension is constant, but there's no reason it has to be. Instead of creating a C array of arrays and two length variables, create a C array of arrays, an outer length, and an array of inner lengths. Then your `bf_getsegcount` returns the outer length, and each `bf_getcharbuffer`/`bf_getreadbuffer`/etc. returns the segment length from the inner length array. – abarnert Dec 23 '13 at 18:13
  • @molnarg: [Here](http://pastebin.com/Z7jQU9DR) is an implementation of the relevant parts (with incomplete error handling; e.g., if you run out of memory halfway through allocating the Jagged, you'll leak all the buffers allocated so far). – abarnert Dec 23 '13 at 19:09
  • @abarnert Thanks for the insightful code! You're right, it's doable if one uses a list of lists. – molnarg Dec 23 '13 at 21:37
  • @molnarg: Well, I was using C arrays of C arrays… but yeah, you could actually use a list of C arrays, or even (if you don't want `getcharbuffer` support) a list of lists (with the same tricks for using a list for a flat buffer). However, I'm really not sure you want to put too much effort into learning this. Remember, it's a deprecated protocol, and even in 2.x there's very little existing code that will take advantage of your jagged 2D buffers. And if you're writing both the producer and consumer side in C, why even bother going through Python buffers in between? – abarnert Dec 23 '13 at 21:48
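
The `struct iovec` approach mentioned in the comments above might look roughly like the following. This is only a sketch under stated assumptions: a Unix-like platform whose C library exports `writev`, and Python 3 for the `bytes` handling. The class name `IOVec` and the wrapper function `writev_chunks` are hypothetical; only the field layout (a pointer and a length) comes from the thread. Note that Python 3.3+ already ships `os.writev`, so the `ctypes` route is mainly illustrative.

```python
import ctypes
import ctypes.util
import os


class IOVec(ctypes.Structure):
    """Mirror of C's struct iovec: a buffer pointer plus its length."""
    _fields_ = [("iov_base", ctypes.c_void_p),
                ("iov_len", ctypes.c_size_t)]


# Assumption: the platform C library exports writev(int, struct iovec *, int).
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.writev.argtypes = [ctypes.c_int, ctypes.POINTER(IOVec), ctypes.c_int]
libc.writev.restype = ctypes.c_ssize_t


def writev_chunks(fd, chunks):
    """Write several bytes objects to fd with a single writev() call."""
    # Copy each chunk into a C buffer and keep the list alive during the call.
    bufs = [ctypes.create_string_buffer(c, len(c)) for c in chunks]
    vec = (IOVec * len(bufs))()
    for i, buf in enumerate(bufs):
        vec[i].iov_base = ctypes.addressof(buf)
        vec[i].iov_len = len(buf)
    written = libc.writev(fd, vec, len(bufs))
    if written < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return written


if __name__ == "__main__":
    r, w = os.pipe()
    print(writev_chunks(w, [b"scatter", b"/", b"gather"]))
    os.close(w)
    print(os.read(r, 64))
    os.close(r)
```

The chunks can have different lengths, which is exactly the jagged case; each one is still a plain one-dimensional buffer.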