Buffers and Memoryview Objects explained for the non-C programmer

Question

Python 2.7 has introduced a new API for buffers and memoryview objects.

I read the documentation on them and I think I got the basic concept (accessing an object's internal data in raw form without copying it, which I suppose means a faster and less memory-hungry way to get at that data), but to really understand the documentation, the reader needs more knowledge of C than I have.

I would be very grateful if somebody would take the time to:

explain buffers and memoryview objects in layman's terms, and
describe a scenario in which using buffers and memoryview objects would be "the Pythonic way" of doing things.

Take a look at [this answer](http://stackoverflow.com/questions/3422685/what-is-python-buffer-type-for/3422740#3422740), which explains their use in pure Python, though it doesn't go into the C API. — Scott Griffiths, Jul 19 '11 at 08:55
@Scott - Already read (and upvoted) before I posted this question. :) Very useful indeed. @agf's answer (and link!) helped me understand more... Still the C-API thing confuses me a bit: is that only mentioned to explain the rationale for the creation of the `memoryview` type, or is there something I should absolutely know about it? And also: why is it not possible to write objects exposing the buffer interface? It's a design choice from Guido & Co. or it's an implicit limitation of python internal working? — mac, Jul 19 '11 at 09:01
@agf - Thank you for the comment, but I think my comment was misunderstood: I was referring to @Scott's own answer that he linked in his comment. The last sentence of the answer is: "Note also that you can't implement a buffer interface for your own objects without delving into the C API, i.e. you can't do it in pure Python." — mac, Jul 19 '11 at 09:24
@mac Yeah, I missed that. I think the C-API is mentioned so often because until Python 2.6, it was the only way to use the (old) buffer interface / protocol. So people already familiar with "buffers" in Python think of them in that light. You don't need to know anything about it to use the 2.6+ buffer object or the 2.7+ memoryview object. It does, however, explain why buffer objects are read only: because buffers have always been read only from code written in Python, so we need to call a read-write buffer available to Python code something else (a memoryview). — agf, Jul 19 '11 at 09:40
My understanding is that there's no fundamental reason why pure Python types couldn't implement a memoryview interface, it just hasn't been provided. I guess that if you really need it then you're probably doing something fairly low-level and performance critical and are already using C (or perhaps should be using it). That's not my experience, but either way I don't think we'll get pure Python memoryview types any time soon. — Scott Griffiths, Jul 19 '11 at 09:44

score 5 · Accepted Answer · edited May 23 '17 at 12:16

Here's a line from a hash function I wrote:

M = tuple(buffer(M, i, Nb) for i in range(0, len(M), Nb))

This will split a long string, M, into shorter 'strings' of length Nb, where Nb is the number of bytes / characters I can handle at a time. It does this WITHOUT copying any parts of the string, as would happen if I made slices of the string like so:

M = tuple(M[i:i+Nb] for i in range(0, len(M), Nb))

I can now iterate over M just as I would had I sliced it:

H = key
for Mi in M:
    H = encrypt(H, Mi)
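
The same loop pattern can be sketched with a memoryview in place of buffer. This is an illustration only: `encrypt` from the answer is not shown, so the sketch swaps in `hashlib.sha256` as a stand-in consumer, and `message`, `Nb = 8` and the digest variable names are made up for the example. It assumes Python 3, where `buffer` no longer exists.

```python
import hashlib

# Hypothetical input and chunk size, chosen only for illustration.
message = b"The quick brown fox jumps over the lazy dog"
Nb = 8

view = memoryview(message)
h = hashlib.sha256()
for i in range(0, len(view), Nb):
    # view[i:i + Nb] is another memoryview onto the same bytes,
    # so no intermediate bytes object is created for the chunk.
    h.update(view[i:i + Nb])

chunked_digest = h.hexdigest()
whole_digest = hashlib.sha256(message).hexdigest()
```

Feeding the chunks one by one should give the same digest as hashing the whole message at once, which is a quick way to check that the views cover the data exactly.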

Basically, buffers and memoryviews are efficient ways to deal with the immutability of strings in Python and with the copying that slicing and similar operations normally do. A memoryview is just like a buffer, except that you can also write through it when the underlying object is mutable (a bytearray, for example), not just read.
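
A minimal sketch of the "view, not copy" idea, assuming a bytearray as the underlying object so that writes are allowed. The names `data`, `view`, `chunks` and `chunk_bytes` are illustrative and not from the original answer.

```python
# A mutable underlying object, so the views are writable.
data = bytearray(b"abcdefghij")
Nb = 4  # chunk size in bytes

view = memoryview(data)
# Slicing a memoryview yields more memoryviews; nothing is copied here.
chunks = tuple(view[i:i + Nb] for i in range(0, len(view), Nb))

# tobytes() does copy; it is used here only to inspect the contents.
chunk_bytes = [c.tobytes() for c in chunks]

# Writing through a chunk changes the original bytearray,
# which shows that the chunk shares memory with `data`.
chunks[0][0:1] = b"X"
```

After the last line, `data` itself starts with `X`, whereas a slice such as `data[0:4]` would have been an independent copy.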

While the main buffer / memoryview doc is about the C implementation, the standard types page has a bit of info under memoryview: http://docs.python.org/library/stdtypes.html#memoryview-type

Edit: Found this in my bookmarks; it is a REALLY good brief writeup: http://webcache.googleusercontent.com/search?q=cache:Ago7BXl1_qUJ:mattgattis.com/2010/3/9/python-memory-views+site:mattgattis.com+python&hl=en&client=firefox-a&gl=us&strip=1

Edit 2: Turns out I got that link from the question "When should a memoryview be used?" in the first place. That question was never answered in detail and the link was dead, so hopefully this helps.

Wait... can you actually *mutate strings* (and/or other immutable objects) using a `memoryview`? — kindall, Nov 13 '13 at 16:10
No - you can only mutate mutable objects, like a bytearray. memoryview has a `readonly` attribute, which will tell you if you can. — Russia Must Remove Putin, Dec 28 '16 at 01:30
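
The `readonly` attribute mentioned in the comment above can be checked with a short sketch. The variable names are illustrative only.

```python
ro = memoryview(b"hello")             # bytes are immutable
rw = memoryview(bytearray(b"hello"))  # bytearray is mutable

ro_flag = ro.readonly  # True: the view cannot be written through
rw_flag = rw.readonly  # False: the view is writable

# Writing through a read-only view raises TypeError.
try:
    ro[0:1] = b"J"
    write_failed = False
except TypeError:
    write_failed = True

# Writing through the bytearray-backed view succeeds.
rw[0:1] = b"J"
result = rw.tobytes()  # b"Jello"
```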

score 1 · Answer 2 · edited Oct 18 '16 at 19:59


Part of the answer I was looking for is that buffer is the "old way" and memoryview is the new way, which was backported to 2.7 - see the archived blog here

This doesn't answer my question of why the C API I thought I implemented in 2.7 lets me construct a buffer but not a memoryview...

To get memoryview to work in Python 2.7, you need to have the Py_TPFLAGS_HAVE_NEWBUFFER flag set in tp_flags. I found that the built-in bytearray source was a good reference; it is in Include/bytearrayobject.h and Objects/bytearrayobject.c.

answered Nov 13 '13 at 16:06 by Ben · edited Oct 18 '16 at 19:59 by Mr_and_Mrs_D

The link in this answer is dead. – shuttle87 Nov 19 '15 at 17:32
I resurrected the link buhahah – Mr_and_Mrs_D Oct 18 '16 at 20:00
