forcing 32 bit access using python mmap?

Question

I run 64-bit Python on a 64-bit ARM processor. One of the AXI buses of this processor is connected to an FPGA (which does bus and clock domain changes down to a 32-bit-wide bus). This piece of hardware does not like 64-bit accesses...

I am trying to access this FPGA via python mmap like this (within a class):

def __init__(self, base, len):
    self.base = base
    self.fd = open("/dev/mem", "r+")
    self.lw = mmap.mmap(self.fd.fileno(),
                        len,
                        mmap.MAP_SHARED,
                        mmap.PROT_READ | mmap.PROT_WRITE,
                        offset=base)

def get_U32(self, offset):
    s = self.lw[offset:offset+4]
    return struct.unpack("<I", s)[0]

The idea was that get_U32() would read a 32-bit word from the bus (hence reading from offset to offset+4). Sadly, it seems that mmap performs a 64-bit access to the bus anyway (some kind of caching for performance optimization, I assume) and then performs the 32-bit "casting". The underlying FPGA is not happy...

In a C program, I simply write:

data = *((uint32_t *) address);

...and the CPU seems to gently perform a 32-bit access on its AXI bus, which the underlying hardware prefers. (So right now I have a slow workaround, where Python talks to the hardware through a C program, via pipes.)

Is there a way to force 64-bit Python to perform a 32-bit access, as the C line above obviously does?

The question is written about reading 32 bits here, but of course, writing 32 bits is needed as well...

Are you sure about that? On the FPGA, looking at the final 32 bit bus, I see only one strobe using the C program, while I see 2 strobes when using python: My assumption is therefore that the C program only accesses 32 bits. As far as I understand, some signaling on AXI allows for shorter accesses than the max 64 bits. Maybe I am wrong, as I cannot see that bus, but C and python accesses are clearly different at the far end. — user1159290, Mar 07 '19 at 08:41
If that C program is 64-bit, it is still using 64-bit to read the address. The assumption that it has something to do with the bus size doesn't really make sense. Besides, everything you do in the application is in its virtual memory space, you are not even dealing with physical memory addresses. — Havenard, Mar 07 '19 at 08:44
`mmap` uses the OS to access the memory. The C code is directly accessing a 32-bit quantity via its address. You're comparing apples and oranges. — martineau, Mar 07 '19 at 08:45
@martineau. Agreed. can you do "apple" accesses from python, then? — user1159290, Mar 07 '19 at 08:47
You said your C program works, would you show us the relevant piece of code? Perhaps there's something in the way you are initializing the `mmap` that matters. — Havenard, Mar 07 '19 at 08:48
You might be able to do something like that with the `di()` function shown in [this answer](https://stackoverflow.com/questions/53937652/reference-a-byte-array-as-an-integer/53938035#53938035). You might also want to try using a 32-bit version of the Python interpreter to run your script. — martineau, Mar 07 '19 at 08:51
The C program uses mmap too, of course: `fd = open("/dev/mem", O_RDWR | O_SYNC)` and `map_base_axi = mmap(0, map_sz[1], PROT_READ | PROT_WRITE, MAP_SHARED, fd, map_addr);` and then the line I quoted. — user1159290, Mar 07 '19 at 08:55
Yeah, if you try it in 32-bit Python just to see what happens it could clarify things. I'm still finding it hard to believe it has something to do with the memory address size, especially when you are explicitly telling it to read 32 bits. — Havenard, Mar 07 '19 at 08:55
Try with `fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)`. It looks like you're using a different `open()`; if it is equivalent to C's `fopen()` it won't work, since `fopen()` does indeed implement buffering. — Havenard, Mar 07 '19 at 09:11
@havenard. I liked your idea, but sadly, using os.open behaves the same: 2 probes on the 32-bit side... — user1159290, Mar 07 '19 at 12:16
OK. The read problem goes away with this: `s = ctypes.c_uint32.from_buffer(self.lw, offset).value`. But the same trick does not work for writes: `ctypes.c_uint32.from_buffer(self.lw, offset).value = s` generates first a read access and then a write access... — user1159290, Mar 07 '19 at 16:01

score 2 · Answer 1 · answered Mar 08 '19 at 07:32

Based on the idea from @martineau, the double probe can be fixed using python ctypes, like:

s = ctypes.c_uint32.from_buffer(self.lw, offset).value #read

or

ctypes.c_uint32.from_buffer(self.lw, offset).value = s #write

This does indeed seem to force Python into doing the same 32-bit access as in C, and removes the double read or write probe on the 32-bit bus.

However, sadly, Python seems to do a read before each write. So the solution above works perfectly for reading, but when writing, I still get a read access before the write access. In C, of course, I get just a single write access when writing.

I am posting this for others who may be interested. If you have a solution to this last issue (read before write), please post it.
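
For reference, the two ctypes lines above can be put together as a minimal, self-contained sketch. The class name `Reg32` and its method names are hypothetical, and an anonymous mapping stands in for the `/dev/mem` window so that the code runs without hardware; it only demonstrates the Python-level calls and says nothing about what actually appears on the AXI bus.

```python
import ctypes
import mmap


class Reg32:
    """Hypothetical helper: 32-bit register access through ctypes views."""

    def __init__(self, mm):
        # mm is any writable mmap object, e.g. one created over /dev/mem
        self.mm = mm

    def read(self, offset):
        # from_buffer() builds a 4-byte view at `offset` instead of slicing the mmap
        return ctypes.c_uint32.from_buffer(self.mm, offset).value

    def write(self, offset, value):
        # Mask to 32 bits so oversized integers are truncated explicitly
        ctypes.c_uint32.from_buffer(self.mm, offset).value = value & 0xFFFFFFFF


# Assumption: an anonymous, zero-filled mapping replaces the FPGA window here.
mm = mmap.mmap(-1, mmap.PAGESIZE)
regs = Reg32(mm)
regs.write(8, 0xDEADBEEF)
value = regs.read(8)
print(hex(value))
```

On real hardware the mapping would come from `/dev/mem` as in the question, and the read-before-write behaviour described above still applies to `write()`.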

We had luck with the `memoryview` solution from [this answer](https://stackoverflow.com/a/53492789) to the same issue. — Colm Ryan, Mar 11 '20 at 04:36
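
A minimal sketch of what a `memoryview`-based variant could look like, assuming the approach is to cast the mapping to 32-bit items and index it word by word. The helper names `read32` and `write32` are hypothetical, and an anonymous mapping replaces `/dev/mem` so the code runs anywhere. Whether each item access ends up as a single 32-bit bus transaction would still have to be checked on the FPGA side.

```python
import mmap

# Assumption: an anonymous, zero-filled mapping stands in for the /dev/mem window.
mm = mmap.mmap(-1, mmap.PAGESIZE)

# "I" is the native unsigned int format; the check below guards the 4-byte assumption.
words = memoryview(mm).cast("I")
if words.itemsize != 4:
    raise RuntimeError("native unsigned int is not 32 bits on this platform")


def read32(offset):
    # Byte offset -> word index; offset is expected to be 4-byte aligned
    return words[offset // 4]


def write32(offset, value):
    # memoryview rejects out-of-range integers, so mask to 32 bits first
    words[offset // 4] = value & 0xFFFFFFFF


write32(8, 0xCAFEF00D)
print(hex(read32(8)))
```

Item assignment on the cast view does not go through a slice copy of the mapping, which is the property this variant relies on.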

score 0 · Answer 2 · answered Jan 29 '21 at 03:47

Solution on this thread

You could try something like this:

def __init__(self,offset,size=0x10000):
    self.offset = offset
    self.size = size
    
    mmap_file = os.open('/dev/mem', os.O_RDWR | os.O_SYNC)
    mem = mmap.mmap(mmap_file, self.size,
                    mmap.MAP_SHARED,
                    mmap.PROT_READ | mmap.PROT_WRITE,
                    offset=self.offset)
    os.close(mmap_file)
    self.array = np.frombuffer(mem, np.uint32, self.size >> 2)
    
def wread(self,address):
    idx = address >> 2
    return_val = int(self.array[idx])
    return return_val
    
def wwrite(self,address,data):
    idx = address >> 2
    self.array[idx] = np.uint32(data)
