2

I'm writing a userspace driver for accessing FPGA registers in Python 3.5 that mmaps the FPGA's PCI address space, obtains a memoryview to provide direct access to the memory-mapped register space, and then uses struct.pack_into("<I", ...) to write a 32-bit value into the selected 32-bit aligned address.

import mmap
import pathlib
import struct

def write_u32(address, data):
    assert address % 4 == 0, "Address must be 32-bit aligned"
    path = pathlib.Path("/dev/uio0")
    file_size = path.stat().st_size
    with path.open(mode='w+b') as f:
        mv = memoryview(mmap.mmap(f.fileno(), file_size))
        struct.pack_into("<I", mv, address, data)

Unfortunately, it appears that struct.pack_into does a memset(buf, 0, ...) that clears the register before the actual value is written. By examining write operations within the FPGA, I can see that the register is set to 0x00000000 before the true value is set, so there are at least two writes across the PCI bus. In fact, a 32-bit access produces three writes (two zero writes, then the actual data) and a 64-bit access produces six. This causes side effects with registers that count the number of write operations, that "clear on write", or that trigger some event when written.

I'd like to use an alternative method to write the register data in a single write to the memory-mapped register space. I've looked into ctypes.memmove and it looks promising (not yet working), but I'm wondering if there are other ways to do this.

Note that a register read using struct.unpack_from works perfectly.

Note that I've also eliminated the FPGA from this by using a QEMU driver that logs all accesses - I see the same double zero-write access before data is written.

I revisited this in 2022 and the situation hasn't really changed. If you're considering using memoryview to write blocks of data at once, you may find this interesting.

Braiam
davidA
  • I've looked into using `ctypes`, in particular `ctypes.from_buffer` and `ctypes.memmove`. The former works to a degree, but it does an initial read. The latter writes byte-by-byte so is unsuitable. I feel like I'm close - is there a way, perhaps using a ctypes pointer, to do an atomic write to the address that the pointer references? – davidA Nov 30 '18 at 02:46

2 Answers

2

Perhaps this would work as needed?

mv[address:address+4] = struct.pack("<I", data)

Update:

As seen from the comments, the code above does not solve the problem. The following variation of it does, however:

mv_as_int = mv.cast('I')
mv_as_int[address // 4] = data  # integer division: in Python 3, address/4 is a float and cannot be used as an index

Unfortunately, precise understanding of what happens under the hood and why exactly memoryview behaves this way is beyond the capabilities of modern technology and will thus stay open for the researchers of the future to tackle.
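For anyone who wants to try the indexing itself without hardware, here is a minimal sketch of the cast-and-index approach. It assumes an anonymous mmap as a stand-in for the device mapping, and the helper names `open_word_view`, `write_u32` and `read_u32` are only illustrative. Note that `cast('I')` uses the machine's native byte order and native `unsigned int` size (4 bytes on common platforms), unlike the explicit `"<I"` in the question. The sketch only demonstrates that word indexing puts the value at the right offset; it cannot show how many bus accesses a real device would see.

```python
import mmap
import struct

def open_word_view(buf):
    # View the buffer as native unsigned ints instead of single bytes.
    return memoryview(buf).cast('I')

def write_u32(words, address, data):
    assert address % 4 == 0, "Address must be 32-bit aligned"
    words[address // 4] = data  # byte address -> word index

def read_u32(words, address):
    assert address % 4 == 0, "Address must be 32-bit aligned"
    return words[address // 4]

# Assumption: an anonymous 4 KiB mapping stands in for the UIO device.
region = mmap.mmap(-1, 4096)
words = open_word_view(region)

write_u32(words, 8, 0xDEADBEEF)
value = read_u32(words, 8)

# The same four bytes, read back through the byte-oriented interface.
raw = struct.unpack_from("=I", region, 8)[0]
```

With a real device you would pass the mmap of the UIO file to `open_word_view` instead of the anonymous mapping.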

KT.
  • Thank you for your response. This does show an improvement - rather than three writes on the PCI bus (two zero + one data write) this only performs *two* writes, both with the data. I'm not sure if that's due to the way `memoryview` works. However I need to find a way to get it down to a single write. – davidA Nov 29 '18 at 21:54
  • Perhaps more interestingly, an attempt to write a single byte (to an aligned address) with `mv[address:address+1] = struct.pack(" – davidA Nov 29 '18 at 21:56
  • Interesting. What if you write by indexing, rather than slicing? I.e. `mv[address] = value`? – KT. Dec 01 '18 at 11:33
  • By the way, multiple writes resemble cache flushes, so another debugging direction is to try calling `mmap.flush` and/or `mmap.close` explicitly (and see at what point the first and second data writes take place). – KT. Dec 01 '18 at 11:51
  • Trying to do `mv[address] = data` results in `mmap item value must be in range(0, 256)` (so is expecting a byte), and `mv[address] = struct.pack(" – davidA Dec 02 '18 at 23:06
  • I have tried using `i = ctypes.c_int.from_buffer(mv); i.value = data` and this writes the correct data as an atomic 32-bit write, but for some reason it does a 32-bit read of the same address just prior to the write. This at least is better than multiple writes, but it's still not ideal because some registers have behaviour triggered by a read that would be unexpected during a typical write. Getting closer though. – davidA Dec 02 '18 at 23:08
  • What about `mv_as_int = mv.cast(int)` and then `mv_as_int[address // 4] = data`? – KT. Dec 03 '18 at 09:25
  • Brilliant - that works perfectly: `mv_as_int = mv.cast('I')` then `mv_as_int[address // 4] = data` to write, and just `data = mv_as_int[address // 4]` to read. Both result in single memory accesses. Thank you for persevering with my question! Would you mind updating your answer to reflect this discussion (please leave the original suggestion in too so the conversation makes sense) and I'll mark it as answered. – davidA Dec 03 '18 at 22:42
  • Cool. Nice to know it worked! It would be nice to understand the reasons behind this, but I wouldn't have the time to dive into the code (even finding it is nontrivial). Perhaps the byte-oriented memoryview decides to align the writes to 16 bits somehow, which the int-sized version avoids. Anyway, I updated the answer. – KT. Dec 03 '18 at 23:32
1

You could try something like this:

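import mmap
import os

import numpy as np


class RegisterMap:  # illustrative name; the original snippet omitted the class statement
    def __init__(self, offset, size=0x10000):
        self.offset = offset
        self.size = size

        mmap_file = os.open('/dev/mem', os.O_RDWR | os.O_SYNC)
        mem = mmap.mmap(mmap_file, self.size,
                        mmap.MAP_SHARED,
                        mmap.PROT_READ | mmap.PROT_WRITE,
                        offset=self.offset)
        os.close(mmap_file)  # the mapping stays valid after the descriptor is closed
        # View the mapping as an array of 32-bit words
        self.array = np.frombuffer(mem, np.uint32, self.size >> 2)

    def wread(self, address):
        idx = address >> 2  # byte address -> 32-bit word index
        return int(self.array[idx])

    def wwrite(self, address, data):
        idx = address >> 2  # byte address -> 32-bit word index
        self.array[idx] = np.uint32(data)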