I use memcpy() to write data to a device. With a logic analyzer/PCIe analyzer, I can see the actual stores.
My device gets more stores than expected.
For example:

    auto *data = new uint8_t[1024]();
    for (int i = 0; i < 50; i++) {
        memcpy((void *)(addr), data, i);
    }

For i=9, I see these stores:
- 4B from byte 0 to 3
- 4B from byte 4 to 7
- 3B from byte 5 to 7 (1B-aligned only, re-writing the same data -> inefficient and useless store)
- 1B for byte 8
In the end, all 9 bytes are written, but memcpy creates an extra 3B store that only re-writes what it has already written.
Is this the expected behavior? The question applies to both C and C++. I'm interested in knowing why this happens, as it seems very inefficient.