Suppose I have code such as
#include <stdint.h>

namespace
{
    struct Thing {
        uint8_t a;
        uint8_t b;
        uint8_t c;
        uint8_t d;
    };

    auto& magicRegister = *reinterpret_cast<volatile uint8_t*>(0x1234);
    auto magicMemory = reinterpret_cast<Thing*>(0x2000); // 0x2000 - 0x20ff
}

int main()
{
    magicMemory[0] = { .a = 1, .b = 2, .c = 3, .d = 4 };
    magicRegister = 0xff; // Transfer
    magicMemory[0].d = 5;
    magicRegister = 0xff; // Transfer
}
in which writing 0xff to the byte at address 0x1234 makes the system copy 256 bytes from the region starting at 0x2000 to inaccessible memory. The contents of the memory region only matter at the moment the transfer takes place.
How can I guarantee that GCC's optimization won't mess up the order of operations?
I see a few options.
- Make magicMemory volatile: I dislike this, because the contents of the memory region don't matter until the transfer is initiated.
- Add asm("" ::: "memory"); before each transfer: simple and effective, but may prevent some unrelated code from being properly optimized.
int main()
{
    magicMemory[0] = { .a = 1, .b = 2, .c = 3, .d = 4 };
    asm("" ::: "memory");
    magicRegister = 0xff; // Transfer
    magicMemory[0].d = 5;
    asm("" ::: "memory");
    magicRegister = 0xff; // Transfer
}
- Same as #2, but use
asm(""
    :
    : "m"(*reinterpret_cast<uint8_t*>(0x2000)),
      "m"(*reinterpret_cast<uint8_t*>(0x2001)),
      "m"(*reinterpret_cast<uint8_t*>(0x2002)),
      // ...
      "m"(*reinterpret_cast<uint8_t*>(0x20fd)),
      "m"(*reinterpret_cast<uint8_t*>(0x20fe)),
      "m"(*reinterpret_cast<uint8_t*>(0x20ff)));
instead: stupid, but technically should work. Sadly, it crashes the GCC port I'm using.
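A possible variation on the last option, which might avoid listing every byte: GCC's extended asm documentation describes casting a pointer to an array type so that a single "m" operand covers a whole block of memory. The following is only a minimal sketch under that assumption. The names fakeMemory, fakeRegister, publishRegion, transfer and demo are hypothetical, and the buffer and register are ordinary variables standing in for 0x2000 and 0x1234 so that the sketch can run on a host machine. Whether this avoids the crash in the GCC port mentioned above is untested.

```cpp
#include <stdint.h>

// Hypothetical stand-ins for the hardware: on the real target these would be
// the region at 0x2000 and the register at 0x1234.
uint8_t fakeMemory[256];
volatile uint8_t fakeRegister;

// Tell the compiler that this (empty) asm statement reads all 256 bytes
// starting at `region`. No "memory" clobber is used, so other objects are
// not declared as touched.
inline void publishRegion(const uint8_t* region)
{
    asm volatile(""
                 :
                 : "m"(*reinterpret_cast<const uint8_t(*)[256]>(region)));
}

// Make the pending stores to the region visible, then trigger the transfer.
void transfer()
{
    publishRegion(fakeMemory);
    fakeRegister = 0xff; // Transfer
}

// Mirrors the sequence in the question, using plain bytes instead of Thing.
uint8_t demo()
{
    fakeMemory[0] = 1;
    fakeMemory[1] = 2;
    fakeMemory[2] = 3;
    fakeMemory[3] = 4;
    transfer();
    fakeMemory[3] = 5;
    transfer();
    return fakeMemory[3];
}
```

Because the operand names only the 256-byte region, stores to unrelated objects should in principle remain free to be optimized. The ordering against the volatile register write relies on the asm statement being volatile; that appears to hold for GCC in practice, but it is an assumption here rather than something verified for every port.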
EDIT: What if I also had this:
uint8_t unrelatedVariable;

int main()
{
    unrelatedVariable = 5; // This is redundant and should be optimized away
    // same as before
    unrelatedVariable = 6;
}
Is it possible to fix the ordering of accesses to magicRegister and magicMemory without affecting unrelatedVariable?