For the purpose of concurrent/parallel GC, I'm interested in what memory order guarantee is provided by the mprotect syscall (i.e. the behavior of mprotect with multiple threads or the memory model of mprotect). My questions are (assuming no compilier reordering or with sufficient compiler barrier)
If thread 1 triggers a segfault on an address due to a mprotect on thread 2, can I be sure that everything happens on thread 2 before the syscall can be observed in thread 1 in the signal handler of the segfault? What if a full memory barrier is placed in the signal handler before performing load on thread1?
If thread 1 does an volatile load on an address that is set to PROT_NONE by thread 2 and didn't trigger a segfault, is this enough of a happens before relation between the two. Or in another word, if the two threads do (
*ga
starts as0
,p
is a page aligned address started readonly)// thread 1 *ga = 1; *(volatile int*)p; // no segfault happens // thread 2 mprotect(p, 4096, PROT_NONE); // Or replace 4096 by the real userspace-visible page size a = *ga;
is there a guarantee that
a
on thread 2 will be1
? (assuming no segfault observed on thread 1 and no other code modifies*ga
)
I'm mostly interested in Linux behavior and particularly on x86(_64), arm/aarch64 and ppc though information about other archs/OS are welcome to (for windows, replace mprotect by VirtualProtect or whatever it is called....). So far my tests on x64 and aarch64 Linux suggests no violations of these though I'm not sure if my test is conclusive or if the behavior can be relied on in the long term.
Some searching suggests that mprotect
may issue a TLB shootdown on all threads with the address mapped when permission is removed which might provide the guarantee stated here (or in another word, providing this guarantee seems to be the goal of such operation) though it's unclear to me if future optimization of the kernel code could break this guarentee.
Ref LKML post where I asked about this a week ago with no reply yet...
Edit: clearification about the question. I was aware that a tlb shootdown should provide the guarantee I'm looking for but I'd like to know if such a behavior can be relied on. In another word, what's the reason such requests are issued by the kernel since it shouldn't be needed if not for providing some kind of ordering guarantee.