I'm working on a runtime library that uses user-level context switching (using Boost::Context), and I'm having trouble using thread_local variables. Consider the following (reduced) code:
thread_local int* volatile tli;

int main()
{
    tli = new int(1);           // part 1, done by thread 1
    UserLevelContextSwitch();
    int li = *tli;              // part 2, done by thread 2
    cout << li;
}
Since there are two accesses to the thread_local variable, the compiler transforms the main function into something along these lines (reverse-engineered from the assembly):
register int** ptli = &tli; // cache address of thread_local variable
*ptli = new int(1);
UserLevelContextSwitch();
int li = **ptli;
cout << li;
This seems to be a legal optimization, since the value of the volatile tli is not being cached in a register. But the address of the volatile tli is in fact being cached, and is not reloaded in part 2.
And that's the problem: after the user-level context switch, the thread that did part 1 goes off to do something else. Part 2 is then picked up by some other thread, which inherits the previous stack and register state. Because the cached address is part of that state, the thread executing part 2 reads the tli that belongs to thread 1 instead of its own.
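To make the failure mode concrete, here is a minimal sketch that mimics it with two real OS threads (std::thread) rather than a user-level context switch. The names Observation and observe_cached_tls_address are made up for this illustration, and the pointer is cached by hand to stand in for what the compiler does; it is not my actual runtime code.

```cpp
#include <cassert>
#include <thread>

thread_local int* volatile tli = nullptr;

// What the second thread observes (hypothetical helper type for this demo).
struct Observation {
    bool addresses_differ;    // &tli on thread 2 differs from the cached address
    int  via_cached_address;  // value read through the address cached on thread 1
    bool own_slot_is_null;    // thread 2's own tli was never assigned
};

Observation observe_cached_tls_address()
{
    assert(tli == nullptr && "demo expects the calling thread's tli to be unset");

    int* volatile* cached = &tli;  // address computed on the calling thread ("thread 1")
    *cached = new int(1);          // part 1

    Observation obs{};
    std::thread second([&] {
        // "Part 2" runs on another OS thread but still uses the cached address.
        obs.addresses_differ   = (&tli != cached);
        obs.via_cached_address = **cached;        // reads thread 1's instance
        obs.own_slot_is_null   = (tli == nullptr); // this thread's own instance
    });
    second.join();

    delete *cached;
    *cached = nullptr;
    return obs;
}
```

Calling observe_cached_tls_address() from main shows the second thread reading thread 1's value through the stale address while its own tli is still null, which is exactly what happens to part 2 after the context switch.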
I'm trying to figure out a way to prevent the compiler from caching the thread-local variable's address, and volatile doesn't go deep enough. Is there any trick (preferably standard, possibly GCC-specific) to prevent the caching of thread-local variables' addresses?