I am really confused about what GCC's __thread keyword does behind the scenes.
Could anyone give me some information?
It declares a variable as thread-local, much like the C++11 thread_local keyword. The important difference is that thread_local allows dynamic initialization, whereas __thread only accepts a constant initializer.
The ability to do dynamic initialization may have a noticeable performance impact (a function call per access), as explained in this answer.
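To make the initialization difference concrete, here is a minimal sketch assuming GCC or Clang on a platform with thread support. The names make_label, hits, label and fresh_thread_sees_initial_values are hypothetical and chosen only for illustration.

```cpp
#include <string>
#include <thread>

// Hypothetical helper: stands in for any initializer that has to run code.
static std::string make_label() { return "worker"; }

// GCC extension: the initializer must be a constant expression.
static __thread int hits = 42;

// C++11: dynamic initialization is allowed. The object is constructed
// separately for each thread that uses it.
static thread_local std::string label = make_label();

// GCC is expected to reject the following line, because a __thread
// variable must not require dynamic initialization:
// static __thread std::string bad = make_label();

// Starts a fresh thread and reports whether it sees the initial values.
bool fresh_thread_sees_initial_values() {
    bool ok = false;
    std::thread t([&ok] { ok = (hits == 42) && (label == "worker"); });
    t.join();
    return ok;
}
```

Calling fresh_thread_sees_initial_values() from main should return true: the new thread gets its own hits and its own freshly constructed label.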
Thread local means that every thread accessing this variable will see a separate, different variable, as if they were indeed variables with different names (though they are the same name on a source level).
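A small sketch of that behaviour, assuming GCC or Clang (the function and variable names are made up for the example): each worker increments what looks like the same counter, yet no thread ever observes another thread's increments.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// One instance of `counter` exists per thread (GCC/Clang extension).
static __thread int counter = 0;

// Each worker bumps its own copy n times and stores the final value.
static void worker(int n, int* result) {
    for (int i = 0; i < n; ++i) ++counter;
    *result = counter;
}

// Starts one thread per entry in `counts` and returns what each thread saw.
std::vector<int> run_workers(const std::vector<int>& counts) {
    std::vector<int> results(counts.size(), -1);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < counts.size(); ++i)
        threads.emplace_back(worker, counts[i], &results[i]);
    for (auto& t : threads) t.join();
    return results;
}

// The calling thread's copy, which the workers never touch.
int callers_counter() { return counter; }
```

With counts of 3, 5 and 7, each worker ends up with exactly its own count, and the caller's copy stays at 0. With an ordinary global, the workers would race on a single shared variable instead.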
What exactly the compiler does to implement this is implementation-defined and platform-dependent. Typical implementations (which differ between GCC versions) include calling e.g. get_thread_area under Linux and TlsAlloc/TlsGetValue under Windows. This incurs either a considerable overhead on first access only and is "free" afterwards (Linux), or a noticeable overhead on every access (Windows).
Alternatives include obtaining a pointer from the thread environment block and doing a table lookup (which is what the TlsGetValue function does internally, too), having a separate writable static data segment which is selected per thread upon access (this is how e.g. errno has been implemented for many years), or simply reserving a small region at the bottom of the stack at program start, as anything on the stack is by definition thread-local.
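To show what the explicit "allocate a slot, then look it up per thread" style looks like in user code, here is a sketch using the POSIX calls pthread_key_create, pthread_setspecific and pthread_getspecific. This is a stand-in I chose because it is the POSIX counterpart of the TlsAlloc/TlsGetValue pattern; it is not a claim about what GCC emits for __thread. The wrapper names are hypothetical.

```cpp
#include <cstdint>
#include <pthread.h>
#include <thread>

static pthread_key_t slot_key;                        // index into a per-thread table
static pthread_once_t slot_once = PTHREAD_ONCE_INIT;  // guards one-time key creation

static void make_key() { pthread_key_create(&slot_key, nullptr); }

// Stores v in the calling thread's slot.
void set_slot(std::intptr_t v) {
    pthread_once(&slot_once, make_key);
    pthread_setspecific(slot_key, reinterpret_cast<void*>(v));
}

// Reads the calling thread's slot. A thread that never stored anything
// reads a null pointer, which comes back as 0 here.
std::intptr_t get_slot() {
    pthread_once(&slot_once, make_key);
    return reinterpret_cast<std::intptr_t>(pthread_getspecific(slot_key));
}

// Sets and reads the slot from a separate thread.
std::intptr_t roundtrip_in_new_thread(std::intptr_t v) {
    std::intptr_t seen = -1;
    std::thread t([&seen, v] { set_slot(v); seen = get_slot(); });
    t.join();
    return seen;
}
```

Every access here is a function call plus a lookup, which is the kind of per-access cost the explicit API style carries compared with a variable the compiler can address directly.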
Which method exactly your particular compiler version uses is something you can only find out by compiling and disassembling a program (or by digging through the compiler sources).
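A tiny probe for that experiment might look like the following. The file name tls_probe.cpp and the function names are hypothetical, and the noinline attribute is GCC/Clang-specific; it is only there so that each access shows up as its own function in the listing.

```cpp
// tls_probe.cpp (hypothetical file name)
// Generate assembly with:  g++ -O2 -S tls_probe.cpp -o -
// or build normally and inspect the binary with:  objdump -d

// External linkage, so the compiler cannot fold the loads into constants.
__thread int tls_value = 7;
int plain_value = 7;

// Compare the instructions generated for these two functions.
__attribute__((noinline)) int read_tls()   { return tls_value; }
__attribute__((noinline)) int read_plain() { return plain_value; }
```

As a rough guide, on x86-64 Linux an executable typically reads tls_value through an %fs-relative load, while the same code built with -fPIC for a shared library may call __tls_get_addr instead. Your compiler version and flags decide, which is exactly why looking at the output is worthwhile.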
The overhead you will encounter may range anywhere from "just two memory accesses instead of one" to a function call followed by one or two dozen instructions, or even a syscall in the worst case.