thread-local storage overhead

Question

Assume there is a non-reentrant function that uses global variables:

int i;
void foo(void){
    /* modify i */
}

I then want to use this function in multithreaded code, so I could change the code this way:

void foo(int i){
    /* modify i */
}

or, more simply, by using GCC's __thread specifier:

__thread int i;
void foo(void){
    /* modify i */
}

The advantage of the last approach is that I don't need to change any other code that calls foo().

My question is: how much overhead does thread-local storage have? Are there any non-obvious issues with TLS?

Is there any overhead if I modify a TLS variable through a separate pointer, like this:

__thread int i;
void foo(void){
    int *p = &i;
    /* modify i using p pointer */
}

Thanks.

Jon · Accepted Answer · 2011-03-27T21:46:21.803

I then want to use this function in multithreaded code, so I could change the code this way:

void foo(int i){
    /* modify i */
}

This will certainly not work, as you will only be modifying a copy of i. You'd need to pass an int* (or an int& in C++) instead if you want the changes to stick.
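
As a rough sketch of the pointer-passing variant, assuming "modify i" just means incrementing it: the helper `call_foo_twice` is a hypothetical call site invented for illustration, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Takes a pointer so the caller's variable is modified, not a copy. */
void foo(int *i) {
    assert(i != NULL);  /* the caller must supply the storage */
    *i += 1;            /* placeholder for "modify i" */
}

/* Hypothetical call site: every caller (and so every thread) owns its own int. */
int call_foo_twice(void) {
    int local = 0;
    foo(&local);
    foo(&local);
    return local;
}
```

The cost of this approach is the one the question already names: every caller of foo() has to change to supply the storage.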

Using TLS will certainly not cause any significant overhead (either in space or time) over any custom approach you might follow to implement the same functionality. Compilers in practice implement TLS by dynamically allocating a storage "slot" in a global data structure that holds your thread-local variable.

When you access a thread-local variable at runtime, there is an extra level of indirection: the runtime first has to locate the thread-local variable table for the current thread, and then fetch the value out of that table. The fetch uses an index into the table, which is an O(1) operation.
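
For comparison, here is roughly what a hand-written equivalent might look like with POSIX thread-specific keys. This is not how the compiler implements `__thread`; it is only a sketch that makes the "look up the current thread's slot, then use it" indirection explicit. The names `i_key`, `get_i` and `worker_count_after_three_calls` are made up for the example, and "modify i" is again assumed to be an increment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t i_key;                        /* one key, shared by all threads */
static pthread_once_t i_key_once = PTHREAD_ONCE_INIT;

static void make_key(void) {
    /* free() is called on a thread's value when that thread exits */
    int rc = pthread_key_create(&i_key, free);
    assert(rc == 0);
}

/* Returns the calling thread's private int, allocating it on first use. */
static int *get_i(void) {
    pthread_once(&i_key_once, make_key);
    int *p = pthread_getspecific(i_key);           /* per-thread lookup */
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        assert(p != NULL);
        pthread_setspecific(i_key, p);
    }
    return p;
}

void foo(void) {
    *get_i() += 1;                                 /* placeholder for "modify i" */
}

static void *worker(void *out) {
    foo();
    foo();
    foo();
    *(int *)out = *get_i();                        /* this thread's own count */
    return NULL;
}

/* Hypothetical demo: runs foo() three times on a new thread, returns its count. */
int worker_count_after_three_calls(void) {
    int result = -1;
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &result) != 0)
        return -1;
    pthread_join(t, NULL);
    return result;
}
```

With `__thread` the compiler and runtime do the equivalent bookkeeping for you, which is why a custom scheme like this is unlikely to be cheaper.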

If you intend to do this:

__thread int i;
void foo(void){
    int *p = &i;
    /* modify i using p pointer */
}

then there is no need to access i using a pointer. Think of i as a global variable that has a different value for each running thread. You wouldn't need to access a normal global through a pointer to make changes stick, so there's also no need to use a pointer with a thread-local variable.
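
A small sketch of that behaviour, assuming GCC or Clang (for `__thread`) and POSIX threads. The functions `foo_via_pointer`, `worker` and `worker_value` are hypothetical names added for the demonstration, and the increments stand in for "modify i".

```c
#include <assert.h>
#include <pthread.h>

__thread int i;  /* GCC/Clang extension: one zero-initialized copy per thread */

void foo(void) {
    i += 1;              /* direct access, no pointer needed */
}

void foo_via_pointer(void) {
    int *p = &i;         /* address of the calling thread's copy */
    *p += 1;
}

static void *worker(void *out) {
    foo();
    foo_via_pointer();
    *(int *)out = i;     /* the worker's own i, which started at 0 */
    return NULL;
}

/* Hypothetical demo: returns the value of i as seen by a freshly created thread. */
int worker_value(void) {
    int result = -1;
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &result) != 0)
        return -1;
    pthread_join(t, NULL);
    return result;
}
```

Both functions modify the same per-thread variable, and whatever the worker does to its copy leaves the calling thread's i untouched.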

Finally, thread-local storage is not really meant to store large numbers of variables per thread (there are compiler-dependent limits on the size of the TLS table) but this is something you can easily work around: put many variables into a struct and make a pointer to the struct thread-local.
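
One possible shape for that workaround, assuming GCC or Clang and a lazily allocated struct. The struct layout and the helpers `get_state` and `release_state` are invented for illustration; only the pointer occupies TLS, while the data lives on the heap.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-thread state; the fields are illustrative only. */
struct thread_state {
    int counter;
    char scratch[4096];
};

/* Only this pointer lives in TLS; it starts out NULL in every thread. */
static __thread struct thread_state *state;

/* Returns the calling thread's state, allocating it zeroed on first use. */
static struct thread_state *get_state(void) {
    if (state == NULL) {
        state = calloc(1, sizeof *state);
        assert(state != NULL);   /* sketch: abort on allocation failure */
    }
    return state;
}

void foo(void) {
    get_state()->counter += 1;   /* placeholder for "modify i" */
}

/* Each thread should call this before it exits, since a plain __thread
   variable has no destructor hook to free the struct automatically. */
void release_state(void) {
    free(state);
    state = NULL;
}
```

Threads that never call foo() pay only for one pointer of TLS, at the price of a NULL check on each access and manual cleanup.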

You are right, there are other problems with this example too, and that's one reason I want to use TLS for it. — S.J., Mar 27 '11 at 16:59
This isn't entirely true. TLS does have significant cumulative overhead, in that increasing the amount of TLS increases the *cost of thread creation* for each thread, even threads that will never use the code that needs the TLS. Also your workaround does not work. If there's a size limit on TLS, putting data in a `struct` will not reduce the size of it. Perhaps you were thinking of putting a pointer in TLS and allocating the `struct` with `malloc`? — R.. GitHub STOP HELPING ICE, Mar 27 '11 at 19:04
@R..: I have no experience in massively threaded apps and therefore can't really say about the size of the thread creation cost. However, surely apps that are expected to launch big numbers of threads during their lifetime would turn to a solution with less overhead such as a thread pool? Regarding the `struct` in the TLS yes, that was the idea -- bad explanation though. Thanks for the catch, I rewrote it correctly. — Jon, Mar 27 '11 at 21:50
Using thread pools is a lot of added complexity (and a big turn-off to using threads, for many people) for the sake of accommodating *bad legacy implementations* where thread creation is slow. The situation where POSIX threads really *shine* is when you can create threads at will as part of library code without the caller even knowing or caring about it (much less having to manage it all in ugly global state like thread pools). — R.. GitHub STOP HELPING ICE, Mar 27 '11 at 22:21

score 1 · Answer 2 · answered Mar 27 '11 at 17:01

The only problem I see with TLS is its possibly limited size. The limit depends on the system, so you may run into porting or scaling problems. (TLS may also not be available at all on some systems.)

Giuseppe Guerrini
