Are there any caveats to this use of thread_local
storage duration?
template <class T>
inline T &thread_local_get()
{
    thread_local T t;
    return t;
}
Then, in different threads, for example:
thread_local_get<float>() += 1.f;
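To make the intended behavior concrete, here is a minimal sketch of what I expect to happen. The helper `bump_in_new_thread` is hypothetical (it is not part of my real code); it only exists to run the increment on a fresh thread and report that thread's final value. The sketch assumes that an object with thread storage duration is zero-initialized before first use, which is why the `assert` on the fresh instance should hold for `float`.

```cpp
#include <cassert>
#include <thread>

template <class T>
inline T &thread_local_get()
{
    // One instance per (T, thread); zero-initialized for arithmetic T
    // because it has thread storage duration.
    thread_local T t;
    return t;
}

// Hypothetical helper for illustration only: performs `increments`
// additions on a newly created thread and returns that thread's value.
float bump_in_new_thread(int increments)
{
    float result = 0.f;
    std::thread worker([&result, increments] {
        // A fresh thread should see its own, untouched instance.
        assert(thread_local_get<float>() == 0.f);
        for (int i = 0; i < increments; ++i)
            thread_local_get<float>() += 1.f;
        result = thread_local_get<float>();
    });
    worker.join(); // join before reading `result` to avoid a data race
    return result;
}
```

If my understanding is right, `bump_in_new_thread(3)` followed by `bump_in_new_thread(5)` should return 3 and 5 respectively, and the calling thread's own `thread_local_get<float>()` (as well as `thread_local_get<int>()`) should remain untouched by either call.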
The cppreference documentation says this about thread storage duration:
thread storage duration. The object is allocated when the thread begins and deallocated when the thread ends. Each thread has its own instance of the object. Only objects declared thread_local have this storage duration. thread_local can appear together with static or extern to adjust linkage.
Does this correctly allocate one thread_local
instance for each instantiated T (determined at compile time) and for each calling thread? Is there any situation in which this can lead to, e.g., undefined behavior?