I'm working on the multithreaded implementation of a library. One module of this library has some global variables that are used very often during program execution. To make access to those variables safer, I declared them with the thread-local storage (TLS) keyword __declspec(thread).
Here is the call to the library's external function, which uses the module containing the global variables:
    for (i = 0; i < n_cores; i++)
        hth[i] = (HANDLE)_beginthread((void (*)(void *))MT_Interface_DimenMultiCells, 0, (void *)&inputSet[i]);
This way, I assume that each thread gets its own copy of every variable declared with __declspec(thread) in the library.
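To make that assumption concrete, here is a minimal sketch of the behavior I expect. It is not code from the library: counter, results, work, thread_main and run_demo are hypothetical names, and the THREAD_LOCAL macro is only there so the same sketch also builds with GCC/Clang (using __thread and pthreads) instead of MSVC.

```c
#if defined(_MSC_VER)
#  include <windows.h>
#  include <process.h>
#  define THREAD_LOCAL __declspec(thread)
#else
#  include <pthread.h>
#  define THREAD_LOCAL __thread
#endif

#define N_THREADS 4

/* Hypothetical stand-in for one of the library's frequently used globals. */
THREAD_LOCAL int counter = 0;

/* Ordinary shared array: each worker writes only its own slot. */
static int results[N_THREADS];

/* Increment this thread's private copy of counter, then publish it. */
static void work(int slot)
{
    int k;
    for (k = 0; k < 1000 * (slot + 1); k++)
        counter++;
    results[slot] = counter;
}

#if defined(_MSC_VER)
static unsigned __stdcall thread_main(void *arg) { work(*(int *)arg); return 0; }
#else
static void *thread_main(void *arg) { work(*(int *)arg); return 0; }
#endif

/* Runs N_THREADS workers and returns the calling thread's copy of counter. */
int run_demo(void)
{
    int ids[N_THREADS];
    int i;
#if defined(_MSC_VER)
    HANDLE h[N_THREADS];
    for (i = 0; i < N_THREADS; i++) {
        ids[i] = i;
        h[i] = (HANDLE)_beginthreadex(NULL, 0, thread_main, &ids[i], 0, NULL);
    }
    WaitForMultipleObjects(N_THREADS, h, TRUE, INFINITE);
    for (i = 0; i < N_THREADS; i++)
        CloseHandle(h[i]);
#else
    pthread_t t[N_THREADS];
    for (i = 0; i < N_THREADS; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, thread_main, &ids[i]);
    }
    for (i = 0; i < N_THREADS; i++)
        pthread_join(t[i], NULL);
#endif
    return counter; /* the workers never touched the caller's copy */
}
```

If my understanding is right, run_demo() returns 0 and results[i] ends up as 1000 * (i + 1), because every thread increments only its own copy and no locking is needed.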
When I run the program on an 8-core processor, the time required to complete the operation never drops below about 1/3 of the time needed by the single-process implementation.
I know that reaching 1/8 of the time is impossible, but I thought that at least 1/6 was reachable.
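To put numbers on that expectation, here is a small sketch based on Amdahl's law, which I am assuming is a reasonable model here (it ignores memory bandwidth, cache effects and thread start-up cost). The function names are my own, not part of the library.

```c
/* Amdahl's law: speedup on n cores when a fraction p of the work runs in parallel. */
double amdahl_speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

/* Inverse relation: parallel fraction p implied by an observed speedup s on n cores.
   From 1/s = 1 - p * (n - 1) / n it follows that p = (1 - 1/s) * n / (n - 1). */
double parallel_fraction(double s, int n)
{
    return (1.0 - 1.0 / s) * n / (n - 1.0);
}
```

With my numbers, parallel_fraction(3.0, 8) is 16/21, about 0.76, while the 6x speedup I hoped for would need parallel_fraction(6.0, 8) = 20/21, about 0.95. Under this model, roughly a quarter of the run time currently behaves as if it were serial.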
The question is: are those __declspec(thread) variables the cause of such poor performance?