Complete code here: https://gist.github.com/mnjul/82151862f7c9585dcea616a7e2e82033
Environment is Python 2.7.6 on an up-to-date Ubuntu 14.04 x64.
Prologue: I came across this strange piece of code in my project at work. It's a classic "somebody wrote it and quit the job, and it works but I don't know why" piece, so I wrote a stripped-down version of it, hoping to get my questions answered. Please check the gist linked above.
Situation:
So, I have a custom class Storage inheriting from Python's thread-local storage, intended to book-keep some thread-local data. There is only one instance of that class, instantiated in the global scope before any threads have been constructed. Since there is only one Storage instance, and I assumed its __init__() would run only once, I expected that the Runner threads would not actually have thread-local storage and that their data accesses would clash.
However, this turned out to be wrong: the code output (see my comment on that gist) indicates that each thread does have its own local storage. Strangely, at each thread's first access to the storage object (i.e. a set() call), Storage.__init__() is mysteriously run, thus properly creating the thread-local storage and producing the desired effect.
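For readers who don't want to open the gist, here is a minimal sketch of the setup described above. It is not the gist itself: it assumes Storage subclasses threading.local, and the names init_threads, results and lock are hypothetical helpers added only to record what happens.

```python
import threading

init_threads = []   # hypothetical: names of threads in which Storage.__init__ ran
results = {}        # hypothetical: thread name -> value read back from storage
lock = threading.Lock()


class Storage(threading.local):
    def __init__(self):
        # Record which thread triggered __init__.
        with lock:
            init_threads.append(threading.current_thread().name)
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data[key]


# The only instance, created in the global scope before any thread exists.
storage = Storage()


class Runner(threading.Thread):
    def __init__(self, value):
        super(Runner, self).__init__()
        self.value = value

    def run(self):
        # First access to storage from this thread.
        storage.set('keykey', self.value)
        with lock:
            results[self.name] = storage.get('keykey')


runners = [Runner(i) for i in range(3)]
for r in runners:
    r.start()
for r in runners:
    r.join()

# One __init__ call for the creating thread plus one per Runner.
print(len(init_threads))
# Each thread read back its own value, so nothing clashed.
print(sorted(results.values()))
```

With three Runner threads, this sketch should report four __init__ calls even though Storage() appears only once in the source, which is the behaviour I am asking about.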
Questions: Why on earth did Storage.__init__ get invoked when the threads attempted to call a member function of a seemingly already-instantiated object? Is this a CPython (or pthread, if that matters) implementation detail? I feel like there are a lot of things happening between my stack trace's "py_thr_local.py", line 36, in run => storage.set('keykey', value) and "py_thr_local.py", line 14, in __init__, but I can't find any relevant information in (C)Python's source code or on Stack Overflow.
Any feedback is welcome. Let me know if I should clarify anything or provide more information.