The behavior of cdll.LoadLibrary is not something that Python controls; it depends on the OS on which Python is running. @MarkTolonen's answer shows the behavior on Windows, while this answer concentrates on Linux (and macOS).
cdll.LoadLibrary uses dlopen for loading shared objects, thus the behavior of dlopen is what we need to analyze.
Let's take a look at the following code (foo.c):
#include <stdio.h>
static int init();
//global, initialized when so is loaded:
int my_global = init();
static int init(){
    printf("initializing address %p\n", (void*)&my_global);
    return 42;
}
extern "C" {
    void set(int new_val){ my_global = new_val;}
    int get() {return my_global;}
}
compiled with g++ --shared -fPIC foo.c -o foo.so. I use C++ rather than C so that every time the global variable my_global is initialized, this is logged to stdout (this would not be as straightforward in C).
After some preparation:
import ctypes

def init_functions(dll):
    # declare the signatures of the exported functions
    get = dll.get
    get.argtypes = []
    get.restype = ctypes.c_int
    set = dll.set
    set.argtypes = [ctypes.c_int]
    set.restype = None
    return get, set
we observe the following behavior:
#first load:
get, set = init_functions(ctypes.CDLL("./foo.so"))
# initializing address 0x7f5ca4a8102c
print(get()) # 42
set(21)
print(get()) # 21
So far, everything is as expected: the global variable was initialized, and we can read and write it. Now the second load:
get2, set2 = init_functions(ctypes.CDLL("./foo.so"))
Oops, we don't see the initialization being logged, which means...
print(get2()) # 21
the global variable was not initialized anew. That is the expected behavior of dlopen: once a shared object is loaded, subsequent dlopen calls reuse it instead of loading it again. This is the reason why e.g. pyximport or the %%cython magic use different names for the resulting shared objects.
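The reuse can also be observed without foo.so. The following is only a sketch for POSIX systems: it loads the C library twice and compares the handles returned by dlopen. It relies on the _handle attribute, which is an implementation detail of ctypes, and the helper name load_twice is made up for this illustration. On Linux I would expect both handles to be equal.

```python
import ctypes
import ctypes.util

def load_twice(name):
    """Load the same library twice and return both dlopen handles."""
    first = ctypes.CDLL(name)
    second = ctypes.CDLL(name)
    # _handle is an implementation detail of ctypes: the value returned by dlopen
    return first._handle, second._handle

# find_library may return None; on POSIX, CDLL(None) then refers to the
# running program itself, which works for this comparison as well
libc_name = ctypes.util.find_library("c")
h1, h2 = load_twice(libc_name)
print(h1 == h2)
```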
To really create a new version, we copy the shared object foo.so to foo.so.1, and now:
# load copied shared object:
get3, set3 = init_functions(ctypes.CDLL("./foo.so.1"))
# initializing address 0x7f5ca487102c
print(get3()) # 42
We can see that the global variable was initialized, but at a different address, i.e. it is not the old variable, as can be easily checked:
print(get()) # 21 - still the old value.
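The copy-and-load trick can be wrapped into a small helper. This is just a sketch: the names copy_with_unique_name and load_fresh are hypothetical, the temporary copies are never cleaned up, and every call keeps yet another copy of the shared object mapped into the process.

```python
import ctypes
import os
import shutil
import tempfile

def copy_with_unique_name(path):
    """Copy a shared object to a fresh temporary file and return the new path."""
    suffix = os.path.splitext(path)[1] or ".so"
    fd, new_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    shutil.copyfile(path, new_path)
    return new_path

def load_fresh(path):
    """Load a private copy, so dlopen cannot reuse an earlier load of `path`."""
    return ctypes.CDLL(copy_with_unique_name(path))

# hypothetical usage with the helper from above:
# get3, set3 = init_functions(load_fresh("./foo.so"))
```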
Until now, the behavior is more or less the same on Windows and Linux. However, on Linux we could use symbol interposition to ensure that the same global variable is used.
Normally, CPython uses dlopen with RTLD_LOCAL, i.e. no interposition of symbols takes place. This can be changed by loading with RTLD_GLOBAL, i.e.
...
#first load:
get, set = init_functions(ctypes.CDLL("./foo.so", mode=ctypes.RTLD_GLOBAL))
# initializing address 0x7fd01efc102c
...
mode=ctypes.RTLD_GLOBAL will be ignored on Windows.
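Which mode ctypes uses by default on a given platform can be checked via the constants it exposes. A minimal sketch (the function name default_mode_name is made up for this illustration); it relies on the documented fact that on platforms without these flags, such as Windows, RTLD_GLOBAL and RTLD_LOCAL have the same value:

```python
import ctypes

def default_mode_name():
    """Return the name of the dlopen mode ctypes uses when no mode is passed."""
    if ctypes.RTLD_GLOBAL == ctypes.RTLD_LOCAL:
        # e.g. on Windows both constants are 0 and the mode is ignored
        return "ignored"
    if ctypes.DEFAULT_MODE == ctypes.RTLD_GLOBAL:
        return "RTLD_GLOBAL"
    return "RTLD_LOCAL"

print(default_mode_name())
```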
The first difference can be seen for
...
# load copied shared object:
get3, set3 = init_functions(ctypes.CDLL("./foo.so.1"))
# initializing address 0x7fd01efc102c
...
The same address as for "foo.so" is also used for "foo.so.1": due to RTLD_GLOBAL, the symbols from both shared objects were interposed.
And now:
print(get3()) # 42
print(get()) # 42
That means the old global variable was reinitialized.
While funny, I would not recommend depending on these details: it is just too clever, too brittle and not portable, and it is really easy to introduce memory leaks or crashes.
One should accept that, in general, global variables cannot be reinitialized (in a portable way) by reloading a shared object.
Normally, one would rather prevent global variables from being interposed by keeping them local to the shared object (e.g. by making them static or by using the hidden visibility attribute while compiling), so that they don't get interposed even if the shared object is loaded with RTLD_GLOBAL.