Background
I've encountered a problem that violates my conceptual model of position-independent code and thread-local storage. The problem that prompted this can be found in this StackOverflow post: I have a binary, which in turn dlopen's a shared object. Opening the shared object triggers an error stating dlopen: cannot load any more object with static TLS.
My understanding is that the initial-exec model is what is referred to as "static TLS", and that it is often the default when not creating position-independent code. When one creates position-independent code, the default is usually something else, such as the global-dynamic model that GCC uses. I believed the reason for this was that initial-exec cannot work in a shared object. An answer to another StackOverflow post supported this belief, stating:
Linking non-fPIC code into a shared library is impossible on x86_64, but is allowed on ix86 (and leads to many subtle problems, like this one).
Given that I am on an x86_64 machine, this has led to some confusion. I then came across another StackOverflow question, where the answer appears to create a shared object using the static TLS model.
Upon seeing this, I decided to return to my problematic binary and recursively scan its dependencies for use of the static TLS model by looking at the output of readelf -d, as per the answer to this question. To my surprise, I found a few libraries. To my dismay, they are not libraries built by the application.
Here is the output of readelf -d for one of them:
/lib64/libpthread.so.0
Dynamic section at offset 0x17d90 contains 29 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
0x000000000000000e (SONAME) Library soname: [libpthread.so.0]
0x000000000000000c (INIT) 0x38c46052d0
0x000000000000000d (FINI) 0x38c4611120
0x0000000000000004 (HASH) 0x38c4615e90
0x000000006ffffef5 (GNU_HASH) 0x38c4600280
0x0000000000000005 (STRTAB) 0x38c4602dd8
0x0000000000000006 (SYMTAB) 0x38c4600f00
0x000000000000000a (STRSZ) 4918 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000000000003 (PLTGOT) 0x38c4817fe8
0x0000000000000002 (PLTRELSZ) 1680 (bytes)
0x0000000000000014 (PLTREL) RELA
0x0000000000000017 (JMPREL) 0x38c4604c38
0x0000000000000007 (RELA) 0x38c4604578
0x0000000000000008 (RELASZ) 1728 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffffc (VERDEF) 0x38c46043a0
0x000000006ffffffd (VERDEFNUM) 10
0x000000000000001e (FLAGS) STATIC_TLS
0x000000006ffffffb (FLAGS_1) Flags: NODELETE INITFIRST
0x000000006ffffffe (VERNEED) 0x38c46044f8
0x000000006fffffff (VERNEEDNUM) 2
0x000000006ffffff0 (VERSYM) 0x38c460410e
0x000000006ffffff9 (RELACOUNT) 60
0x000000006ffffdf8 (CHECKSUM) 0x86f709c8
0x000000006ffffdf5 (GNU_PRELINKED) 2018-05-23T11:25:00
0x0000000000000000 (NULL) 0x0
Here we can see STATIC_TLS, which leads me to believe the initial-exec model has been used.
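The check I did by eye (looking for STATIC_TLS on the FLAGS line of readelf -d) could be automated across many libraries. The following is only a minimal sketch: the function names has_static_tls and scan are hypothetical, and it assumes readelf is on the PATH and prints the flag on the (FLAGS) line as in the dump above.

```python
import subprocess


def has_static_tls(readelf_dynamic_output):
    """Return True if a `readelf -d` dump lists STATIC_TLS on its (FLAGS) line."""
    for line in readelf_dynamic_output.splitlines():
        # Match the DT_FLAGS entry only; "(FLAGS_1)" does not contain "(FLAGS)".
        if "(FLAGS)" in line and "STATIC_TLS" in line.split():
            return True
    return False


def scan(paths):
    """Run `readelf -d` on each path (assumed to exist) and return the flagged ones."""
    flagged = []
    for path in paths:
        result = subprocess.run(
            ["readelf", "-d", path],
            capture_output=True,
            text=True,
            check=False,
        )
        if has_static_tls(result.stdout):
            flagged.append(path)
    return flagged


# Two lines taken from the libpthread dump above.
sample = """\
 0x000000000000001e (FLAGS) STATIC_TLS
 0x000000006ffffffb (FLAGS_1) Flags: NODELETE INITFIRST
"""
print(has_static_tls(sample))  # prints True
```

Calling scan with the list of paths reported by ldd (or any other dependency listing) would return the subset that carries the flag.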
The output of readelf -l:
Elf file type is DYN (Shared object file)
Entry point 0x38c4605de0
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x00000038c4600040 0x00000038c4600040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000011830 0x00000038c4611830 0x00000038c4611830
0x000000000000001c 0x000000000000001c R 10
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x00000038c4600000 0x00000038c4600000
0x0000000000016df0 0x0000000000016df0 R E 200000
LOAD 0x0000000000017b90 0x00000038c4817b90 0x00000038c4817b90
0x00000000000006e0 0x0000000000004860 RW 200000
DYNAMIC 0x0000000000017d90 0x00000038c4817d90 0x00000038c4817d90
0x00000000000001f0 0x00000000000001f0 RW 8
NOTE 0x0000000000000238 0x00000038c4600238 0x00000038c4600238
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x000000000001184c 0x00000038c461184c 0x00000038c461184c
0x0000000000000a5c 0x0000000000000a5c R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 8
GNU_RELRO 0x0000000000017b90 0x00000038c4817b90 0x00000038c4817b90
0x0000000000000470 0x0000000000000470 R 1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .init .plt .text __libc_freeres_fn .fini .rodata .interp .eh_frame_hdr .eh_frame .gcc_except_table .hash
03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.gnu.build-id .note.ABI-tag
06 .eh_frame_hdr
07
08 .ctors .dtors .jcr .data.rel.ro .dynamic .got
I was surprised by the lack of a TLS section here, but even so we have clear indications that a shared object is using the static initial-exec TLS model.
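The absence of a TLS entry can also be checked mechanically rather than by reading the dump. readelf -l prints a row whose type is TLS when the file has a PT_TLS program header, so the sketch below simply collects the type column. The helper name segment_types is hypothetical, and the parsing assumes the two-line row layout shown in the output above.

```python
def segment_types(readelf_l_output):
    """Collect the segment type names from the rows of a `readelf -l` dump."""
    types = []
    for line in readelf_l_output.splitlines():
        fields = line.split()
        # A header row looks like "<TYPE> 0x<offset> ..."; the continuation
        # row starts with a hex number and is skipped.
        if (
            len(fields) >= 2
            and fields[1].startswith("0x")
            and not fields[0].startswith("0x")
        ):
            types.append(fields[0])
    return types


# A few rows copied from the libpthread program headers above.
sample = """\
Program Headers:
  Type Offset VirtAddr PhysAddr
       FileSiz MemSiz Flags Align
  PHDR 0x0000000000000040 0x00000038c4600040 0x00000038c4600040
       0x00000000000001f8 0x00000000000001f8 R E 8
  LOAD 0x0000000000000000 0x00000038c4600000 0x00000038c4600000
       0x0000000000016df0 0x0000000000016df0 R E 200000
  DYNAMIC 0x0000000000017d90 0x00000038c4817d90 0x00000038c4817d90
       0x00000000000001f0 0x00000000000001f0 RW 8
"""
types = segment_types(sample)
print(types)           # prints ['PHDR', 'LOAD', 'DYNAMIC']
print("TLS" in types)  # prints False
```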
Finally, I've seen people with similar problems reorder their dependencies to get rid of the earlier dlopen error. I'm not sure why that makes a difference.
Question(s)
- How does the initial-exec model function inside relocatable code, especially shared objects on x86_64?
- Why does reordering dependencies sometimes resolve the dlopen issue; surely the number of slots used remains the same?
Any other suggestions for the original dlopen issue are also welcome.
Update 1
Whilst digging around the problem some more, I came across another source stating that static TLS models cannot be used in shared libraries:
DF_STATIC_TLS
If set in a shared object or executable, this flag instructs the dynamic linker to reject attempts to load this file dynamically. It indicates that the shared object or executable contains code using a static thread-local storage scheme. Implementations need not support any form of thread-local storage.
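For reference, the STATIC_TLS text that readelf prints corresponds to one bit in the value of the DT_FLAGS dynamic entry (tag 0x1e, as on the FLAGS line of the dump above). The sketch below decodes that value using the flag bits defined by the generic ELF specification; the function name decode_dt_flags is hypothetical and is not part of any tool.

```python
# Tag number of the DT_FLAGS dynamic entry (shown as 0x...1e in readelf -d).
DT_FLAGS = 0x1E

# DF_* bit values from the generic ELF specification.
DF_FLAGS = {
    0x01: "ORIGIN",
    0x02: "SYMBOLIC",
    0x04: "TEXTREL",
    0x08: "BIND_NOW",
    0x10: "STATIC_TLS",
}


def decode_dt_flags(value):
    """Return the names of the DF_* bits set in a DT_FLAGS value, lowest bit first."""
    return [name for bit, name in sorted(DF_FLAGS.items()) if value & bit]


print(decode_dt_flags(0x10))  # prints ['STATIC_TLS']
```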