Background

I've encountered a problem that violates my conceptual model of position-independent code and thread-local storage. The problem that prompted this is described in this StackOverflow post: I have a binary that dlopens a shared object, and opening that shared object fails with the error dlopen: cannot load any more object with static TLS.

My understanding is that the initial-exec model is what is referred to as "static TLS", and that it is often the default when not creating position-independent code. When one creates position-independent code, the default is usually something else, such as the global-dynamic model that GCC uses. I believed the reason for this was that initial-exec cannot work in a shared object. An answer to another StackOverflow post supported this belief, stating:

Linking non-fPIC code into a shared library is impossible on x86_64, but is allowed on ix86 (and leads to many subtle problems, like this one).

Given that I am on an x86_64 machine, this has led to some confusion. I then came across another StackOverflow question, where the answer appears to create a shared object using the static TLS model.

Upon seeing this, I decided to return to my problematic binary and recursively scan its dependencies for use of the static TLS model by looking at the output of readelf -d, as per the answer to this question. To my surprise, I found a few such libraries. To my dismay, they are not libraries built by the application.

Here is the output of readelf -d for one of them:

/lib64/libpthread.so.0

Dynamic section at offset 0x17d90 contains 29 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000e (SONAME)             Library soname: [libpthread.so.0]
 0x000000000000000c (INIT)               0x38c46052d0
 0x000000000000000d (FINI)               0x38c4611120
 0x0000000000000004 (HASH)               0x38c4615e90
 0x000000006ffffef5 (GNU_HASH)           0x38c4600280
 0x0000000000000005 (STRTAB)             0x38c4602dd8
 0x0000000000000006 (SYMTAB)             0x38c4600f00
 0x000000000000000a (STRSZ)              4918 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000003 (PLTGOT)             0x38c4817fe8
 0x0000000000000002 (PLTRELSZ)           1680 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x38c4604c38
 0x0000000000000007 (RELA)               0x38c4604578
 0x0000000000000008 (RELASZ)             1728 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffc (VERDEF)             0x38c46043a0
 0x000000006ffffffd (VERDEFNUM)          10
 0x000000000000001e (FLAGS)              STATIC_TLS
 0x000000006ffffffb (FLAGS_1)            Flags: NODELETE INITFIRST
 0x000000006ffffffe (VERNEED)            0x38c46044f8
 0x000000006fffffff (VERNEEDNUM)         2
 0x000000006ffffff0 (VERSYM)             0x38c460410e
 0x000000006ffffff9 (RELACOUNT)          60
 0x000000006ffffdf8 (CHECKSUM)           0x86f709c8
 0x000000006ffffdf5 (GNU_PRELINKED)      2018-05-23T11:25:00
 0x0000000000000000 (NULL)               0x0

Here we can see STATIC_TLS, which leads me to believe the initial-exec model has been used.

The output of readelf -l:

Elf file type is DYN (Shared object file)
Entry point 0x38c4605de0
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x00000038c4600040 0x00000038c4600040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000011830 0x00000038c4611830 0x00000038c4611830
                 0x000000000000001c 0x000000000000001c  R      10
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x00000038c4600000 0x00000038c4600000
                 0x0000000000016df0 0x0000000000016df0  R E    200000
  LOAD           0x0000000000017b90 0x00000038c4817b90 0x00000038c4817b90
                 0x00000000000006e0 0x0000000000004860  RW     200000
  DYNAMIC        0x0000000000017d90 0x00000038c4817d90 0x00000038c4817d90
                 0x00000000000001f0 0x00000000000001f0  RW     8
  NOTE           0x0000000000000238 0x00000038c4600238 0x00000038c4600238
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x000000000001184c 0x00000038c461184c 0x00000038c461184c
                 0x0000000000000a5c 0x0000000000000a5c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8
  GNU_RELRO      0x0000000000017b90 0x00000038c4817b90 0x00000038c4817b90
                 0x0000000000000470 0x0000000000000470  R      1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt .init .plt .text __libc_freeres_fn .fini .rodata .interp .eh_frame_hdr .eh_frame .gcc_except_table .hash 
   03     .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.gnu.build-id .note.ABI-tag 
   06     .eh_frame_hdr 
   07     
   08     .ctors .dtors .jcr .data.rel.ro .dynamic .got 

I was surprised by the lack of a TLS section here, but even so we have clear indications that a shared object is using the static initial-exec TLS model.
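
The two checks above (the STATIC_TLS flag in the dynamic section and the presence of a TLS segment in the program headers) can also be scripted rather than read off readelf output by eye. The following is a minimal sketch, assuming a 64-bit little-endian ELF file; the function name inspect_tls and its return shape are my own invention for illustration, and the constants are the standard ELF values (DT_FLAGS is the 0x1e tag visible in the readelf -d output above).

```python
import struct

# Standard ELF constants (values from the ELF gABI).
PT_DYNAMIC = 2
PT_TLS = 7
DT_NULL = 0
DT_FLAGS = 30          # 0x1e, the FLAGS entry in `readelf -d`
DF_STATIC_TLS = 0x10


def inspect_tls(data: bytes) -> dict:
    """Report TLS-related facts for a 64-bit little-endian ELF image.

    Hypothetical helper: returns {"static_tls": bool, "pt_tls": bool}.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELFCLASS64, little-endian")

    # Elf64_Ehdr: e_phoff is at byte 32; e_phentsize and e_phnum at 54 and 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    result = {"static_tls": False, "pt_tls": False}
    for i in range(e_phnum):
        # Elf64_Phdr begins: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, e_phoff + i * e_phentsize
        )
        if p_type == PT_TLS:
            result["pt_tls"] = True
        elif p_type == PT_DYNAMIC:
            # Each Elf64_Dyn entry is 16 bytes: d_tag (signed) and d_val.
            for off in range(p_offset, p_offset + p_filesz, 16):
                d_tag, d_val = struct.unpack_from("<qQ", data, off)
                if d_tag == DT_NULL:
                    break
                if d_tag == DT_FLAGS and d_val & DF_STATIC_TLS:
                    result["static_tls"] = True
    return result
```

It could be called as inspect_tls(open(path, "rb").read()) for each dependency in turn. Going by the readelf output above, I would expect it to report the static TLS flag as set and no TLS segment for this libpthread, but I have not run it against that exact file.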

Finally, I've seen people with similar problems reorder their dependencies to get rid of the earlier dlopen error. I'm not sure why that makes a difference.

Question(s)

  • How does initial-exec function inside relocatable code, especially shared objects on x86_64?
  • Why does reordering dependencies sometimes resolve the dlopen issue; surely the number of slots used remains the same?

Any other suggestions for the original dlopen issue are also welcomed.

Update 1

Whilst digging around the problem some more, I came across another source stating that static TLS models cannot be used in shared libraries:

DF_STATIC_TLS
    If set in a shared object or executable, this flag instructs the dynamic linker to reject attempts to load this file dynamically. It indicates that the shared object or executable contains code using a static thread-local storage scheme. Implementations need not support any form of thread-local storage.
OMGtechy
  • `initial-exec` is fine for shared objects. It tells the compiler and the linker that no symbol accessed will be found in another shared object not already present and loaded by `dlopen`. This means both the compiler and the linker can assume the modules they are compiling/linking are all there are and the TLS blocks layout is fixed and a GOT symbol can be used to access the thread-local symbol. I don't know how reordering would fix anything. – Margaret Bloom Jun 25 '20 at 20:32
  • @MargaretBloom thanks, that answers part of my question at least! :) Hopefully someone else will fill in on the rest. – OMGtechy Jun 26 '20 at 10:18
  • Read Drepper's [*How to write shared libraries*](https://www.akkadia.org/drepper/dsohowto.pdf) paper – Basile Starynkevitch Jun 26 '20 at 10:31
  • @MargaretBloom I am still reading Basile's suggestion, but in the meantime I came across another source that appears to contradict what you've said and what I'm seeing. Are you able to shed any light on it (see update 1)? Many thanks! – OMGtechy Jul 01 '20 at 11:46
  • @OMGtechy `DF_STATIC_TLS` tells the runtime to allocate the TLS block for the module immediately (instead of lazily, cfr. `__tls_get_addr`). This is allowed if the module is in the initial module set (i.e. it's a shared object, DSO, listed as a dependency in the ELF header). Chapter 3.1 of [ELF Handling For TLS](https://www.uclibc.org/docs/tls.pdf) actually seems to imply that `DF_STATIC_TLS` is also allowed in dynamically loaded DSOs, where it is ignored. Note that the static vs dynamic model is almost orthogonal wrt the access models (like `initial-exec`). The linked PDF is a must read ;) – Margaret Bloom Jul 01 '20 at 13:38
  • I said that "the static vs dynamic model is almost orthogonal wrt the access models" but that's a bit of an overstretch. For example the `initial-exec` model *needs* `DF_STATIC_TLS` since `__tls_get_addr` is not called and so no lazy allocation of the TLS block is possible. Actually, `initial-exec` modules require `DF_STATIC_TLS`. – Margaret Bloom Jul 01 '20 at 14:51
