A shared object, such as glibc, when compiled appropriately, defines many symbols, such as main_arena, that are not normally used by other programs (although they can be seen in objdump and gdb), but are defined, with their addresses, as local symbols:
$ objdump -t ../.glibc/glibc_2.30_no-tcache/libc.so.6 | grep main_arena
00000000003b4b60 l O .data 0000000000000898 main_arena
Yet, when I reference one of these in C (via extern), and attempt to link, the linker can't find it:
$ gcc -g -Og -no-pie -Wl,-rpath ../.glibc/glibc_2.30_no-tcache/ -Wl,--dynamic-linker=../.glibc/glibc_2.30_no-tcache/ld.so.2 s1.c -o s1
/usr/bin/ld: /tmp/ccjKyCNh.o: in function `printf':
/usr/include/x86_64-linux-gnu/bits/stdio2.h:112: undefined reference to `main_arena'
/usr/bin/ld: /usr/include/x86_64-linux-gnu/bits/stdio2.h:112: undefined reference to `main_arena'
collect2: error: ld returned 1 exit status
Note: I've updated this question with extensive research:
This is by design:
See "c language, global symbol, local symbol clarification": "local (static): local symbols that are defined and referenced exclusively by module m.... These symbols are visible anywhere within module m, but cannot be referenced by other modules."
See also "Symbol Visibility Symbols can be categorized as local or global. Local symbols can not be referenced from an object other than the object that contains the symbol definition." https://docs.oracle.com/cd/E26505_01/html/E26506/chapter2-90421.html
and https://reverseengineering.stackexchange.com/questions/14895/why-are-symbols-with-local-binding-present-in-the-symbol-table-of-my-elf-files and http://web.cse.ohio-state.edu/~reeves.92/CSE2421au12/SlidesDay52.pdf
Nonetheless, for debugging, exploration, and reverse engineering, it's sometimes desirable to reference an external local symbol defined in a shared object. All the information is there, as evidenced by gdb's ability to display it; it's simply a flag that tells ld not to resolve symbols to it.
Given that, is it possible to tell ld to ignore the local flag, and resolve to the symbol anyway?
For example:
$ objdump -t ../.glibc/glibc_2.30_no-tcache/libc.so.6 | grep -E ' malloc$| main_arena$'
00000000003b4b60 l O .data 0000000000000898 main_arena
0000000000083500 g F .text 0000000000000213 malloc
$ man objdump 2>/dev/null | grep -A10 'flag characters'
The flag characters are divided into 7 groups as follows:
"l"
"g"
"u"
"!" The symbol is a local (l), global (g), unique global (u), neither global nor local (a space) or both global and
local (!). ...
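To make the flag concrete: as far as I can tell, that first flag character corresponds to the binding stored in the high bits of st_info in each ELF symbol table entry. A minimal sketch of the mapping follows. The function name objdump_bind_flag is my own, and the mapping is my reading of the man page rather than objdump's actual source. Weak symbols appear to get their own flag column, so they show a space here, and "!" cannot arise from a single ELF binding value.

```c
#include <elf.h>

/* Hypothetical helper (not part of binutils): map an ELF symbol's st_info
 * to the first flag character that `objdump -t` prints, per the man page. */
char objdump_bind_flag(unsigned char st_info)
{
    switch (ELF64_ST_BIND(st_info)) {
    case STB_LOCAL:      return 'l';  /* visible only inside the defining object */
    case STB_GLOBAL:     return 'g';  /* visible to other objects at link time */
    case STB_GNU_UNIQUE: return 'u';  /* GNU extension: unique global */
    default:             return ' ';  /* e.g. STB_WEAK: neither local nor global */
    }
}
```

So main_arena above would have STB_LOCAL binding, while malloc has STB_GLOBAL.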
I'd like to be able to write code that, for debugging and reverse engineering, references the symbol main_arena regardless. How can I do this?
Update
I've read Employed Russian's excellent posts on related topics, and seen his reference to the XY Problem. With that in mind, let me ask my question X:
For exploratory purposes, I'd like to be able to look at the behavior of things like main_arena, and other malloc internals, as I use malloc and free. I can do this with gdb. But I'd like to do this programmatically, in C. One way to do this might have been to actually link to these symbols (question Y), but there's no reason to assume that's the best way, the only way, or even a viable way. Given that:
How can I inspect the value of local symbols in a shared library from within a different program, without having to drop to gdb?
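For what it's worth, the static half of this seems doable without gdb: the same data that objdump -t prints can be read straight from the file's .symtab section. Below is a minimal sketch, assuming a 64-bit, little-endian, unstripped ELF file. The names struct sym_info and find_elf_symbol are my own, and there is no bounds checking against malformed files.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result type: what `objdump -t` shows for one symbol. */
struct sym_info {
    uint64_t value;      /* st_value (the address column in objdump -t) */
    uint64_t size;       /* st_size */
    unsigned char bind;  /* STB_LOCAL, STB_GLOBAL, ... */
};

/* Look up `name` in the SHT_SYMTAB section of the ELF64 file at `path`.
 * Returns 0 and fills *out on success, -1 if the file cannot be read,
 * is not ELF64, or has no such symbol. Local symbols are found too. */
int find_elf_symbol(const char *path, const char *name, struct sym_info *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    if (base == MAP_FAILED)
        return -1;

    int rc = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);

        for (unsigned i = 0; rc != 0 && i < eh->e_shnum; i++) {
            if (sh[i].sh_type != SHT_SYMTAB)
                continue;
            /* sh_link of a symbol table names its string table section. */
            const Elf64_Sym *syms = (const Elf64_Sym *)(base + sh[i].sh_offset);
            const char *strtab = (const char *)(base + sh[sh[i].sh_link].sh_offset);
            size_t n = sh[i].sh_size / sizeof(Elf64_Sym);

            for (size_t j = 0; j < n; j++) {
                if (strcmp(strtab + syms[j].st_name, name) != 0)
                    continue;
                out->value = syms[j].st_value;
                out->size = syms[j].st_size;
                out->bind = ELF64_ST_BIND(syms[j].st_info);
                rc = 0;
                break;
            }
        }
    }

    munmap(base, len);
    return rc;
}
```

Called as find_elf_symbol("../.glibc/glibc_2.30_no-tcache/libc.so.6", "main_arena", &info), I would expect this to report the same value and size that objdump -t shows above, with bind equal to STB_LOCAL. For a shared object that value is relative to wherever the library ends up loaded, so turning it into a live pointer inside a running process is the part I'm still asking about.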