// foo.c
int main() { return 0; }

When I compiled the code above I noticed some symbols located in *ABS*:

$ gcc foo.c
$ objdump -t a.out | grep ABS
0000000000000000 l    df *ABS*  0000000000000000              crtstuff.c
0000000000000000 l    df *ABS*  0000000000000000              foo.c
0000000000000000 l    df *ABS*  0000000000000000              crtstuff.c
0000000000000000 l    df *ABS*  0000000000000000              

They look like debug symbols, but isn't debug info stored somewhere like the .debug_info section?

According to man objdump:

*ABS* if the section is absolute (ie not connected with any section)

I don't understand it since no example is given here.

A question here shows an interesting way to define extra symbols in *ABS* with --defsym, but I think passing macros would be easier.

So what is this *ABS* section and when would someone use it?

EDIT:

Absolute symbols don't get relocated, their virtual addresses (0000000000000000 in the example you gave) are fixed.

I wrote a demo but it seems that the addresses of absolute symbols can be modified.

// foo.c

#include <stdio.h>

extern char foo;

int main()
{
  printf("%p\n", &foo);
  return 0;
}

$ gcc foo.c -Wl,--defsym,foo=0xbeef -g

$ objdump -t a.out | grep ABS
0000000000000000 l    df *ABS*  0000000000000000              crtstuff.c
0000000000000000 l    df *ABS*  0000000000000000              foo.c
0000000000000000 l    df *ABS*  0000000000000000              crtstuff.c
0000000000000000 l    df *ABS*  0000000000000000
000000000000beef g       *ABS*  0000000000000000              foo

# the addresses are not fixed
$ ./a.out
0x556e06629eef
$ ./a.out
0x564f0d7aeeef
$ ./a.out
0x55c2608dceef

# gdb shows that before entering main(), &foo == 0xbeef
$ gdb a.out
(gdb) p &foo
$1 = 0xbeef <error: Cannot access memory at address 0xbeef>
(gdb) br main
Breakpoint 1 at 0x6b4: file foo.c, line 7.
(gdb) r
Starting program: /home/user/a.out

Breakpoint 1, main () at foo.c:7
7         printf("%p", &foo);
(gdb) p &foo
$2 = 0x55555555feef <error: Cannot access memory at address 0x55555555feef>
whuala
  • `./a.out 0x556e06629eef` -- good observation. I am also curious; looking into it, I've found two things so far. 1) The different results you get are caused by ASLR, done by the kernel. 2) If you link the program statically (`gcc -static ...`), you'll see `0xbeef` unmodified in the output. So ISTM the dynamic loader still shifts everything by a base address, including absolute symbols... – Vladislav Ivanishin May 08 '19 at 13:17

1 Answer


If you look at other symbols, you will find a section index (or a section name, if the tool does the mapping for you) in place of *ABS*. This is an index into the section header table: it identifies the section the symbol is defined in, or is SHN_UNDEF (zero) if the symbol is undefined in the object you are looking at. The value (virtual address) of such a symbol is adjusted by the same amount as its containing section when the object is linked or loaded. (This process is called relocation.) Not so for absolute symbols, which have the special value SHN_ABS as their st_shndx. Absolute symbols don't get relocated, their virtual addresses (0000000000000000 in the example you gave) are fixed.
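The rule above can be sketched in a few lines of C. This is only an illustration of how a symbol's value is meant to be treated, not how any particular loader is implemented. The struct, the function name and the DEMO_ constants are made up for this sketch (on Linux, <elf.h> provides the real SHN_UNDEF and SHN_ABS), and it assumes every section moves by one common load bias, as in a position-independent executable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reserved section indices from the ELF specification. On Linux, <elf.h>
 * provides the same values as SHN_UNDEF and SHN_ABS; the DEMO_ names and
 * the struct below are local to this sketch. */
enum { DEMO_SHN_UNDEF = 0x0000, DEMO_SHN_ABS = 0xfff1 };

/* Only the two symbol-table fields this sketch needs. */
struct demo_sym {
    uint16_t st_shndx;  /* index of the defining section, or a reserved value */
    uint64_t st_value;  /* link-time value (usually a virtual address) */
};

/* Value the symbol is meant to have once the object sits at load_bias.
 * Simplification: assumes every section moves by the same load_bias, as in
 * a position-independent executable or shared object. */
uint64_t runtime_value(const struct demo_sym *sym, uint64_t load_bias)
{
    assert(sym != NULL);
    assert(sym->st_shndx != DEMO_SHN_UNDEF);  /* undefined: no value to adjust */

    if (sym->st_shndx == DEMO_SHN_ABS)
        return sym->st_value;              /* absolute: never adjusted */
    return sym->st_value + load_bias;      /* section-relative: moves with its section */
}
```

For a symbol like foo from the question (value 0xbeef, section *ABS*), runtime_value returns 0xbeef for any load_bias, while a symbol defined in .text comes back shifted by the bias.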

Such absolute symbols are sometimes used to store meta information. In particular, the compiler can create symbols named after the translation units it compiles (these are the entries objdump flags with f, for "file", in your output). Such symbols aren't needed for linking or running the program; they are just for humans and binary processing tools.
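As a rough illustration of what marks such an entry in the symbol table: the ELF specification says a file symbol has type STT_FILE, binding STB_LOCAL and section index SHN_ABS. The sketch below decodes the packed st_info byte (binding in the high nibble, type in the low nibble) and checks for that combination. The DEMO_ constants mirror the values from the specification, and the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the ELF specification; <elf.h> on Linux calls them
 * STB_LOCAL, STT_FILE and SHN_ABS. The DEMO_ names are local to this sketch. */
enum { DEMO_STB_LOCAL = 0, DEMO_STT_FILE = 4, DEMO_SHN_ABS = 0xfff1 };

/* st_info packs the binding in the high 4 bits and the type in the low 4. */
unsigned sym_bind(uint8_t st_info) { return st_info >> 4; }
unsigned sym_type(uint8_t st_info) { return st_info & 0x0f; }

/* True for entries like "foo.c" and "crtstuff.c" in the objdump output:
 * local, file-typed, and absolute. */
int is_file_symbol(uint8_t st_info, uint16_t st_shndx)
{
    return sym_type(st_info) == DEMO_STT_FILE
        && sym_bind(st_info) == DEMO_STB_LOCAL
        && st_shndx == DEMO_SHN_ABS;
}
```

A symbol created with --defsym is also absolute, but it is global and has no file type, so is_file_symbol rejects it.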

As for why this isn't stored in the .debug_info section (and why it is emitted even though no debug switches were specified): it is a separate thing, namely the symbol table (.symtab). The symbol table is also useful for debugging, sure, but its primary purpose is linking object (.o) files. By default it is preserved in linked executables and libraries. You can get rid of it with strip.

Much of what I wrote here is in man 5 elf.


I don't think what you are doing (with --defsym) is supported or supposed to work with dynamic linking. Looking at the compiler output (gcc -S -masm=intel), I see this:

lea     rsi, foo[rip]

Or, if we look at objdump -M intel -rD a.out (linking with -q to preserve relocations), we see the same thing: rip-relative addressing is used to get the address of foo.

113d:       48 8d 35 ab ad 00 00    lea    rsi,[rip+0xadab]        # beef <foo>
                    1140: R_X86_64_PC32     foo-0x4
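The numbers in that disassembly can be checked by hand. An R_X86_64_PC32 relocation stores S + A - P (symbol value, addend, address of the field being patched) as a signed 32-bit displacement, and the CPU adds that displacement to the address of the next instruction. The sketch below redoes both steps; the function names are made up for illustration, and the 7-byte instruction length is read off the byte dump above.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement the linker writes for R_X86_64_PC32: S + A - P.
 * s = symbol value, a = addend, p = address of the 4-byte field. */
int32_t pc32_field(uint64_t s, int64_t a, uint64_t p)
{
    int64_t v = (int64_t)s + a - (int64_t)p;
    assert(v >= INT32_MIN && v <= INT32_MAX);  /* must fit in 32 bits */
    return (int32_t)v;
}

/* Address a rip-relative operand refers to: rip already points at the
 * next instruction when the displacement is added. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

With S = 0xbeef, A = -4 and P = 0x1140, pc32_field gives 0xadab, the displacement shown in the lea. Feeding that back in with the instruction at 0x113d yields 0xbeef, but only as long as the code really runs at 0x113d.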

The compiler doesn't know that foo is going to be an absolute symbol, so it produces the same code as for a normal symbol. rip is the instruction pointer, so the computed address depends on the base address at which the segment containing .text is mapped when the program is loaded.
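This is consistent with the addresses printed in the question: subtracting 0xbeef from each one should leave a page-aligned load base. A small sketch of that arithmetic follows; the function names are hypothetical and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Base address implied by an observed runtime address and the symbol's
 * link-time value, assuming the address was computed rip-relative. */
uint64_t implied_load_base(uint64_t observed, uint64_t link_value)
{
    assert(observed >= link_value);
    return observed - link_value;
}

/* Assumes 4 KiB pages. */
int is_page_aligned(uint64_t addr)
{
    return (addr & 0xfff) == 0;
}
```

For the three runs in the question this gives 0x556e0661e000, 0x564f0d7a3000 and 0x55c2608d1000, and for the gdb session 0x555555554000. All four are page aligned, which is what you would expect if the whole image, foo included, was shifted by a randomized base.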

I found this answer shedding light on the proper use-case for absolute symbols.

Vladislav Ivanishin
  • Thanks! But I'm still confused about the addresses of absolute symbols. Since it's hard to fit into comments I edited the question. – whuala May 08 '19 at 12:29