
While analyzing the .bss section of a C++ program compiled as an ELF file for the ARM platform, I came across several ways to determine its size. The four ways I tested are also mentioned in the question "Tool to analyze size of ELF sections and symbol".

However, the results were quite different:

bss size according to nm:       35380
bss size according to readelf:  37632
bss size according to size:     37888
bss size according to objdump:  37594

What might be the reason for this?

Python script used to generate the output:

#!/usr/bin/env python3
import re
import subprocess
import sys

fname = sys.argv[1]

# nm: add up the sizes of all symbols typed B or b (global/local .bss symbols)
output = subprocess.check_output(['arm-none-eabi-nm', '-l', '-S', '-C', fname], universal_newlines=True)
size = 0
for line in output.splitlines():
    m = re.search(r'[0-9a-f]* ([0-9a-f]*) ([a-zA-Z]) ([^/]*)\s*([^\s]*)', line)
    if m:
        stype = m.group(2).strip()
        if stype in ['B', 'b']:
            size += int(m.group(1), 16)

print("bss size according to nm: \t%i" % size)

# readelf: take the Size column of the .bss section header
output = subprocess.check_output(['arm-none-eabi-readelf', '-S', fname], universal_newlines=True)
for line in output.splitlines():
    m = re.search(r'bss\s+[A-Z]+\s+[0-9a-f]+ [0-9a-f]+ ([0-9a-f]+)', line)
    if m:
        print("bss size according to readelf: \t%i" % int(m.group(1), 16))
        break

# size: take the third column (bss) of the first line of numbers
output = subprocess.check_output(['arm-none-eabi-size', fname], universal_newlines=True)
for line in output.splitlines():
    m = re.search(r'[0-9]+\s+[0-9]+\s+([0-9]+)', line)
    if m:
        print("bss size according to size: \t%i" % int(m.group(1)))
        break

# objdump: add up the sizes of all symbols listed for the .bss section
output = subprocess.check_output(['arm-none-eabi-objdump', '-C', '-t', '-j', '.bss', fname], universal_newlines=True)
size = 0
for line in output.splitlines():
    m = re.search(r'bss\s+([0-9a-f]*)\s+', line)
    if m:
        size += int(m.group(1), 16)

print("bss size according to objdump: \t%i" % size)

Edit: One thing I found out is that nm classifies static variables inside functions (correctly) as weak (V), even though they may be part of the .bss. However, not all symbols classified as V are part of the .bss, so I cannot simply add all V symbols to the size. So is this task impossible with nm?

koalo
  • Have you tried `llvm-nm`, that comes bundled with LLVM/clang? Perhaps it won't agree with `nm` on this – valiano Sep 30 '17 at 18:09
  • Do you test object files (`ET_REL`) or fully linked objects (`ET_EXEC` and `ET_DYN`)? Why do you expect that summing up individual symbol sizes will give a correct number? – Florian Weimer Oct 01 '17 at 20:11
  • @valiano No, `llvm-nm` gives the same result. – koalo Oct 04 '17 at 10:48
  • @FlorianWeimer It is an ET_EXEC. And explaining why summing up the individual symbol sizes may give an incorrect result would be part of the answer. – koalo Oct 04 '17 at 10:52

1 Answer


Here is an example assembler file that produces an executable showing some of the things that can happen:

    .section .bss
    .globl var1
    .size var1, 1
var1:
    .skip 1

    .align 16777216
    .globl var2
    .size var2, 1048576
    .globl var3
    .size var3, 1048576
    .globl var4
    .size var4, 1048576
var2:
var3:
var4:
    .skip 1048576

    .text

    .globl main
main:
    xor %eax, %eax
    ret

size -x gives this output:

   text    data     bss     dec     hex filename
  0x5c9   0x220 0x2100000   34605033    21007e9 a.out

eu-readelf -S shows essentially the same information:

[25] .bss                 NOBITS       0000000001000000 01000000 02100000  0 WA     0   0 16777216

However, the symbol sizes, as shown by eu-readelf -s, are quite different:

   32: 0000000001000001       1 OBJECT  LOCAL  DEFAULT       25 completed.6963
   48: 0000000003000000 1048576 NOTYPE  GLOBAL DEFAULT       25 var2
   49: 0000000003000000 1048576 NOTYPE  GLOBAL DEFAULT       25 var4
   59: 0000000002000000       1 NOTYPE  GLOBAL DEFAULT       25 var1
   61: 0000000003000000 1048576 NOTYPE  GLOBAL DEFAULT       25 var3

The sum of their sizes is 0x300002, not 0x2100000. Two factors contribute to that:

  • There is an unused gap of about 16 MiB after var1. It is needed to satisfy the alignment of var2, given the order in which the variables are defined. Some of the space is reused by completed.6963.
  • var2, var3, var4 are aliases: the symbol values are the same, so there is just a single object backing the variable.
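
The arithmetic behind these two points can be checked with a short Python sketch. The addresses and sizes are copied from the eu-readelf output above; the helper name covered_bytes is made up for this illustration and is not part of any tool.

```python
# Symbol values and sizes copied from the eu-readelf -s output above.
symbols = {
    "completed.6963": (0x1000001, 1),
    "var2": (0x3000000, 1048576),
    "var4": (0x3000000, 1048576),
    "var1": (0x2000000, 1),
    "var3": (0x3000000, 1048576),
}
# .bss address and size copied from the eu-readelf -S output above.
bss_addr, bss_size = 0x1000000, 0x2100000


def covered_bytes(ranges):
    """Count the bytes covered by at least one (address, size) range."""
    total = 0
    covered_up_to = 0
    for addr, size in sorted(ranges):
        start = max(addr, covered_up_to)  # skip bytes already counted (aliases)
        end = addr + size
        if end > start:
            total += end - start
            covered_up_to = end
    return total


naive_sum = sum(size for _, size in symbols.values())  # what summing nm output does
distinct = covered_bytes(symbols.values())             # aliases counted once
padding = bss_size - distinct                          # alignment gaps inside .bss

print("sum of symbol sizes:     %#x" % naive_sum)
print("bytes backed by symbols: %#x" % distinct)
print("padding inside .bss:     %#x" % padding)
```

The naive sum counts the 1 MiB object three times, while the section size additionally includes the alignment padding that no symbol covers, so neither number has to match the other.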

Furthermore, there is a substantial gap between the end of the .data section and the .bss section, caused by the .bss alignment requirement. With a typical dynamic loader, this will simply result in an unmapped region of memory:

  LOAD           0x000e08 0x0000000000200e08 0x0000000000200e08 0x000220 0x000220 RW  0x200000
  LOAD           0x1000000 0x0000000001000000 0x0000000001000000 0x000000 0x2100000 RW  0x200000

So size is presumably correct when it does not count this gap.
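
To put a number on that gap, the two LOAD entries above can be compared directly. This is only a sketch: the values are copied from the program headers shown, and the variable names are invented here.

```python
# (offset, vaddr, filesz, memsz) copied from the two LOAD program headers above.
data_segment = (0x000e08, 0x200e08, 0x000220, 0x000220)
bss_segment = (0x1000000, 0x1000000, 0x000000, 0x2100000)

# First address past the end of the first (data) segment in memory.
data_end = data_segment[1] + data_segment[3]
# Address range that belongs to neither segment.
gap = bss_segment[1] - data_end
# Bytes the loader has to zero-fill: in memory (memsz) but not in the file (filesz).
zero_filled = bss_segment[3] - bss_segment[2]

print("gap between segments: %d bytes (%.1f MiB)" % (gap, gap / 2.0**20))
print("zero-filled bytes in second segment: %#x" % zero_filled)
```

The zero-filled byte count of the second segment equals the .bss size reported by size and eu-readelf, and the gap of roughly 14 MiB appears in neither figure.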

The numbers in this example are certainly excessive, but these effects are visible even with regular binaries, only to a lesser degree.

The readelf/size and objdump/nm differences could be an ARM peculiarity; these are probably triggered by certain symbol types not present in my example.
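
One thing that could be checked for the readelf/size difference is whether the binary contains more than one NOBITS section. The sketch below rests on an assumption that should be verified against the binutils documentation: that size, in its default Berkeley format, reports every allocated section without file contents in its bss column, not only the section named .bss. The regular expression assumes the single-line column layout shown above (Name, Type, Addr, Off, Size, ES, Flags), and nobits_sections is a hypothetical helper name.

```python
import re

# Matches one section-header line whose type is NOBITS.
# Assumed columns: [Nr] Name Type Addr Off Size ES Flags ...
NOBITS_LINE = re.compile(
    r'\]\s+(\S+)\s+NOBITS\s+[0-9a-f]+\s+[0-9a-f]+\s+([0-9a-f]+)\s+\S+\s+(\S+)')


def nobits_sections(lines):
    """Return (name, size) for every allocated (flag A) NOBITS section."""
    found = []
    for line in lines:
        m = NOBITS_LINE.search(line)
        if m and 'A' in m.group(3):
            found.append((m.group(1), int(m.group(2), 16)))
    return found


# The .bss line from the eu-readelf -S output above.
sample = ["[25] .bss                 NOBITS       0000000001000000 01000000 02100000  0 WA     0   0 16777216"]
sections = nobits_sections(sample)
for name, size in sections:
    print("%-10s %#x" % (name, size))
print("total: %#x" % sum(size for _, size in sections))
```

For the example executable this lists only .bss. If the ARM binary listed further NOBITS sections, their total would be the number to compare with the size output; whether that explains the difference seen in the question cannot be confirmed from the data given here.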

Florian Weimer
  • That is plausible, thanks! It is still strange that even those solutions that do not obviously add up symbols (i.e. readelf and size) give a different result, but it is a good answer anyway so I accept it. – koalo Oct 10 '17 at 12:37