
I have an ELF file which we then convert to a binary format:

arm-none-eabi-objcopy -O binary MyElfFile.elf MyBinFile.bin

The ELF file is just under 300KB, but the binary output file is about 446 times larger: 134000KB, or roughly 130MB! How is this possible when the whole point of a binary is to remove symbols, section tables and debug info?

Looking at Reddit and SO it looks like the binary image should be smaller than the ELF, not larger.

Gregory Fenn
    Look at the sections in the elf file. You will have something like 'isr@0:4k' and 'code@128M:128k'. There is a big hole in-between the two sections. For a binary, there are no holes and it is filled with zeros. You need to make sure all **allocated** sections are contiguous and have code copy them from the load address to the run address (there are other possible explanations). Try to post `objdump -h` of the elf. You need to pay attention to the section flags. It will explain why. – artless noise Oct 12 '22 at 12:41
  • The size of one has nothing to do with the size of the other for the -O binary file format specifically. There is a big difference between the total size of the loadable sections (which could actually also be larger than the elf) and the -O binary output – old_timer Oct 12 '22 at 17:22
  • One of those examples is Intel HEX (-O ihex), not -O binary, and that is a different file format from the objcopy -O binary format. Technically it could be larger than the elf file because it is ASCII: in the elf file the binary blobs are stored as binary, while in Intel HEX and Motorola S-record files they are ASCII hex, so each byte of raw data takes two ASCII characters (two bytes). With the right ratio of binary size to elf overhead, the -O ihex and/or -O srec output will be larger than the original .elf – old_timer Oct 12 '22 at 21:08

1 Answer

so.s

b .

.section .data
.word 0x12345678

arm-none-eabi-as so.s -o so.o
arm-none-eabi-objdump -D so.o

so.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <.text>:
   0:   eafffffe    b   0 <.text>

Disassembly of section .data:

00000000 <.data>:
   0:   12345678    eorsne  r5, r4, #120, 12    ; 0x78

arm-none-eabi-readelf -a so.o
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000004 00  AX  0   0  4
  [ 2] .data             PROGBITS        00000000 000038 000004 00  WA  0   0  1
  [ 3] .bss              NOBITS          00000000 00003c 000000 00  WA  0   0  1
  [ 4] .ARM.attributes   ARM_ATTRIBUTES  00000000 00003c 000012 00      0   0  1
  [ 5] .symtab           SYMTAB          00000000 000050 000060 10      6   6  4
  [ 6] .strtab           STRTAB          00000000 0000b0 000004 00      0   0  1
  [ 7] .shstrtab         STRTAB          00000000 0000b4 00003c 00      0   0  1

So my "binary" is 8 bytes in total, split across two sections.

-rw-rw-r-- 1 oldtimer oldtimer  560 Oct 12 16:32 so.o

8 bytes relative to 560 for the object.
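The 8-byte figure can be read off the section table mechanically. Below is a minimal Python sketch, assuming the `readelf` section table has been pasted in as text; the `loadable_bytes` helper and the regular expression are hypothetical and only as robust as this particular output layout. It sums the sizes of the sections that are both PROGBITS and flagged `A` (allocated), which are the ones that end up in a memory image.

```python
import re

# Section table copied from the readelf output above.
READELF_ROWS = """\
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000004 00  AX  0   0  4
  [ 2] .data             PROGBITS        00000000 000038 000004 00  WA  0   0  1
  [ 3] .bss              NOBITS          00000000 00003c 000000 00  WA  0   0  1
  [ 4] .ARM.attributes   ARM_ATTRIBUTES  00000000 00003c 000012 00      0   0  1
  [ 5] .symtab           SYMTAB          00000000 000050 000060 10      6   6  4
  [ 6] .strtab           STRTAB          00000000 0000b0 000004 00      0   0  1
  [ 7] .shstrtab         STRTAB          00000000 0000b4 00003c 00      0   0  1
"""

# One table row: index, name, type, address, offset, size, entry size, flags.
# The header row and the unnamed NULL row do not match and are skipped.
ROW = re.compile(
    r"\[\s*\d+\]\s+(?P<name>\S+)\s+(?P<type>\S+)\s+"
    r"(?P<addr>[0-9a-f]{8})\s+(?P<off>[0-9a-f]{6})\s+(?P<size>[0-9a-f]{6})\s+"
    r"[0-9a-f]{2}\s+(?P<flags>[A-Za-z]*)\s+\d+"
)

def loadable_bytes(table):
    """Sum the sizes of allocated PROGBITS sections in a readelf table."""
    total = 0
    for line in table.splitlines():
        m = ROW.search(line)
        if m and m["type"] == "PROGBITS" and "A" in m["flags"]:
            total += int(m["size"], 16)
    return total

print(loadable_bytes(READELF_ROWS))  # 8
```

Everything else in the 560-byte object is container overhead: headers, the symbol table and the string tables.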

Link it.

MEMORY
{
    one : ORIGIN = 0x00001000, LENGTH = 0x1000
    two : ORIGIN = 0x00002000, LENGTH = 0x1000
}
SECTIONS
{
    .text   : { *(.text*)   } > one
    .data   : { *(.data*)   } > two
}


arm-none-eabi-ld -T so.ld so.o -o so.elf
arm-none-eabi-objdump -D so.elf

so.elf:     file format elf32-littlearm


Disassembly of section .text:

00001000 <.text>:
    1000:   eafffffe    b   1000 <.text>

Disassembly of section .data:

00002000 <.data>:
    2000:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

arm-none-eabi-readelf -a so.elf

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00001000 001000 000004 00  AX  0   0  4
  [ 2] .data             PROGBITS        00002000 002000 000004 00  WA  0   0  1
  [ 3] .ARM.attributes   ARM_ATTRIBUTES  00000000 002004 000012 00      0   0  1
  [ 4] .symtab           SYMTAB          00000000 002018 000070 10      5   7  4
  [ 5] .strtab           STRTAB          00000000 002088 00000c 00      0   0  1
  [ 6] .shstrtab         STRTAB          00000000 002094 000037 00      0   0  1

Now we need 4 bytes at 0x1000 and 4 bytes at 0x2000. objcopy -O binary writes one flat image of the address space: the file starts at the lowest address that has content and ends at the highest. With this link the lowest address is 0x1000 and the highest is 0x2003, a total span of 0x1004 bytes:

arm-none-eabi-objcopy -O binary so.elf so.bin
ls -al so.bin
-rwxrwxr-x 1 oldtimer oldtimer 4100 Oct 12 16:40 so.bin

4100 = 0x1004 bytes

hexdump -C so.bin
00000000  fe ff ff ea 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  78 56 34 12                                       |xV4.|
00001004

The assumption here is that the user knows the base address is 0x1000, since the file format carries no address information, and that this is a contiguous memory image, so that the second four bytes land at 0x2000. So -O binary pads the file to fill in everything between.
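The padding rule can be modelled in a few lines of Python. This is only a sketch of the behaviour visible in the hexdump, not objcopy's actual implementation. The `flat_image` helper is hypothetical, and the two sections are typed in by hand from the disassembly above.

```python
def flat_image(sections):
    """Model a flat binary image built from (address, data) pairs."""
    base = min(addr for addr, _ in sections)                 # lowest address starts the file
    end = max(addr + len(data) for addr, data in sections)   # one past the highest byte
    image = bytearray(end - base)                            # holes start out as zeros
    for addr, data in sections:
        start = addr - base
        image[start:start + len(data)] = data
    return base, bytes(image)

sections = [
    (0x1000, bytes.fromhex("feffffea")),  # .text: b .
    (0x2000, bytes.fromhex("78563412")),  # .data: 0x12345678, little-endian
]
base, image = flat_image(sections)
print(hex(base), len(image), hex(len(image)))  # 0x1000 4100 0x1004
```

The base address is returned separately because the image itself cannot record it, which is exactly the information the consumer of a -O binary file has to supply.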

If I change to this

MEMORY
{
    one : ORIGIN = 0x00000000, LENGTH = 0x1000
    two : ORIGIN = 0x10000000, LENGTH = 0x1000
}
SECTIONS
{
    .text : { *(.text*) } > one
    .data : { *(.data*) } > two
}

You can easily see where this is headed.

ls -al so.bin
-rwxrwxr-x 1 oldtimer oldtimer 268435460 Oct 12 16:43 so.bin

The ELF stays small, because it stores each section alongside its address, but the -O binary output is 0x10000004 bytes. Only 8 of those bytes matter to me, but by its nature objcopy -O binary has to pad everything in between.
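To find which hole is responsible in a real project, the section addresses and sizes reported by `objdump -h` or `readelf -S` can be sorted and the gaps between neighbours listed. A minimal Python sketch follows, with a hypothetical `find_gaps` helper and the two sections from this example entered by hand; for a real ELF you would list only the sections that carry file contents (so not `.bss`).

```python
def find_gaps(sections):
    """Return (gap_bytes, before, after) for each hole, largest first.

    sections: list of (name, address, size) tuples.
    """
    ordered = sorted(sections, key=lambda s: s[1])
    gaps = []
    for (name_a, addr_a, size_a), (name_b, addr_b, _) in zip(ordered, ordered[1:]):
        hole = addr_b - (addr_a + size_a)
        if hole > 0:
            gaps.append((hole, name_a, name_b))
    return sorted(gaps, reverse=True)

sections = [(".text", 0x00000000, 4), (".data", 0x10000000, 4)]

for hole, before, after in find_gaps(sections):
    print(f"{hole} bytes of padding between {before} and {after}")

payload = sum(size for _, _, size in sections)
span = max(a + s for _, a, s in sections) - min(a for _, a, _ in sections)
print(payload, span)  # 8 268435460
```

A single large entry at the top of that list is usually the section that needs a different load address in the linker script.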

Since the sizes and spaces of things vary specific to your project and your linker script, no generic statements can be made relative to the size of the elf file and the size of an -O binary file.

 ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 131556 Oct 12 16:49 so.elf
 arm-none-eabi-strip so.elf
 ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 131336 Oct 12 16:50 so.elf
 arm-none-eabi-as -g so.s -o so.o
 ls -al so.o
-rw-rw-r-- 1 oldtimer oldtimer 1300 Oct 12 16:51 so.o
 arm-none-eabi-ld -T so.ld so.o -o so.elf
 ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 132088 Oct 12 16:51 so.elf
arm-none-eabi-strip so.elf
ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 131336 Oct 12 16:52 so.elf

The ELF file format does not have absolute rules on content. The consumer of the file can have rules about what you must put where, whether items with specific names have to be present, and so on. It is a somewhat open format: a container, like a cardboard box, that you can fill more or less how you like. You cannot fit a cruise ship in it, but you can put in books or toys, and sometimes you can choose how you pack them.

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 010000 000004 00  AX  0   0  4
  [ 2] .data             PROGBITS        10000000 020000 000004 00  WA  0   0  1
  [ 3] .ARM.attributes   ARM_ATTRIBUTES  00000000 020004 000012 00      0   0  1
  [ 4] .shstrtab         STRTAB          00000000 020016 000027 00      0   0  1

Even after stripping there is still extra material. If you study the file format, there is a relatively small main header that gives the number of program headers and the number of section headers, followed by that many program headers and that many section headers. Depending on the consumer(s) of the file you may, for example, need only the main header and two program headers in this case, which makes for a much smaller file (as you can see with the object version of the file).

arm-none-eabi-as so.s -o so.o
ls -al so.o
-rw-rw-r-- 1 oldtimer oldtimer 560 Oct 12 16:57 so.o
arm-none-eabi-strip so.o
ls -al so.o
-rw-rw-r-- 1 oldtimer oldtimer 364 Oct 12 16:57 so.o

readelf that

  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         6

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000004 00  AX  0   0  4
  [ 2] .data             PROGBITS        00000000 000038 000004 00  WA  0   0  1
  [ 3] .bss              NOBITS          00000000 00003c 000000 00  WA  0   0  1
  [ 4] .ARM.attributes   ARM_ATTRIBUTES  00000000 00003c 000012 00      0   0  1
  [ 5] .shstrtab         STRTAB          00000000 00004e 00002c 00      0   0  1

There are extra section headers here that we do not need, and which can perhaps be removed in the linker script. I assume that for some consumers all you would need is the two program headers:

  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2

Plus the 8 bytes and any padding for this file format.
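The header arithmetic can be checked with Python's `struct` module. This is a sketch under two assumptions: the format strings follow the ELF32 field widths from the ELF specification, and the section header table is placed after the last section's contents, aligned to 4 bytes. The second assumption is my reading of the layout; it happens to reproduce the 364-byte stripped object shown above.

```python
import struct

# ELF32 structure sizes (little-endian, no padding).
EHDR = struct.calcsize("<16sHHIIIIIHHHHHH")  # ELF header
PHDR = struct.calcsize("<8I")                # one program header
SHDR = struct.calcsize("<10I")               # one section header
print(EHDR, PHDR, SHDR)  # 52 32 40

def align(value, boundary):
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary

# Stripped so.o: the last section (.shstrtab) is at offset 0x4e, size 0x2c,
# and is followed by 6 section headers.
contents_end = 0x4e + 0x2c
stripped_object = align(contents_end, 4) + 6 * SHDR
print(stripped_object)  # 364

# Hypothetical minimal file: ELF header, two program headers, 8 bytes of payload.
minimal = EHDR + 2 * PHDR + 8
print(minimal)  # 124
```

So a consumer that only needs program headers could in principle be fed about 124 bytes for this example, before any padding the format or the tools add.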

Also note

arm-none-eabi-objcopy --only-section=.text -O binary so.elf text.bin
arm-none-eabi-objcopy --only-section=.data -O binary so.elf data.bin
ls -al text.bin
-rwxrwxr-x 1 oldtimer oldtimer 4 Oct 12 17:03 text.bin
ls -al data.bin
-rwxrwxr-x 1 oldtimer oldtimer 4 Oct 12 17:03 data.bin
hexdump -C text.bin
00000000  fe ff ff ea                                       |....|
00000004
hexdump -C data.bin
00000000  78 56 34 12                                       |xV4.|
00000004
halfer
old_timer
  • It is just a file format thing, each file format is different from another. Different rules, different overhead, different amount of information normal or possible, debug information is not in -O binary but that also does not mean it is smaller. Much less info in -O ihex and -O srec than in an elf file, but they can be larger than the elf. It is just about file formats – old_timer Oct 12 '22 at 21:10
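The ASCII overhead mentioned in the comments can be estimated with a short Python sketch. The `ihex_size` helper is hypothetical. It assumes 16 data bytes per record and two-character line endings, and it ignores extended-address and start-address records, so treat the result as a lower bound rather than what objcopy will emit exactly. Each Intel HEX data record costs 11 characters of framing (colon, length, address, type, checksum) plus two characters per data byte.

```python
def ihex_size(chunks, bytes_per_record=16, newline_len=2):
    """Estimate Intel HEX file size for a list of contiguous chunk lengths."""
    total = 0
    for length in chunks:
        full, rest = divmod(length, bytes_per_record)
        record_sizes = [bytes_per_record] * full + ([rest] if rest else [])
        for n in record_sizes:
            total += 11 + 2 * n + newline_len  # framing + hex data + line ending
    total += 11 + newline_len                  # end-of-file record
    return total

print(ihex_size([4, 4]))      # 55: two tiny sections, however far apart they are
print(ihex_size([100 * 1024]))  # 288013: roughly 2.8x the 102400 raw bytes
```

Because every record carries its own address, holes between sections cost nothing in this format, but dense data costs nearly three times its raw size.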