
I want to link in raw binary data. I'd like to either put it at a particular address, or have it link to a symbol (char* mydata, for instance) I have defined in code. Since it's not an obj file, I can't simply link it in.

A similar post (Include binary file with GNU ld linker script) suggests using objcopy with the -B bfdarch option. When I try that, objcopy responds with "architecture bfdarch unknown".

Yet another answer suggests transforming the object into a custom LD script and then including that from the main LD script. At that point I may as well keep using a C include file (which is what I am doing now), so I'd rather not do that.

Can I use objcopy to accomplish this, or is there another way?

artless noise
Brian
  • How'bout `unsigned char data[] = { 0x12, 0x34, 0x56, 0x78, ... };` instead? – Jun 23 '13 at 22:36
  • H2CO3, that's pretty much what I am doing. I have a pre-build command to read in the bin file, and spit out an H file with an array of data (like you suggest). It works, but it seems like there should be a better way of doing it. – Brian Jun 23 '13 at 23:38
  • `bfdarch` is not literal; try `arm`. This definitely works, as does using *gas*, which has a [*.incbin* directive](http://linux.web.cern.ch/linux/scientific4/docs/rhel-as-en-4/incbin.html). There are also a plethora of programs such as `hexdump` that convert binary to 'C' arrays. – artless noise Jun 24 '13 at 01:11
  • you can just use `objcopy`. See my answer. – FrankH. Jun 24 '13 at 08:42
  • Also see https://stackoverflow.com/questions/7757834/how-to-obtain-the-bfd-architecture-specification-for-the-current-platform – FrankH. Jul 15 '21 at 07:30

3 Answers


The following example works for me:

$ dd if=/dev/urandom of=binblob bs=1024k count=1
$ objcopy -I binary -O elf32-little binblob binblob.o
$ file binblob.o
binblob.o: ELF 32-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
$ nm  -S -t d binblob.o
0000000001048576 D _binary_binblob_end
0000000001048576 A _binary_binblob_size
0000000000000000 D _binary_binblob_start

I.e. there is no need to specify the BFD arch for binary data (it's only useful / necessary for code). Just say "the input is binary" and "the output is ...", and objcopy will create the file for you. Since pure binary data isn't architecture-specific, all you need to tell it is whether the output is 32bit (elf32-...) or 64bit (elf64-...), and whether it's little endian / LSB (...-little, as on ARM/x86) or big endian / MSB (...-big, as e.g. on SPARC/m68k).

Edit: Clarification on the options for objcopy:

  • the usage of the -O ... option controls:
    • bit width (whether the ELF file will be 32-bit or 64-bit)
    • endianness (whether the ELF file will be LSB or MSB)
  • the usage of the -B ... option controls the architecture the ELF file will request

You have to specify the -O ... option, but -B ... is optional. The difference is best illustrated by a little example:

$ objcopy -I binary -O elf64-x86-64 foobar foobar.o
$ file foobar.o
foobar.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped

$ objcopy -I binary -O elf64-x86-64 -B i386 foobar foobar.o
$ file foobar.o
foobar.o: ELF 64-bit LSB relocatable, AMD x86-64, version 1 (SYSV), not stripped

I.e. the output format specifier elf64-x86-64 alone doesn't tie the generated binary to a specific architecture (that's why file says no machine). Using -B i386 does, and in that case you're told this is now AMD x86-64.

The same would apply to ARM; the difference between -O elf32-little and -O elf32-littlearm -B arm is that in the former case you end up with an ELF 32-bit LSB relocatable, no machine, ... while in the latter it'll be an ELF 32-bit LSB relocatable, ARM....

There's some interdependency here as well; you have to use the -O elf{32|64}-<arch> output option (not the generic elf{32|64}-{little|big}) for -B ... to be recognized.

See objcopy --info for the list of ELF formats / BFD types that your binutils can deal with.

Edit 15/Jul/2021: So I tried a little "use":

#include <stdio.h>

extern unsigned char _binary_binblob_start[];

int main(int argc, char **argv)
{
    for (int i = 0; i < 1024; i++) {
        printf("%02X ", _binary_binblob_start[i]);
        if ((i+1) % 60 == 0)
            printf("\n");
    }
    return 0;
}

I can only get this to link with the binblob if I create binblob.o in the local architecture's object format (e.g. -O elf64-x86-64 on x86-64). Otherwise it gives the error @chen3feng points out below.

It appears this should be possible by having gcc pass options through to the linker, per https://stackoverflow.com/a/7779766/512360 - but if I try that verbatim, I get:

$ gcc use-binblob.c -Wl,-b -Wl,elf64-little binblob.o
/usr/bin/ld: skipping incompatible /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/libgcc.a when searching for -lgcc
/usr/bin/ld: cannot find -lgcc
/usr/bin/ld: skipping incompatible /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/../../../../lib64/libgcc_s.so.1 when searching for libgcc_s.so.1
/usr/bin/ld: skipping incompatible /lib/x86_64-linux-gnu/libgcc_s.so.1 when searching for libgcc_s.so.1
/usr/bin/ld: skipping incompatible /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 when searching for libgcc_s.so.1
/usr/bin/ld: skipping incompatible /lib/x86_64-linux-gnu/libgcc_s.so.1 when searching for libgcc_s.so.1
/usr/bin/ld: skipping incompatible /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 when searching for libgcc_s.so.1
/usr/bin/ld: skipping incompatible /usr/local/lib64/libgcc_s.so.1 when searching for libgcc_s.so.1
/usr/bin/ld: cannot find libgcc_s.so.1
/usr/bin/ld: skipping incompatible /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/libgcc.a when searching for -lgcc
/usr/bin/ld: cannot find -lgcc
collect2: error: ld returned 1 exit status

or, with the arguments swapped:

$ gcc -Wl,-b -Wl,elf64-little binblob.o use-binblob.c
/usr/bin/ld: /tmp/cczASyDb.o: Relocations in generic ELF (EM: 62)
/usr/bin/ld: /tmp/cczASyDb.o: Relocations in generic ELF (EM: 62)
/usr/bin/ld: /tmp/cczASyDb.o: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status

and if I go "pure binary", this gives:

$ gcc use-binblob.c -Wl,-b -Wl,binary binblob
/usr/bin/ld: /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/libgcc.a:(.data+0x0): multiple definition of '_binary__usr_local_lib_gcc_x86_64_linux_gnu_10_2_0_libgcc_a_start'; /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/libgcc.a:(.data+0x0): first defined here
/usr/bin/ld: /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/libgcc.a:(.data+0x9445f6): multiple definition of '_binary__usr_local_lib_gcc_x86_64_linux_gnu_10_2_0_libgcc_a_end'; /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/libgcc.a:(.data+0x9445f6): first defined here
/usr/bin/ld: /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/../../../../lib64/libgcc_s.so:(.data+0x0): multiple definition of '_binary__usr_local_lib_gcc_x86_64_linux_gnu_10_2_0_____________lib64_libgcc_s_so_start'; /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/../../../../lib64/libgcc_s.so:(.data+0x0): first defined here
/usr/bin/ld: /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/../../../../lib64/libgcc_s.so:(.data+0x84): multiple definition of '_binary__usr_local_lib_gcc_x86_64_linux_gnu_10_2_0_____________lib64_libgcc_s_so_end'; /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/../../../../lib64/libgcc_s.so:(.data+0x84): first defined here
/usr/bin/ld: /lib/x86_64-linux-gnu/Scrt1.o: in function '_start': (.text+0x16): undefined reference to '__libc_csu_fini'
/usr/bin/ld: (.text+0x1d): undefined reference to '__libc_csu_init'
/usr/bin/ld: (.text+0x2a): undefined reference to '__libc_start_main'
/usr/bin/ld: /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/crtbeginS.o: in function 'deregister_tm_clones': crtstuff.c:(.text+0xa): undefined reference to '__TMC_END__'
/usr/bin/ld: /usr/local/lib/gcc/x86_64-linux-gnu/10.2.0/crtbeginS.o: in function 'register_tm_clones': crtstuff.c:(.text+0x3a): undefined reference to '__TMC_END__'
/usr/bin/ld: /tmp/ccF1Pxfc.o: in function `main': use-binblob.c:(.text+0x3a): undefined reference to 'printf'
/usr/bin/ld: use-binblob.c:(.text+0x6f): undefined reference to 'putchar'
/usr/bin/ld: a.out: hidden symbol '__TMC_END__' isn't defined
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

The missing reference to _binary_binblob_start is expected from the latter alright, but the remainder are errors related to linking in libc and the basic runtime; I do not currently know how to resolve this. It should be possible via linker mapfiles, by declaring target (file-) specific options, but as of this writing I have not yet figured out how.

FrankH.
  • Looks like a good way to do it. I will try this out. For the version of the toolchain I am using, the output type is "elf32-littlearm" – Brian Jun 24 '13 at 11:44
  • If you're working on an embedded platform, you probably want your data to stay in flash instead of being loaded into ram. Add the following to fix it: --rename-section .data=.rodata – escrafford May 15 '14 at 17:14
  • arm-none-eabi-objcopy -I binary -O elf32-littlearm -B arm --rename-section .data=.rodata binblob binblob.o – escrafford May 15 '14 at 17:15
  • But without the `-B` option, linker will complain: ``` ../x86_64-pc-linux-gnu/bin/ld: unknown architecture of input file `binblob.o' is incompatible with i386:x86-64 output collect2: error: ld returned 1 exit status ``` any thing wrong? – chen3feng Jul 14 '21 at 04:20
  • @chen3feng I don't remember which toolchain I experimented with when I wrote the original answer; currently re-testing, it appears that both "normal" and "gold" GNU linkers absolutely want "compatible arch" to link. I do not require `-B ...` but do need to use the arch-specific object types (`elf64-x86-64` etc). By the answer on https://stackoverflow.com/questions/7757834/how-to-obtain-the-bfd-architecture-specification-for-the-current-platform it _appears_ that it should be possible; the example there errors for me in a test (unable to link with libc, for strange internal symbol errors) – FrankH. Jul 15 '21 at 07:29

Another approach might be to use xxd.

xxd -i your_data your_data.c

In the file you'll get two symbols, unsigned char your_data[] and unsigned int your_data_len. The first is a huge array containing your data; the second is the length of that array.

Compiling the generated C file can take a while, so if you use a build system or Makefile, set up its dependencies so the file isn't recompiled unnecessarily.

xxd should be part of the vim (vim-common) package of your Linux distribution.

auselen

A quick way to do it would be to put the data in its own .c file (.c, not .h) so that it becomes a .o by itself. Then, in the linker script, you can define a specific memory region and a section entry for that .o file and put it wherever you want.

MEMORY
{
...
BOB : ORIGIN = 0x123400, LENGTH = 0x200
...
}
SECTIONS
{
...
TED : { mydata.o } > BOB
...
}
old_timer