
I am trying to follow an exercise in the book PC Assembly Language by Paul Carter. http://pacman128.github.io/pcasm/

I'm trying to run the program from section 1.4, page 23, on Ubuntu 18. The files are all available on the GitHub site above.

Since the original code is 32-bit, I assemble using

nasm -f elf32

for first.asm and asm_io.asm to get the object files. I also compile driver.c.

I use the linker from gcc and run

gcc -m32 -o first first.o asm_io.o driver.o 

but it keeps giving me a bunch of errors like

undefined reference to '_scanf'
undefined reference to '_printf'

(Note: _printf appears instead of printf because asm_io.asm does some name conversion to stay compatible between Windows and Linux.)

I don't know why these errors are appearing. I also tried running the linker directly

ld -m elf_i386 -e main -o first first.o driver.o asm_io.o -I /lib/i386-linux-gnu/ld-linux.so.2

and many variations, since it seems that it's not linking with the C libraries.

Any help? I've been stuck on this for a while and couldn't find a solution in similar questions.

Peter Cordes
ackbar03

1 Answer


Linux doesn't prepend _ to names when mapping from C to asm symbol names in ELF object files (see footnote 1).

So call printf, not _printf, because there is no _printf in libc.

Whatever "compatibility" code did that is doing it wrong for Linux. Only 32-bit Windows and OS X use _printf; Linux uses printf.

So either you've misconfigured something or defined the wrong setting, or it requires updating / porting to Linux.
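To see which names an object file is actually asking the linker to resolve, nm from binutils can list its undefined symbols. The sketch below assumes GNU nm is installed; underscore_refs is a hypothetical helper name, and the pattern is only a rough filter for names like _printf, not a complete check.

```shell
# Hypothetical helper: print undefined symbols in an object file that begin
# with a single underscore followed by a lowercase letter (e.g. _printf).
underscore_refs() {
    # nm -u lists only undefined symbols; the symbol name is the last field
    nm -u "$1" | awk '{ print $NF }' | grep '^_[a-z]'
}
```

Something like underscore_refs asm_io.o would be one way to spot the stray _printf / _scanf references from the question. Legitimate names such as _exit also match the pattern, so treat the output as a hint rather than a verdict.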


Footnote 1: In ancient history (like over 20 years ago), Linux with the a.out file format did use leading underscores on symbol names.


Update: the library uses the NASM preprocessor to %define _scanf scanf and so on, but it requires you to manually define ELF_TYPE by assembling with nasm -d ELF_TYPE.
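For the files in the question, the rebuild described in this update might look like the sketch below. It assumes first.asm, asm_io.asm and driver.c (plus whatever they include) are in the current directory and that 32-bit C libraries are installed for gcc -m32. The wrapper name build_first is hypothetical, and only asm_io.asm is assumed to need the ELF_TYPE define.

```shell
# Sketch: rebuild the example with ELF_TYPE defined when assembling asm_io.asm.
# build_first is a hypothetical wrapper name, not part of the book's files.
# -d ELF_TYPE turns on the "%define _printf printf" style mappings in asm_io.asm;
# first.asm is assumed not to test ELF_TYPE itself.
build_first() {
    nasm -f elf32 -d ELF_TYPE asm_io.asm &&
    nasm -f elf32 first.asm &&
    gcc -m32 -c driver.c &&
    gcc -m32 -o first first.o asm_io.o driver.o
}
```

The && chain stops at the first failing step, so a link error is not hidden by later commands. If the build succeeds, ./first runs the program.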

They could have detected ELF32 or ELF64 output formats on their own, because NASM pre-defines __OUTPUT_FORMAT__. Someone should submit a pull-request to make this detection automatic with code something like this:

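; Derive ELF_TYPE from the output format NASM was invoked with
%ifidn __OUTPUT_FORMAT__, elf32
  %define  ELF_TYPE 32
%elifidn __OUTPUT_FORMAT__, elf64
  %define  ELF_TYPE 64
%endif

; The existing ELF-specific definitions stay under the same guard
%ifdef ELF_TYPE
...
%endif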
Peter Cordes
  • I was just gonna point out that the asm_io.asm file the author included (it has some functions used for debugging) has a section `%ifdef ELF_TYPE` / `%define _scanf scanf` / `%define _printf printf`, but then I took a closer look and the comments say to assemble with `nasm -f elf -d ELF_TYPE asm_io.asm`, and uh... yeah, that solved the problem lol, stupid me. So I guess thanks haha, and for the very prompt answer. I'll mark this as the right answer – ackbar03 Nov 10 '18 at 10:00
  • @ackbar03: They could have used `%ifidn __OUTPUT_FORMAT__, elf32` (and also check for `elf64`) to avoid you having to manually do anything on the NASM command line. Someone should send in a pull request. [How to detect architecture in NASM at compile time to have one source code for both x64 and x86?](https://stackoverflow.com/a/29891660) – Peter Cordes Nov 10 '18 at 10:06
  • Well, there's probably no `_printf` in the C library, but just to cope with legacy code, there probably should be :) Let's vote for inclusion!!! :) – Luis Colorado Nov 15 '18 at 09:15
  • @LuisColorado: I don't think it makes any sense to double the size of the dynamic symbol table (slowing down dynamic linking on every process startup) by having a legacy alternate name for every symbol in libc and libm. (And where do you stop?). It wouldn't even be possible: `_exit(2)` and `exit(3)` are separate functions. See [Syscall implementation of exit()](https://stackoverflow.com/q/46903180) – Peter Cordes Nov 15 '18 at 11:13
  • @PeterCordes, almost completely in agreement. Both problems you describe can be solved, but I agree with you that making a specific implementation "portable" is something of a contradiction. Implementations have their own particularities. The GNU approach of dropping the leading `_` from identifiers allows high-level C to access the identifiers without it, which were hidden _until then_. The same criteria (or reasoning) used to introduce it can be used to eliminate it. – Luis Colorado Nov 21 '18 at 06:28
  • @LuisColorado: interesting point. I'd never considered why Linux/ELF changed to not prefix with `_`. While commenting the other day, I wondered if it was just for minor file-size reasons (1 fewer byte in every symbol). Maybe also better behaviour for string comparisons if a mismatch can sometimes be detected in the first byte instead of 2nd? Or better properties for a radix trie or other data structure that holds a set of identifiers. You can still create symbol names you can't use as C variables, e.g. containing `.`. Or use local labels that don't appear in the object file at all. – Peter Cordes Nov 21 '18 at 06:45