
I am trying to learn how ELF files are structured and possibly how to make one by hand.

I am working on an aarch64 Linux system, and the ELF files I am inspecting are in elf64-littleaarch64 format.

I am learning on my own, and I got stuck on a couple of questions...

  1. When I run xxd code, the first number on each line of the output is the offset of those bytes in the file. But when I run objdump -D code, the first number is something like 4000b0, which corresponds to 000000b0 in xxd. Why is there a 4 at the beginning?
  2. In objdump, the machine code is for example 11000a94, which 'means' add w20, w20, #2 in assembly. I know that 11 is the opcode, but what does 000a94 mean? I thought it would be the operands, but I am adding the value 2 and can't find the number 2 in it.

If you can recommend a good article, or can help explain this, I would be very grateful!

dev_null

2 Answers

  1. xxd shows the offset of the bytes within the file on disk. objdump -D shows (tentatively) the address in memory where those bytes will be loaded when the program is run. It is common for them to differ by a round number. In particular, 0x400000 may correspond to one higher-level page table entry; see Why Linux/gnu linker chose address 0x400000? which is for x86-64 but I think ARM64 is similar (haven't checked). It doesn't have anything to do with the fact that 0x40 is ASCII @; that's just a coincidence.

    Note that if ASLR is in use, the actual memory address will be randomly chosen every time the program is run, and will not match what objdump shows you, though the difference will still be a multiple of the page size.
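The offset-to-address relationship can be sketched in a few lines of Python. This is only an illustration: p_offset, p_vaddr and p_filesz are the real program header field names (readelf -l prints them), but the segment values below are assumed so that they match the numbers in the question, not read from the actual file.

```python
from typing import NamedTuple, Optional, Sequence


class Segment(NamedTuple):
    p_offset: int  # where the segment starts in the file
    p_vaddr: int   # where the segment is mapped in memory
    p_filesz: int  # number of bytes of the segment stored in the file


def file_offset_to_vaddr(offset: int, segments: Sequence[Segment]) -> Optional[int]:
    """Return the virtual address for a file offset, or None if no segment covers it."""
    for seg in segments:
        if seg.p_offset <= offset < seg.p_offset + seg.p_filesz:
            return seg.p_vaddr + (offset - seg.p_offset)
    return None


# Hypothetical loadable segment: file offset 0 mapped at 0x400000 (assumed values).
text = Segment(p_offset=0x0, p_vaddr=0x400000, p_filesz=0x1000)

print(hex(file_offset_to_vaddr(0xB0, [text])))
```

With those assumed values, the xxd offset 0xb0 maps to the objdump address 0x4000b0, so the leading 4 is just the segment's load address added to the file offset.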

Nate Eldredge

Well, I asked this question too quickly, but now I will answer it myself.

  1. 40 at the beginning of the addresses in objdump is the hex representation of the char "@", which means "at" and points to an address, very simple!
  2. Little Endian has CPU addresses stored in 5 bits instead of 6 or 8. That means that I should look at the binary value of the objdump code: 11000a94 --> 10001000000000000101010010100, which can be divided into [10001][00000000000010][10100][10100], meaning [opcode][value][first address][second address]

Both answers are wrong, see the accepted answer. I will still leave them here, though.

dev_null
  • I'm afraid #1 is not right, see my answer. – Nate Eldredge May 01 '21 at 22:57
  • For #2, these 5-bit numbers are *register* numbers. You wouldn't normally call them "CPU addresses" because that is too easy to confuse with *memory* addresses, which these are not. It also has nothing to do with big or little endian. Full details about ARM64 instruction encoding are in the ARMv8 Architecture Reference Manual, C4.1. – Nate Eldredge May 01 '21 at 22:58
  • Yup, since I am learning on my own rather than from some source, I don't really know the right names for things. Thank you for your answer – dev_null May 01 '21 at 23:13
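
To make the point about 5-bit register numbers concrete, here is a rough Python sketch that splits 11000a94 into bit fields. It assumes the ADD (immediate) layout as I understand it from the Architecture Reference Manual (sf, op, S, a fixed 100010 pattern, sh, imm12, Rn, Rd); the function name decode_add_imm is made up for this example, and the byte order 94 0a 00 11 is what a little-endian file would be expected to contain, not something taken from the question.

```python
def decode_add_imm(word: int) -> dict:
    """Split a 32-bit word into the fields of ADD (immediate), assumed layout."""
    return {
        "sf": (word >> 31) & 0x1,      # 0 = 32-bit (w registers), 1 = 64-bit (x registers)
        "op": (word >> 30) & 0x1,      # 0 = add, 1 = sub
        "S": (word >> 29) & 0x1,       # 1 = also set the condition flags
        "fixed": (word >> 23) & 0x3F,  # fixed pattern 0b100010 for this instruction class
        "sh": (word >> 22) & 0x1,      # 1 = shift the immediate left by 12
        "imm12": (word >> 10) & 0xFFF, # the immediate value
        "Rn": (word >> 5) & 0x1F,      # source register number
        "Rd": word & 0x1F,             # destination register number
    }


# Bytes as they would sit in a little-endian file; objdump prints them as one word.
raw = bytes.fromhex("940a0011")
word = int.from_bytes(raw, "little")
fields = decode_add_imm(word)

print(hex(word))
print(fields)
```

Here imm12 comes out as 2 and both Rn and Rd come out as 20, which matches add w20, w20, #2. The 2 is hard to spot in the hex because the immediate starts at bit 10, so it does not line up with a hex digit boundary.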