10

To understand the concept of relocation, I wrote a simple chk.c program as follows:

#include<stdio.h>
main(){
        int x,y,sum;
        x = 3;
        y = 4;
        sum = x + y;
        printf("sum = %d\n",sum);
}

Its disassembly, obtained with "objdump -d chk.o", is:

00000000 <main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 e4 f0                and    $0xfffffff0,%esp
   6:   83 ec 20                sub    $0x20,%esp
   9:   c7 44 24 1c 03 00 00    movl   $0x3,0x1c(%esp)
  10:   00 
  11:   c7 44 24 18 04 00 00    movl   $0x4,0x18(%esp)
  18:   00 
  19:   8b 44 24 18             mov    0x18(%esp),%eax
  1d:   8b 54 24 1c             mov    0x1c(%esp),%edx
  21:   8d 04 02                lea    (%edx,%eax,1),%eax
  24:   89 44 24 14             mov    %eax,0x14(%esp)
  28:   b8 00 00 00 00          mov    $0x0,%eax
  2d:   8b 54 24 14             mov    0x14(%esp),%edx
  31:   89 54 24 04             mov    %edx,0x4(%esp)
  35:   89 04 24                mov    %eax,(%esp)
  38:   e8 fc ff ff ff          call   39 <main+0x39>
  3d:   c9                      leave  
  3e:   c3                      ret    

The .rel.text section, as shown by readelf, is as follows:

Relocation section '.rel.text' at offset 0x360 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000029  00000501 R_386_32          00000000   .rodata
00000039  00000902 R_386_PC32        00000000   printf

I have the following questions based on this:

1) From the second entry in the .rel.text section, I understand that the value at offset 0x39 in the .text section (the bytes fc ff ff ff here) has to be replaced with the address of the symbol at index 9 of the symbol table (which turns out to be printf). But I do not clearly understand the meaning of 0x02 (ELF32_R_TYPE) here. What does R_386_PC32 specify? Can anyone please explain its meaning clearly?

2) I also do not understand the first entry. It is not clear what needs to be replaced at offset 0x29 in the .text section, or why. Again, I want to know the meaning of R_386_32 here. I found a PDF, elf_format.pdf, but I cannot clearly understand the meaning of "Type" in the .rel.text section from it.

3) I also want to know the meaning of the assembly instruction "lea (%edx,%eax,1),%eax". I found a very good link (What's the purpose of the LEA instruction?) describing the meaning of lea, but the operand format (what the three arguments inside the parentheses are) is not clear.

If anyone can clearly explain the answers to the above questions, it will be greatly appreciated. I am still struggling to find the answers, though I have searched a lot with Google.

One more question. I have shown the symbol table entries for indices 5 and 9 below.

 Num:    Value  Size Type    Bind   Vis      Ndx Name
   5: 00000000     0 SECTION LOCAL  DEFAULT    5
   9: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND printf

The symbol index in the Info field of the first .rel.text entry is 0x05. I have shown the symbol table entry for index 5 above, but I do not understand how it tells us that the relocation refers to .rodata.

mezda

2 Answers

14

1), 2): R_386_32 is a relocation that places the absolute 32-bit address of the symbol into the specified memory location. R_386_PC32 is a relocation that places the PC-relative 32-bit address of the symbol into the specified memory location. R_386_32 is useful for static data, as shown here, since the compiler just loads the relocated symbol address into some register and then treats it as a pointer. R_386_PC32 is useful for function references since it can be used as an immediate argument to call. See elf_machdep.c for an example of how the relocations are processed.

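3) lea (%edx,%eax,1),%eax simply means %eax = %edx + 1*%eax if expressed in C syntax. The three items inside the parentheses are the base register, the index register, and the scale factor. Here, it's basically being used as a substitute for the add opcode.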

EDIT: Here's an example.

Suppose your code gets loaded into memory starting at 0x401000, that the string "sum = %d\n" ends up at 0x401800 (at the start of the .rodata section), and that printf is at 0x1400ab80, in libc.

Then, the R_386_32 relocation at 0x29 will place the bytes 00 18 40 00 at 0x401029 (simply copying the absolute address of the symbol), making the instruction at 0x401028

  401028:   b8 00 18 40 00          mov    $0x401800,%eax

The R_386_PC32 relocation at 0x39 places the bytes 43 9b c0 13 at 0x401039 (the value 0x1400ab80 - 0x40103d = 0x13c09b43 in hex), making that instruction

  401038:   e8 43 9b c0 13          call   1400ab80 <printf>

We subtract 0x40103d to account for the value of the program counter (%eip on x86), which at that point holds the address of the instruction after call.

nneonneo
  • Thanks for the reply. Can you please explain the meaning of R_386_32 and R_386_PC32 using an example, maybe the one I have taken here? One more doubt: the symbol index in the Info field of the first .rel.text entry is 0x05, and the symbol table entries for indices 5 and 9 are '5: 00000000 0 SECTION LOCAL DEFAULT 5' and '9: 00000000 0 NOTYPE GLOBAL DEFAULT UND printf'. How does entry 5 tell us that the relocation is for .rodata? Thanks again. – mezda Sep 13 '12 at 19:09
  • Can you please answer the one additional question I added above? It was not properly formatted in my comment. – mezda Sep 14 '12 at 12:58
  • Please, don't ask multipart questions like that. It isn't fair, either, to continue adding questions and demand that they be answered. Open a new question if you must ask. – nneonneo Sep 15 '12 at 02:58
  • Just a small addition to nneonneo's nice answer: According to the ELF specification, the relocation value in R_386_PC32 mode is computed as S + A - P, where S is the symbol value, A is the value currently present at the address where the relocation takes place, and P is the relocation address. Here, S - P is the jump offset from the relocation point 0x401039 to the function entry point, and adding A = 0xfffffffc (the bytes fc ff ff ff read as a little-endian value, i.e. -4) decrements it by 4 bytes, accounting for the fact that the call instruction expects the offset measured from the next instruction. – Hanno Jun 23 '15 at 17:05
5

The first relocation entry is to get the pointer to your format string ("sum = ...") in the process of setting up the call to printf. Since the .rodata section is relocated as well as the .text section, references to strings and other constant data will need fixups.

With that in mind, it would appear that the R_386_32 relocations deal with data, and the R_386_PC32 with code addresses, but the ELF spec (of which I do not have a handy copy) probably explains the various details.

The lea instruction is what the compiler chose to perform the addition for this routine. It could have chosen add or a couple of other possibilities, but that form of lea is used quite often in certain cases, because it can combine an addition with a multiplication. The result of lea (%edx,%eax,1),%eax is that %eax gets the value of %edx + 1 * %eax. The 1 can be replaced only by one of a restricted set of small integers (1, 2, 4, or 8). The original purpose of the lea instruction, "Load Effective Address", was to take a base pointer, an index, and an element size and yield the address of an element in an array. But, as you can see, compilers can choose to use it for other things as well.

twalberg
  • Thanks for the answer. I have one more question, described above. Can you answer that as well? Thanks again. – mezda Sep 14 '12 at 13:49