-1

When I use the disas command in gdb with the following code:

int main(){
   char*a;
   size_t r;
   return 1;
}

I get this result:

0x080483db <+0>:    push   %ebp
0x080483dc <+1>:    mov    %esp,%ebp
0x080483de <+3>:    mov    $0x1,%eax
0x080483e3 <+8>:    pop    %ebp
0x080483e4 <+9>:    ret 

I don't understand why there are no instructions for char*a and size_t r. How do I get the addresses of a and r? Do they even exist?

Julien
  • What "instructions" do you expect? What do you think the code would be if the declarations aren't even used? What's "t"? – Dave Newton Feb 24 '19 at 00:46
  • Sorry, I meant "r". I was expecting something telling me that the esp pointer moves with "size_t r" because the size of "size_t" equals 4. – Julien Feb 24 '19 at 00:52
  • But they're not used or referenced in any way--no reason for the compiler to care about them. – Dave Newton Feb 24 '19 at 00:54
  • The compiler removed the references to `a` and `r` because they're unused. There's no point in keeping them around. – Ken White Feb 24 '19 at 00:55
  • Pointers, ints, chars, etc. are high-level language concepts; they don't have real meaning at lower levels. Bits is bits. In your specific case, you optimized out the dead code and are now wondering where the dead code went. – old_timer Feb 24 '19 at 01:13
  • @old_timer I understand the real point, but data types may have meaning depending on the target architecture. In this case, not so much since it's trivially optimized away, but types and sizes can matter :) – Dave Newton Feb 24 '19 at 01:17
  • Only when the bits are used during execution; then they go back to being bits. Add an offset to a pointer and it's not a pointer, it's some bits being added together: two operands, indistinguishable from ints being added. – old_timer Feb 24 '19 at 01:19
  • The compiler certainly implements the desired functionality described by the high-level language, using the low-level language. – old_timer Feb 24 '19 at 01:19

3 Answers

4

By themselves, the declarations char*a; and size_t r; don't do anything; they just tell the compiler that you want to be able to use the identifiers a and r to store values whose lifetime is limited to the duration of main's execution. Most assembly instructions, on the other hand (except nops and such), do something.

If you stored and accessed values in these variables, or took their addresses and used them, in a way that isn't trivially equivalent to doing nothing, then you would see the compiler emit code to make room (typically by adjusting the stack pointer, or by pushing some registers to the stack to save their values so that extra registers are free for your data) and to store and load the values.
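
As a minimal sketch of that point (the function names use_locals and show_addresses are made up for illustration, and volatile is just one assumed way of making the accesses count), the same two locals get real storage once they are used in a way the compiler cannot discard:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical example: volatile accesses count as observable behaviour,
   so the compiler has to reserve storage for a and r and emit the
   stores and loads. */
size_t use_locals(void) {
    static char text[] = "hello";
    char *volatile a = text;          /* store into a's slot */
    volatile size_t r = sizeof text;  /* r = 6, counting the '\0' */
    return r + (size_t)(a[0] == 'h'); /* loads from both slots: 6 + 1 */
}

/* Hypothetical example: passing &a and &r to printf also forces both
   variables to live in memory. Returns 1 if the two addresses differ. */
int show_addresses(void) {
    char *a = NULL;
    size_t r = 0;
    printf("&a = %p, &r = %p\n", (void *)&a, (void *)&r);
    return (void *)&a != (void *)&r;
}
```

If you call these from main and run disas on them, you should see the stack pointer being adjusted and values moved to and from the stack, although the exact instructions depend on the compiler and flags. Once the variables exist, print &a and print &r in gdb report their addresses.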

R.. GitHub STOP HELPING ICE
  • It's slightly surprising that storage for the variables was optimized away even with optimization disabled. I would have expected that you could stop at a breakpoint (on the return statement) and [modify the variables in a debugger](//stackoverflow.com/q/53366394). Compiling with `-O0`, as the OP apparently did (I'm guessing from the fact that the code uses `EBP` as a frame pointer, whereas `-fomit-frame-pointer` is the default in modern compilers when optimizing), treats all variables somewhat like `volatile`. But I guess they don't become variables until they are initialized or used. – Peter Cordes Feb 24 '19 at 16:01
  • Thanks a lot, this was very helpful! – Julien Feb 24 '19 at 18:11
0

You need to set up your experiments so that the code you want to study is not dead code that gets optimized out.

unsigned int fun0 ( void )
{
    return(0x12345678);
}
char * fun1 ( void )
{
    char *x;
    x = (char *)0x12345678;
    return(x);
}
unsigned int fun2 ( unsigned int x )
{
    return(x+12);
}
unsigned int * fun3 ( unsigned int *x )
{
    return(x+3);
}

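which compiles to something like this on ARM. Note that fun0 and fun1 produce identical code, and so do fun2 and fun3: x+3 on an unsigned int * advances by 3*sizeof(unsigned int) = 12 bytes. The 12345678 word after each of the first two functions is constant data that the disassembler has tried to decode as an instruction.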

Disassembly of section .text:

00000000 <fun0>:
   0:   e59f0000    ldr r0, [pc]    ; 8 <fun0+0x8>
   4:   e12fff1e    bx  lr
   8:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

0000000c <fun1>:
   c:   e59f0000    ldr r0, [pc]    ; 14 <fun1+0x8>
  10:   e12fff1e    bx  lr
  14:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

00000018 <fun2>:
  18:   e280000c    add r0, r0, #12
  1c:   e12fff1e    bx  lr

00000020 <fun3>:
  20:   e280000c    add r0, r0, #12
  24:   e12fff1e    bx  lr
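
To check the fun2/fun3 result from C itself, here is a small sketch (bytes_advanced_by_3 is a made-up name). It measures how many bytes x+3 moves an unsigned int pointer. On a target where sizeof(unsigned int) is 4, such as the 32-bit ARM build above, that is 12, which is why fun3 compiles to the same add as fun2.

```c
#include <stddef.h>

/* Hypothetical helper: byte distance covered by "x + 3" when x points
   to unsigned int. Pointer arithmetic scales by the pointee size, so
   the result is 3 * sizeof(unsigned int). */
size_t bytes_advanced_by_3(void) {
    unsigned int arr[4] = {0};
    unsigned int *x = arr;
    /* Casting to char * lets us subtract the raw byte addresses. */
    return (size_t)((char *)(x + 3) - (char *)x);
}
```

At the machine level the type is gone: both functions just add the constant 12 to the bits in r0.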
old_timer
0

The thing to note here is that the C language has the as-if rule, which says that the compiled program only needs to produce the same observable behaviour as the source program.

Since the observable behaviour of your program is equivalent to that of

int main(){
    return 1;
}

that is what the compiled code does.

This is not limited to declarations, and the transformations can be arbitrarily complex. For example, the common hello world program:

#include <stdio.h>
int main(){
    printf("Hello world!\n");
}

has observable behaviour equivalent to

#include <stdio.h>
int main(){
    puts("Hello world!");
}

The latter program is effectively what you get if you compile the former with -O3:

.LC0:
    .string "Hello world!"
main:
    leaq    .LC0(%rip), %rdi
    subq    $8, %rsp
    call    puts@PLT
    xorl    %eax, %eax
    addq    $8, %rsp
    ret
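
The same rule covers larger transformations. As a hedged sketch (sum_loop and sum_formula are hypothetical names, and whether a given compiler actually performs this rewrite depends on the compiler and optimization level), an optimizer is allowed to turn the first function into the second, because no caller can observe a difference:

```c
/* Sums 1..n with a loop; unsigned arithmetic wraps on overflow. */
unsigned sum_loop(unsigned n) {
    unsigned sum = 0;
    for (unsigned i = 0; i < n; ++i)
        sum += i + 1;
    return sum;
}

/* Closed form giving the same result for every n, assuming
   unsigned long long is at least twice as wide as unsigned
   (true on common platforms). */
unsigned sum_formula(unsigned n) {
    unsigned long long wide = n;
    /* The exact sum fits in the wide type; the cast then wraps it
       the same way the loop's unsigned additions do. */
    return (unsigned)(wide * (wide + 1) / 2);
}
```

Comparing the disassembly of sum_loop with and without optimization is a quick way to see whether your compiler takes advantage of this freedom.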