2

I know a bit about assembly, so let me first present the codes and then explain my thinking.

#This is the Assembly version.
pushq   %rbp
movq    %rsp, %rbp
movl    $2, -4(%rbp)
movl    $3, -8(%rbp)
movl    $5, %eax
popq    %rbp
ret


#This is the C version.
int twothree() {
    int a = 2;
    int b = 3;

    return 2 + 3;
}

Alright, the first thing that strikes me is that we do not use the variables a and b as a + b, so they are unnecessary; we sum the integers directly. Yet if computers were able to understand that, I guess it would be really scary. So my question is: how does this assembly code work without any addl or similar command? We directly move the immediate (constant) integer 5 into the eax register.

Also, a quick question: what happens to the variables a and b after the last two lines? Are their positions on the stack (or maybe we can call them the 'registers' they used as memory) free now, as when we use malloc + free? Is that true, or at least logical? I guess popq %rbp is the command for closing the stack.

As I said, I am not an expert in assembly, so most of this is just guesswork. Thanks!

Peter Cordes
ihatec
  • This is called "optimization", and is done at the compiler level. The compiler is able to understand that `2+3` is a constant expression, and that is not scary at all, because it is something other programmers have written it to do. – Eugene Sh. Apr 11 '22 at 17:00
  • The *compiler* added `2 + 3` and generated code with just the `5` in it (and the unused variables). It is allowed to do that, if the result is indistinguishable. – Weather Vane Apr 11 '22 at 17:00
  • Alright. So I exaggerated the situation. – ihatec Apr 11 '22 at 17:01
  • Modern compilers are indeed smart enough to figure out that the function always returns 5 and doesn't use the variables, so they will not bother generating useless code. The C language standard says the compiler has to produce code that acts *as if* it were doing exactly what you told it, but otherwise it's free to do it a different way. – Lee Daniel Crocker Apr 11 '22 at 17:01
  • What about my explanation about a and b? The stack shrinks, so they are basically gone? – ihatec Apr 11 '22 at 17:04
  • `malloc/free` are _not_ involved because `a/b` are on the stack or in registers. If you had compiled with optimization the result would be two insts `movl $5,%eax ; ret`. The compiler would warn about a/b being initialized but unused – Craig Estey Apr 11 '22 at 17:07

2 Answers

5

The compiler saw that you were adding two constants, 2 + 3. It calculated 2 + 3 = 5 at compile time and put 5 in the assembly code. This is called "constant folding".

I guess you have optimization turned off in your compiler since it didn't delete the useless variables a and b. But constant folding is very easy for the compiler (unlike other kinds of optimization) and useful, so it seems that the compiler is doing it even when you don't turn on optimization.

As you figured out, the assembly code does not add 2 and 3, because there is no addl or similar instruction. It just does return 5;
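To make the folding concrete, here is a minimal sketch in C. The `_Static_assert` line only compiles because `2 + 3` is an integer constant expression, which the compiler has to be able to evaluate itself. The function `twothree_vars` is a hypothetical variant that is not in the question; it is added for contrast because it really reads the variables.

```c
/* 2 + 3 is an integer constant expression, so the compiler must be able to
   evaluate it at compile time; this check is resolved before any code runs. */
_Static_assert(2 + 3 == 5, "2 + 3 is a compile-time constant");

/* Same shape as the function in the question. */
int twothree(void) {
    int a = 2;      /* stored but never read, as in the question */
    int b = 3;      /* stored but never read, as in the question */
    (void)a;        /* silence unused-variable warnings */
    (void)b;
    return 2 + 3;   /* typically emitted as a single: movl $5, %eax */
}

/* Hypothetical variant (not in the question) that reads the variables. */
int twothree_vars(void) {
    int a = 2;
    int b = 3;
    return a + b;   /* unoptimised builds typically load both and add them */
}
```

Both functions return 5. In an unoptimised build you would typically expect an add instruction only in the second one, while an optimising build usually reduces both to the same code.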

user253751
  • The main point of `-O0` "no optimization" is to give [consistent debugging, and compile quickly](https://stackoverflow.com/questions/53366394/why-does-clang-produce-inefficient-asm-with-o0-for-this-simple-floating-point). Not to make intentionally-slow code; that's only a common side-effect. Compile-time eval of constant *expressions* involving numeric literals is at least as cheap as manipulating the data structures necessary to emit asm instructions that do the calculation at run-time. – Peter Cordes Apr 11 '22 at 17:45
  • I guess you could say this is an optimization, but some compilers (notably GCC) do far more within a single expression even at `-O0` (which [doesn't literally mean "no optimization"](https://stackoverflow.com/questions/33278757/disable-all-optimization-options-in-gcc)), e.g. it will compile `int y = x / 10;` into a multiplicative inverse, instead of actually using a `div` instruction, because that's how it always compiles integer division by a constant. (except at `-Os`). Some compilers are more intentionally braindead in debug builds, e.g. `if (0 < 1)` might put constants in regs to compare. – Peter Cordes Apr 11 '22 at 17:47
2

There are no commands or codes in assembly programming. Instead, assembly code (uncountable) comprises instructions and directives.

Where do you see a use of malloc or free? These are functions for managing dynamic memory, which is not something your program uses. If either of these functions were used, you'd have a call malloc or call free instruction in the code somewhere. The variables a and b are both in automatic storage, i.e. on the stack.
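The difference between the two kinds of storage can be sketched in C as follows. Both function names are hypothetical and chosen only for this illustration; only the second function involves malloc and free.

```c
#include <stdlib.h>  /* malloc, free */

/* Automatic storage: a and b exist only while the function runs.
   Nothing is requested from, or returned to, the allocator. */
int sum_automatic(void) {
    int a = 2;
    int b = 3;
    return a + b;
}

/* Dynamic storage: the memory is requested and released explicitly. */
int sum_dynamic(void) {
    int *p = malloc(2 * sizeof *p);   /* room for two ints on the heap */
    if (p == NULL) {
        return -1;                    /* allocation failed */
    }
    p[0] = 2;
    p[1] = 3;
    int result = p[0] + p[1];
    free(p);                          /* must be released by hand */
    return result;
}
```

In an unoptimised build, only `sum_dynamic` would be expected to contain call instructions for malloc and free; an optimising compiler may remove even those.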

Now what happens in your code is that the compiler has performed constant folding to emit code as if you wrote

#This is the C version.
int twothree() {
    int a = 2;
    int b = 3;

    return 5;
}

This is something mainstream compilers such as GCC and Clang do regardless of optimisation flags. So indeed, no addition is happening at run time. It was already performed during constant folding at compile time.

Also, a quick question: what happens to the variables a and b after the last two lines? Are their positions on the stack (or maybe we can call them the 'registers' they used as memory) free now, as when we use malloc + free? Is that true, or at least logical? I guess popq %rbp is the command for closing the stack.

The variables were stored in the red zone, a 128-byte region of memory below the stack pointer that the x86-64 System V ABI lets a function use as scratch space without explicitly allocating it. That is why the code contains no subq or addq on %rsp: no code is needed to allocate or release storage for them.

Now, there is no such thing as “closing the stack.” The stack is a region in memory. The top of the stack is pointed to by the stack pointer rsp whereas the base pointer rbp points to the bottom of the current stack frame. You'll often see code like

push %rbp
mov %rsp, %rbp
...
pop %rbp

to establish and tear down a stack frame for the function at its beginning and end. Read an assembly tutorial for more details.
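One way to observe frames from C is sketched below. It assumes GCC or Clang, since `__builtin_frame_address` and `__attribute__((noinline))` are compiler extensions rather than standard C, and it assumes a stack that grows towards lower addresses, as on x86-64. The function names are hypothetical.

```c
#include <stdint.h>  /* uintptr_t */

/* Returns the frame address of this call (GCC/Clang builtin). */
__attribute__((noinline))
uintptr_t inner_frame(void) {
    return (uintptr_t)__builtin_frame_address(0);
}

/* Compares its own frame address with that of a function it calls.
   Assumption: the stack grows downwards, so the callee's frame is lower. */
__attribute__((noinline))
int callee_frame_is_lower(void) {
    uintptr_t mine = (uintptr_t)__builtin_frame_address(0);
    uintptr_t callee = inner_frame();
    return callee < mine;
}
```

On x86-64, `callee_frame_is_lower()` would be expected to return 1, because the call pushes a return address and the callee then pushes the caller's rbp before setting up its own frame.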

zwol
fuz
  • The 'malloc+free' thing is just an intuitive thought. Thanks for helping. – ihatec Apr 11 '22 at 17:10
  • @amdryzen7000 Then say “allocated” and “released.” If you use the specific names of the C functions for dynamic memory management, then I'll have to assume that that is what you meant. It is very important to use the correct terms when asking questions, otherwise people will have a hard time understanding what you meant. Take your time to research the correct words first if you are unsure. – fuz Apr 11 '22 at 17:14
  • I was not unsure. I got your point, the missing part in my explanation is that I did not add "these comments of mine are intuitive, not directly pointing anything.". I will be more careful next time. Thank you for both answer + comments. Still new to the website. – ihatec Apr 11 '22 at 17:53