12

I saw this question, but I didn't find my answer in it.

So, why would I prefer to use `add esp, 4` or `add esp, 8` instead of using `pop` one or two times? Is there any difference (performance, safety, etc.), or is it just a matter of personal choice?

Peter Cordes
Kiril Kirov
  • It depends on whether you have a spare register for the `pop`. If not then use `add esp`. – Paul R May 05 '11 at 21:05
  • Haaa, that is a nice point, I haven't thought about that.. – Kiril Kirov May 05 '11 at 21:06
  • Related: [Why does this function push RAX to the stack as the first operation?](https://stackoverflow.com/q/37773787) - it can be more efficient to use a dummy `pop` into a dead register, on modern CPUs where load ports are usually not saturated, and where a stack-sync uop would probably be needed for an `add`. See my answer on that linked question. – Peter Cordes Jan 06 '22 at 18:24

3 Answers

21

`pop` also does `add esp, 4`; it simply stores whatever is on top of the stack into its operand first. If you need the value that's on the stack, `pop` is probably faster than `mov wherever, [esp]; add esp, 4`, but if you only need to clear the stack, `add esp, 4` will be fine.
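
As a rough illustration of the equivalence described above, here is a minimal C model of the stack. Everything in it is hypothetical and made up for this sketch (`Stack`, `stk_push`, `stk_pop`, `stk_peek`, `add_esp`, `pop_matches_mov_add`), and `esp` counts 4-byte slots rather than bytes to keep it short. It models only the architectural effect, not timing.

```c
#include <assert.h>
#include <stdint.h>

enum { SLOTS = 16 };

/* Hypothetical model of a 32-bit x86 stack. `esp` is an index into
   `mem`, counted in 4-byte slots; the stack grows toward index 0. */
typedef struct {
    uint32_t mem[SLOTS];
    int esp;
} Stack;

Stack stack_new(void) {
    Stack s = { {0}, SLOTS };   /* empty: esp sits just past the last slot */
    return s;
}

/* push v : esp -= 4, then store v at [esp] */
void stk_push(Stack *s, uint32_t v) {
    assert(s->esp > 0);
    s->esp -= 1;
    s->mem[s->esp] = v;
}

/* pop dst : load [esp] into dst, then esp += 4 */
uint32_t stk_pop(Stack *s) {
    assert(s->esp < SLOTS);
    uint32_t v = s->mem[s->esp];
    s->esp += 1;
    return v;
}

/* mov dst, [esp] : load the top of the stack without moving esp */
uint32_t stk_peek(const Stack *s) {
    assert(s->esp < SLOTS);
    return s->mem[s->esp];
}

/* add esp, 4*slots : adjust the pointer only, no memory access */
void add_esp(Stack *s, int slots) {
    assert(s->esp + slots <= SLOTS);
    s->esp += slots;
}

/* Returns 1 if `pop eax` leaves the same value and the same esp
   as `mov eax, [esp]` followed by `add esp, 4`. */
int pop_matches_mov_add(uint32_t value) {
    Stack a = stack_new(), b = stack_new();
    stk_push(&a, value);
    stk_push(&b, value);

    uint32_t eax_a = stk_pop(&a);      /* pop eax        */
    uint32_t eax_b = stk_peek(&b);     /* mov eax, [esp] */
    add_esp(&b, 1);                    /* add esp, 4     */

    return eax_a == eax_b && a.esp == b.esp;
}
```

In this model, calling `add_esp(&s, 2)` after two pushes restores `esp` exactly as two calls to `stk_pop` would, just without reading the stored values.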

BlackBear
2

Generally, the `pop` instruction is not equivalent to `add esp, N`.

`pop` is used to remove data from the stack and store it in some register; it's also agnostic to which direction the stack grows in, though that's usually not an issue.

Manually adding to or subtracting from the stack pointer, `esp`, doesn't preserve the removed data in a register. It would most likely be more efficient, assuming you don't need to do anything with the data being removed from the stack.
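
To make the difference concrete, here is a small C sketch that models two scratch registers alongside the stack. The names (`Cpu`, `cpu_new`, `push`, `cleanup_with_pops`, `cleanup_with_add`) and the initial register values are assumptions invented for the illustration, and `esp` again counts 4-byte slots. It shows only which state each approach changes, not which one is faster.

```c
#include <assert.h>
#include <stdint.h>

enum { SLOTS = 8 };

/* Hypothetical model: a tiny stack plus two scratch registers. */
typedef struct {
    uint32_t mem[SLOTS];
    int esp;              /* top-of-stack index, in 4-byte slots */
    uint32_t ecx, edx;    /* modeled registers with recognizable values */
} Cpu;

Cpu cpu_new(void) {
    Cpu c = { {0}, SLOTS, 0x1111u, 0x2222u };
    return c;
}

/* push v : esp -= 4, then store v at [esp] */
void push(Cpu *c, uint32_t v) {
    assert(c->esp > 0);
    c->esp -= 1;
    c->mem[c->esp] = v;
}

/* pop ecx ; pop edx : frees two slots and overwrites both registers */
void cleanup_with_pops(Cpu *c) {
    assert(c->esp + 2 <= SLOTS);
    c->ecx = c->mem[c->esp];
    c->esp += 1;
    c->edx = c->mem[c->esp];
    c->esp += 1;
}

/* add esp, 8 : frees two slots, reads no memory, changes no other register */
void cleanup_with_add(Cpu *c) {
    assert(c->esp + 2 <= SLOTS);
    c->esp += 2;
}
```

After pushing the same two values onto two fresh `Cpu` models, both cleanup functions leave `esp` at the same position. Only `cleanup_with_pops` changes `ecx` and `edx`, which is why the `pop` approach needs registers whose contents are no longer needed.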

Collin Dauphinee
1

`pop` loads data from memory (the stack, addressed by `ss:[esp]`) into a general-purpose register, memory location, or segment register. Either way, `pop` uses the processor's load unit to access the stack, while `add esp` does not. Processors since the Pentium Pro perform out-of-order execution: each cycle they execute as many instructions as they have free execution units for, provided those instructions can run simultaneously, and they reorder instructions where necessary to keep the units fully used.

Since most processors have just two load units, if you don't need the data on the stack (i.e. you just want to skip over it), `add esp` is the better choice: it is a register-only operation that does not occupy a load unit, so the processor can use that load unit for something else in the meantime.

Maxim Masiutin
  • `add esp` may require the CPU to use a stack-sync uop, if the previous use of ESP was a stack operation like `ret` or `push`, rather than an explicit reference like `mov ecx, esp`. If there's going to be any explicit use of `esp` after this pop or add operation, then a stack sync uop will be needed at some point anyway, otherwise you may be able to avoid one. See [Why does this function push RAX to the stack as the first operation?](https://stackoverflow.com/q/37773787) for why modern compilers use one dummy push or pop if that's all that's needed. – Peter Cordes Jan 06 '22 at 18:22