I was reading Jeff Duntemann's Assembly Language Step-by-Step, and I am confused about how some of the conditional jumps work. I understand that CMP is used to compare two values using subtraction and then throws away the result to just set the flags.

Is there any way to figure out which flags need to be set or unset? I understand that JE and JNE look at whether ZF is set, but I'm unsure about the other branching operations.

Here is the part I am stuck on:

ClearLine:
    pushad                  ; Save all caller’s GP registers
    mov edx,15              ; We’re going to go 16 pokes, counting from 0
    .poke:  mov eax,1   
    sub edx,1
    jae .poke               ; Loop back if EDX >= 0
    popad                   
    ret

Why does JAE loop back if EDX >= 0? Wouldn't it loop back if EDX >= 1? After all, the SUB operation is just like the CMP operation but with the extra step of saving the result. So by basically saying CMP edx,1 aren't we saying "jump if first operand (EDX) is greater or equal to second operand (1)"? But when I test it in a debugger, it shows it looped 16 times, not 15. I don't understand why that is.

nanoman
  • Yes, you are correct. You can just test it in a debugger. Also, `jae` is an unsigned comparison, so `>= 0` makes no sense, that's true for all unsigned numbers. – Jester Jul 23 '17 at 01:28
  • When edx=1, the `sub` will do: edx = 0, ZF=1, CF=0. So the `jae` will loop to `.poke`. When edx=0, the `sub` will result in edx=0xFFFFFFFF, ZF=0, CF=1, which is "jnae" or "jb", so the jae will be skipped. So the jae does jump back for edx values (ahead of sub) from 15 to 1 => 15 times => +1 non-jump (edx=0 ahead of sub) => the loop will be executed 16 times. – Ped7g Jul 23 '17 at 01:46
  • The statement makes sense if it refers to the value *after* the subtraction. If edx was 1 before the `sub`, it's gonna be 0 after, but the "comparison" uses the before value. Using C syntax that's `do { ... } while(edx-- >= 1)` (note the post-decrement) – Jester Jul 23 '17 at 01:46
  • If still not sure... do `mov edx,0` => loop will be executed 1 time. `mov edx,1` => 2 times, etc... because it's `do { ... } while(...);` kind of loop, not `for(..) {}`, that would test `edx` first whether the loop should be executed at least the first time. – Ped7g Jul 23 '17 at 01:49
  • So when using a jump instruction, how do I know what it compares to what? – nanoman Jul 23 '17 at 01:49
  • It's "comparing" the original value before subtraction. Technically, the conditional jumps just look at flags, and @Ped7g has given you the example for that. – Jester Jul 23 '17 at 01:51
  • The jump doesn't compare anything with anything, it does check only the current state of flags, no matter how you achieved that state (there's plenty of instructions setting up flags without comparing or subtracting anything). But in case the last flag changing instruction was `cmp` or `sub`, it's "comparing" those arguments. In your case the `sub + jae` works as `do { eax = 1; } while (edx-- >= 1);` which is almost\* equivalent to `do { eax = 1; } while (--edx >= 0);` (\* but unsigned edx would make that infinite loop, as Jester points out). (I hope you understand C :) ) – Ped7g Jul 23 '17 at 01:51
  • Check also the x86 docs here, the conditional jumps should be quite complete and hopefully correct finally (after few rounds of fixes some time ago :) ): https://stackoverflow.com/documentation/x86/5808/control-flow/20470/conditional-jumps#t=201707230157386178477 – Ped7g Jul 23 '17 at 01:58

1 Answer

Based on the phrasing in the question, it seems that at least part of your confusion stems from not properly separating the compare instruction from the conditional-jump instruction. The CMP first sets the flags, and then the conditional-jump branches according to the state of the flags. There are many different instructions that set flags (virtually all of the arithmetic and bitwise instructions set flags; see the documentation for each instruction for details), and none of these do any branching. In order to branch based on the flags, you need a Jcc instruction (where cc is the condition code, indicating the flags that it will check, such as ae, which means "above-or-equal-to").

The reason I point this out is because you're saying things like:

So by basically saying CMP edx,1 aren't we saying "jump if first operand (EDX) is greater or equal to second operand (1)"?

which is probably intended to just be a shortcut to describing what actually happens, but still—it is an incorrect mental model and will inevitably lead to confusion. The CMP instruction never does any jumping. All it does is set flags. You're correct that it sets the flags exactly like a subtraction (SUB) would, but the flags don't do anything until you execute a Jcc instruction that reads them and branches accordingly.

Although you already understand them, we'll start with JE/JZ and JNE/JNZ, because they're the easiest conditions to understand. These just look at the zero flag (ZF), and branch according to its state. JE is precisely equivalent to JZ. There are just two different mnemonics that programmers are allowed to choose between, based on what they think will make their code clearer and easier to read. For example, when you do a CMP, it usually makes sense to follow with JE, because logically, you're jumping if the two values were equal. Technically, you're actually jumping if the result of the subtraction was 0, because CMP sets flags like SUB, so this is why it's 100% equivalent to write JZ, you just won't see programmers do this as often. Conversely, when you do something like TEST reg, reg, you'll often see that followed by JZ, because it's more semantic to think of it as jumping if the result of the last operation was zero. Adding in the "not" to the condition has the obvious effect.
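
As a rough illustration of the JE/JZ point, here is a small C model. It is not real CPU code, and the function names are made up for this sketch. It shows how ZF ends up after a CMP and after a TEST, and that JE and JZ check exactly the same thing:

```c
#include <stdbool.h>
#include <stdint.h>

/* ZF as left by "cmp a, b": set when a - b is zero, i.e. when a == b. */
bool zf_after_cmp(uint32_t a, uint32_t b) { return (uint32_t)(a - b) == 0; }

/* ZF as left by "test a, b": set when (a AND b) is zero. */
bool zf_after_test(uint32_t a, uint32_t b) { return (a & b) == 0; }

/* JE and JZ are two names for one instruction: branch when ZF is set. */
bool je_taken(bool zf) { return zf; }
bool jz_taken(bool zf) { return zf; }
```

Reading `je_taken(zf_after_cmp(x, y))` as "jump if x equals y" and `jz_taken(zf_after_test(x, x))` as "jump if x is zero" is the same mnemonic choice described above, just spelled in C.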

You can find a very helpful table of the conditional branching instructions here. I still find myself consulting this table or something much like it on a regular basis. As a beginner, the most useful thing there will be the textual description of the mnemonics. As a more advanced programmer, the most useful thing will become the mapping of the mnemonics to the actual flags that are checked. (The actual code bytes are quite handy sometimes, too.)

As you can see, JAE means "jump if above or equal", and that is determined by the status of the carry flag (CF). If carry is not set, then the branch will be taken. If carry is set, then execution will fall through. As the table also tells you, this is handy for unsigned comparisons. Why? Because that's what the carry flag is used for. I just wrote a lengthy answer explaining the carry and overflow flags here. It's a bit more detail than you need, but still contains relevant bits, like the definition of these flags.

You'll also see in that chart that there are multiple mnemonics for JAE, just like we saw with JE and JZ. The alternative mnemonics are JNB and JNC. The first one, JNB, is pretty obvious: it states the same condition as JAE in negated form. If a value is above or equal to another value, then it is also not below that value. JNC is just a more literal description of the flag the jump is based upon: the carry flag. Again, it doesn't technically matter which one you use, since all three assemble to the same opcode, but your code is often more readable if you pick the mnemonic that matches your intent.
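
If the table isn't handy, the most commonly used conditions can be written out as predicates on the flags. The following C sketch is only a study aid (the `Flags` struct and the function names are invented here), but the flag conditions are the ones documented for each Jcc:

```c
#include <stdbool.h>

/* Hypothetical model of the four status flags the common Jcc forms read. */
typedef struct { bool cf, zf, sf, of; } Flags;

/* Equality (same for signed and unsigned). */
bool je(Flags f)  { return f.zf; }                    /* JE  = JZ  */
bool jne(Flags f) { return !f.zf; }                   /* JNE = JNZ */

/* Unsigned comparisons: "above" / "below", based on CF (and ZF). */
bool ja(Flags f)  { return !f.cf && !f.zf; }          /* JA  = JNBE */
bool jae(Flags f) { return !f.cf; }                   /* JAE = JNB = JNC */
bool jb(Flags f)  { return f.cf; }                    /* JB  = JNAE = JC */
bool jbe(Flags f) { return f.cf || f.zf; }            /* JBE = JNA */

/* Signed comparisons: "greater" / "less", based on SF, OF (and ZF). */
bool jg(Flags f)  { return !f.zf && f.sf == f.of; }   /* JG  = JNLE */
bool jge(Flags f) { return f.sf == f.of; }            /* JGE = JNL */
bool jl(Flags f)  { return f.sf != f.of; }            /* JL  = JNGE */
bool jle(Flags f) { return f.zf || f.sf != f.of; }    /* JLE = JNG */
```

The naming pattern is the thing to remember: "above" and "below" are the unsigned conditions, "greater" and "less" are the signed ones.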

With that conceptual understanding, let's look at your code in more detail:

    mov edx, 15
.poke:
    mov eax, 1
    sub edx, 1
    jae .poke

(I didn't like your formatting, so I rewrote it slightly. :-p) Obviously, this sets EDX to 15, and then enters the loop. Inside of the loop, it subtracts 1 from EDX and sets flags. Then, the following JAE instruction looks at the state of the flags and branches back to .poke (continuing the loop) if and only if the carry flag (CF) is not set.

Another way of thinking about this is that the loop continues if and only if the value in EDX is above-or-equal-to 1. Symbolically, that is just: EDX >= 1. Except, of course, that this symbolic expression doesn't properly signify that we are doing an unsigned comparison. As I mentioned in the other answer I linked above, the CPU doesn't know or care if values are signed or unsigned. That's something for the programmer to interpret. You use the same exact SUB (or CMP) instruction to do both signed and unsigned subtraction (comparison). What changes is which flags you look at afterwards. The carry flag (CF) is what matters for unsigned subtraction/comparison; for signed subtraction/comparison, you look at the overflow flag (OF) together with the sign flag (SF).
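
To make the "same subtraction, different flags" idea concrete, here is a hedged C sketch that models the flags a 32-bit SUB or CMP would leave behind, and then reads them the unsigned way (JAE) and the signed way (JGE). The `Flags` struct and the helper names are hypothetical, chosen only for this illustration:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool cf, zf, sf, of; } Flags;

/* Model of the flags after "sub a, b" (or "cmp a, b") on 32-bit operands. */
Flags flags_after_sub(uint32_t a, uint32_t b) {
    uint32_t r = a - b;                 /* wraps modulo 2^32, like the CPU */
    Flags f;
    f.cf = a < b;                       /* borrow: unsigned a is below b */
    f.zf = (r == 0);
    f.sf = ((r >> 31) & 1u) != 0;       /* copy of the result's top bit */
    /* Signed overflow: a and b differ in sign, and the result's sign
       differs from a's sign. */
    f.of = ((((a ^ b) & (a ^ r)) >> 31) & 1u) != 0;
    return f;
}

bool unsigned_above_or_equal(Flags f) { return !f.cf; }        /* JAE */
bool signed_greater_or_equal(Flags f) { return f.sf == f.of; } /* JGE */
```

With a = 0xFFFFFFFF and b = 1, the unsigned reading says above-or-equal (4294967295 versus 1), while the signed reading says not greater-or-equal (-1 versus 1). The subtraction is identical in both cases; only the flags consulted differ.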

Let's walk through a couple of sample values of EDX to make sure we understand the logic.

The first time through the loop, when EDX is 15, the SUB instruction subtracts 1 from 15. The result is, of course, 14. Thus, the zero flag (ZF) is set to 0 (because the result is non-zero). The carry flag (CF) is set to 0 because there was no carry (no unsigned overflow). The overflow flag (OF) is set to 0 because there was no signed overflow. The sign flag (SF) is set to 0 because the result is non-negative (its sign bit, which is the most significant bit, was not set). Based on the status of CF, the JAE will branch back to .poke and continue the loop. Logically, you will keep looping because the value in EDX (15) was above-or-equal-to 1.

The same thing continues for some time. We'll let the loop spin, and then interrupt it when EDX is 1. Now, the SUB instruction subtracts 1 from 1. The result is 0. Thus, ZF is 1 (result is zero), OF is 0 (no signed overflow occurred), CF is 0 (no carry, i.e., no unsigned overflow), and SF is 0 (result is non-negative). So, will the branch be taken? Yes, CF is 0. Logically, 1 is above-or-equal-to 1 (it is equal to, of course).

Next time, EDX is 0, so 1 will be subtracted from 0. The result is −1, which is 0xFFFFFFFF as a 32-bit value. ZF is 0 (result is non-zero), OF is 0 (no signed overflow occurred), CF is 1 (a borrow occurred, i.e., unsigned overflow), and SF is 1 (the most significant bit of the result is set). The branch is not taken this time, because CF is 1. Logically, this makes sense, because 0 is not above-or-equal-to 1 (remember, this is an unsigned comparison).
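
The three cases just walked through can be reproduced with a small C model of one `sub edx, 1` followed by `jae`. This is a sketch for checking the reasoning, not a CPU emulator; the `Step` struct and `sub_one_then_jae` are names invented for it:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t edx;          /* value left in EDX by the SUB */
    bool zf, cf, of, sf;   /* flags left by the SUB */
    bool loops;            /* would JAE branch back to .poke? */
} Step;

/* Model of "sub edx, 1" followed by "jae .poke". */
Step sub_one_then_jae(uint32_t edx) {
    Step s;
    s.edx = edx - 1u;                 /* wraps to 0xFFFFFFFF when edx is 0 */
    s.zf = (s.edx == 0);
    s.cf = (edx < 1u);                /* borrow needed only when edx is 0 */
    s.of = (edx == 0x80000000u);      /* only INT_MIN - 1 overflows as signed */
    s.sf = (s.edx >> 31) != 0;        /* top bit of the result */
    s.loops = !s.cf;                  /* JAE looks at CF and nothing else */
    return s;
}
```

Calling it with 15, 1, and 0 gives the three flag combinations described above; only the last one has CF set, so only the last one falls out of the loop.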

This is why it looped 16 times in all. The loop body ran when EDX was 15, and it kept running for every value down through 0, which makes 16 passes. This is because your condition test was at the bottom of the loop. That is, in C notation:

do
{
    ...
}
while (edx-- >= 1);
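
If you want to check the count without a debugger, the same loop can be wrapped in a small C function. This is a minimal sketch; `count_pokes` and its `passes` counter are made up here, and the counter simply stands in for the loop body:

```c
#include <stdint.h>

/* Counts how many times the body of the do/while form of the loop runs
   for a given starting value of EDX. */
unsigned count_pokes(uint32_t start) {
    uint32_t edx = start;
    unsigned passes = 0;
    do {
        passes++;              /* stands in for "mov eax, 1" */
    } while (edx-- >= 1);      /* "sub edx, 1" then "jae .poke" */
    return passes;
}
```

Calling `count_pokes(15)` returns 16, and `count_pokes(0)` returns 1, because the body always runs once before the condition is ever tested.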
Cody Gray - on strike