
I am writing an assembly loop to get the max number in an array. It loops like this:

start_loop:

    # Get the current element in the array and move it to %rax
    # movz --> (1) b(yte-1), w(ord-2), l(long-4), q(uad-8)
    movzwq data_items(,%rdi,2), %rax

    # Check if the current element value is zero, if it is, jump to the end
    cmp $0, %rax
    jz exit

    # Increment the array index as we want to continue the loop at the end
    inc %rdi

    # Compare the current value (rax) to the current max (rbx)
    # WARNING: The `cmp` instruction is always backwards with ATT syntax!
    # It reads as, "With respect to %rbx, the value of %rax is...(greater|less) than"
    # So to see if a > b, do:
    #   cmp b, a
    #   jg
    # Reference: https://stackoverflow.com/a/26191257/12283168
    cmp %rbx, %rax
    jge update_value

    jmp start_loop


update_value:
    mov %rax, %rbx
    jmp start_loop


exit:
    mov $1, %rax           
    int $0x80

My question is about this part of the comparison code:

    jge update_value
    jmp start_loop

update_value:
    mov %rax, %rbx
    jmp start_loop # <== can I get rid of this part?

Is there a way to not have to specify the jmp start_loop in the update_value section? For example, in a high level language I could do:

while (1) {
    if (a > b)
        update_value();
    // continue
}

And I would not have to "jump back" to the while from the update_value function; I could just 'continue'. Is it possible to do something like this in assembly, or am I thinking about this incorrectly?

samuelbrody1249
  • What you _can_ do is replace `jge update_value` `jmp start_loop` with `jl start_loop`. – Michael Aug 28 '20 at 06:58
  • Yeah, use `jl start_loop` like a normal person to implement the `else continue`. "continue" is equivalent to jumping to the top of the loop. In general, never `jcc` over a `jmp` alone, just write the opposite condition to either jump where you want or fall through (e.g. out of a loop). Asm doesn't have scopes or structure, it's up to you to implement those concepts. See also [Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926) – Peter Cordes Aug 28 '20 at 06:58
  • @PeterCordes can you please clarify what you mean by: "In general, never `jcc` over a `jmp` alone" ? – samuelbrody1249 Aug 28 '20 at 07:01
  • Look at `clang -O2` or `gcc -O2` output for examples of how to lay out loop / branch structure. Sometimes it's more complicated than a skilled human would make it, but you can learn a lot by understanding how their choices work. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) – Peter Cordes Aug 28 '20 at 07:01
  • @PeterCordes it seems like they use `call` a lot too rather than `jmp`. – samuelbrody1249 Aug 28 '20 at 07:04
  • @samuelbrody1249: JCC is any conditional branch (https://www.felixcloutier.com/x86/jcc). Never lay out your code with `jcc x` / `jmp somewhere` / `x:` so the only instruction being jumped over is a JMP. When execution enters these 2 instructions, it always either jumps somewhere or comes out the other side. But you can do the same thing with `jcc somewhere` with the opposite condition. Intel even has synonyms for negative conditions, like `jnge start_loop` is the same instruction as `jl start_loop`. – Peter Cordes Aug 28 '20 at 07:04
  • @samuelbrody1249: Yes, of course compilers use `call` for functions, if they don't inline them. Enable full optimization to get inlining. `jcc` over a `call` is the only good way to do a conditional *call*, but if you don't want call/ret and instead just jump around between labels without leaving breadcrumbs (return addresses) on the stack, that code is all part of one big function. – Peter Cordes Aug 28 '20 at 07:08
  • (Making optimized tailcalls is also done with jmp or jcc, but a single loop and some ifs would normally be considered one function. Of course asm makes it possible to write true spaghetti code if you don't have some high-level structure in mind as you write, where the whole program is a mess of interlocking blocks.) – Peter Cordes Aug 28 '20 at 07:09
  • @PeterCordes so a better pattern would be to do: `cmp %rbx, %rax; jl start_loop; mov %rax, %rbx; jmp start_loop` ? – samuelbrody1249 Aug 28 '20 at 07:17
  • @Michael what about after the `jl` though? – samuelbrody1249 Aug 28 '20 at 07:18
  • Yes, except that the opposite of `jge` is `jl`. However, to implement your C logic of `if(a>b) update()`, you should have used `jg` over jmp (bad) or `jle` to the top (good). Oh, I just noticed you're writing `update_value()` *as a function* in C. Not as `max_seen = a` – Peter Cordes Aug 28 '20 at 07:18
  • Note that a label doesn't hold anything. It's just an annotation marking a location in your program. You cannot jump “into and out of a label,” because there is no “inside a label.” Make sure you don't have any misconceptions here! – fuz Aug 28 '20 at 08:16
  • Since your `update_value` is just a `mov`, you could replace everything between the `cmp` and the final `jmp start_loop` with a `cmovge`. But whether that would be more efficient depends on what your array contains (which in turn determines how well your conditional branch can be predicted). – Michael Aug 28 '20 at 08:34
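
Putting the comments together, the loop could be laid out roughly as below. This is only a sketch under assumptions: `data_items`, its 16-bit elements, the zero terminator, and the exit sequence come from the question, while the sample values, the `_start` entry point, and the explicit zeroing of `%rdi` and `%rbx` are added here for illustration. `jl` is used because it is the exact opposite of the question's `jge`; `jle` would match the strict `a > b` in the C version.

```asm
# Sketch: max of a zero-terminated array of 16-bit values.
# The sample data below is hypothetical, not taken from the question.
    .section .data
data_items:
    .word 3, 67, 34, 222, 45, 75, 54, 34, 44, 33, 22, 11, 66, 0

    .section .text
    .globl _start
_start:
    xor %edi, %edi                      # array index = 0 (writes to %edi zero-extend into %rdi)
    xor %ebx, %ebx                      # current max = 0

start_loop:
    # Load the current 16-bit element, zero-extended to 64 bits
    movzwq data_items(,%rdi,2), %rax

    # A zero element marks the end of the array
    test %rax, %rax
    jz exit

    inc %rdi

    # Flags reflect %rax - %rbx, i.e. current element vs. current max
    cmp %rbx, %rax
    jl start_loop                       # element < max: this is the "continue"

    # Fall through: element >= max, so record it. No update_value label needed.
    mov %rax, %rbx
    jmp start_loop

    # Branchless variant from the cmovge comment: replace the jl/mov pair
    # above with a single `cmovge %rax, %rbx` before the final jmp.

exit:
    # Exit sequence kept from the question: 32-bit int $0x80 ABI,
    # syscall 1 (exit) with the status taken from %ebx (the max).
    mov $1, %rax
    int $0x80
```

On a typical x86-64 Linux system with 32-bit syscall emulation enabled, this can be assembled and linked with `as max.s -o max.o && ld max.o -o max` (the file name is arbitrary); running `./max; echo $?` then shows the maximum as the exit status, which is 222 for the sample data.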

0 Answers