2

One of the challenge questions in programming from the ground up is "to modify the program to use an ending address rather than the number 0 to know when to stop."

I am finding it difficult to do this since up to this point the book has only introduced movl, cmpl, incl (along with the addressing modes) and jmp instructions. Basically everything in the code snippet below is what has been introduced so far. All solutions I have found involve instructions not yet introduced in this book. The code below finds the maximum value from the set.

.section .data
data_items:             #These are the data items
.long 3,67,34,222,45,75,54,34,44,33,22,11,66,0

.section .text
.globl _start
_start:
    movl $0, %edi                   # move 0 into the index register
    movl data_items(,%edi,4), %eax  # load the first byte of data
    movl %eax, %ebx                 # since this is the first item, %eax is
                                    # the biggest
start_loop:                     # start loop
    cmpl $0, %eax                   # check to see if we’ve hit the end
    je loop_exit
    incl %edi                       # load next value
    movl data_items(,%edi,4), %eax
    cmpl %ebx, %eax                 # compare values
    jle start_loop                  # jump to loop beginning if the new
                                    # one isn’t bigger
    movl %eax, %ebx                 # move the value as the largest
    jmp start_loop                  # jump to loop beginning
loop_exit:
    # %ebx is the status code for the exit system call
    # and it already has the maximum number
    movl $1, %eax   #1 is the exit() syscall
    int $0x80

Note this question is distinctly different from the subsequent question which asks to modify the program to use a length count rather than the number 0. To me it seems like the address of the last number in the array should be stored in a register and then compared to the address of the pointer. I can't figure out a way to do this that fits the progression of this book since the book has only introduced the bare bones thus far.

Sep Roland
  • 33,889
  • 7
  • 43
  • 76
das420s
  • 47
  • 4
  • Has it not introduced LEA, which would make it easy to compute a pointer to the one-past-end element? [Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926) uses examples where the loop condition is `while(ptr < end_ptr)`. – Peter Cordes Jul 21 '21 at 04:24
  • Would be a duplicate of [Using ending address to stop a loop](https://stackoverflow.com/q/61506812) (which at least has a decently-formatted version of the source, and states that it's from the PGU book). Except that Q&A doesn't have a good answer, just a hard-coded length done in a buggy way. – Peter Cordes Jul 21 '21 at 04:34
  • he has not introduced lea into the text yet most of the solutions i found before posting this question involved that. the subsequent question is "modify the program to use a length count rather than the number 0 to know when to stop". that seems more similar to the link you provided. – das420s Jul 21 '21 at 04:36
  • 1
    Probably he wants you to stick another label at the end of the array, and `cmp $data_end, %edi` / `jne .loop` as the loop condition. (With EDI as a pointer instead of index.) – Peter Cordes Jul 21 '21 at 04:38
  • okay that second link seems like it answers my question at first glance thank you – das420s Jul 21 '21 at 04:38
  • It's *an* answer, but it's not a good answer (buggy, see my comments there) and doesn't use a pointer, so like I said, no, don't do that. Otherwise I would have upvoted that answer and closed this as a duplicate. What you could do is copy the code-formatting (with indenting) from that question so yours is more readable. – Peter Cordes Jul 21 '21 at 04:39
  • [Assembly Language (x86): How to create a loop to calculate Fibonacci sequence](https://stackoverflow.com/q/32659715) shows loops using an end-pointer, and calculating an end-pointer with LEA. Still looking for one using a label at the end. [Understanding GCC inline assembly code that uses memory source operands and x87 fcomi / fcmov](https://stackoverflow.com/q/57134540) shows AT&T-syntax (what you're using) compiler output for a loop like that, but not the setup. – Peter Cordes Jul 21 '21 at 04:43
  • you are right it is buggy, but it seems like it is not really using the ending address. in principle is this any different from doing something like ```cmpl %edi, $12``` – das420s Jul 21 '21 at 04:44
  • Related: [How to check an "array's length" in Assembly Language (ASM),](https://stackoverflow.com/a/52866028) for assemble-time calculation of array size. – Peter Cordes Jul 21 '21 at 04:49
  • Also related: [Copying to arrays in NASM](https://stackoverflow.com/a/56411779) shows using labels and sizes, with loops using an end-pointer. – Peter Cordes Jul 21 '21 at 05:01
  • your edit is wrong this code finds the maximum value the code that i posted originally found the minimum value. – das420s Jul 21 '21 at 05:21
  • it is also worth it to mention the differences between this question and the follow up question. – das420s Jul 21 '21 at 05:33
  • The first revision of your question (https://stackoverflow.com/revisions/68463932/1) had the same condition for updating %ebx, and calls it the biggest. Furthermore, it *is* finding the max, isn't it? It skips over `mov %eax, %ebx` if eax <= ebx. (Remember that AT&T syntax has the operands backwards, so the semantic meaning of conditions like `jle` is also backwards vs. how they were designed for Intel syntax, where `cmp x,y` / `jle` would be taken if x <= y. – Peter Cordes Jul 21 '21 at 05:41
  • Also, I tested just to make sure I didn't overlook something (`gcc -m32 -nostdlib -static bar.S` ; `./a.out ; echo $?`. Both versions exit with status `222`, so the comments are correct in both the first and the current versions. IDK if there was possibly another version in the middle that did min that you were talking about. – Peter Cordes Jul 21 '21 at 05:48
  • one of the exercises in the book involved changing the code to compute a minimum value i had to move the ```cmp $0, %eax``` lower before changing the jle to a jge. I thought that was the original code that i had posted but it looks like i just copy pasted straight from the book and said it finds the "minimum". your right of course, both codes find the maximum. i think when you reverted it it went back to minimum. – das420s Jul 21 '21 at 06:00
  • Yeah, I accidentally reverted your text changes with one of my edits; I think I fixed that with my last edit. – Peter Cordes Jul 21 '21 at 06:25

1 Answers1

5

You can do this with only mov and cmp, no lea required to calculate an end-pointer. (You don't have a length anywhere anyway to use with LEA).

You should add a new label at the end of the array, so you can refer to that position in memory (aka address). And remove the terminating 0 from the array because we're using addresses instead of a sentinel value.

.section .data
data_items:
  .long 3,67,34,222,45,75,54,34,44,33,22,11,66     # ,0   remove the sentinel / terminator
data_items_end:                                  # and add this new label

You don't need that address in a register; you can use cmp $data_items_end, %reg to use it as an immediate, with the linker filling in the right bytes into the machine code just like it does for your mov data_items(,%edi,4), %eax. (cmp symbol, %reg would compare with memory at that address. $symbol is the address as an immediate, in AT&T syntax.)

What you do need in a register is the start address, so you can increment and deref it. (For a function takes a pointer+length, you could compute the end address in a register.)

_start:
    mov  $data_items, %edi       # int *ptr = &data_items[0]
    mov  (%edi), %ebx            # current max
   # setting %eax is unnecessary here, it's always written before being read in this and the original version
loop_start:
    add  $4, %edi                # ptr++  (4 byte elements)
    cmp  $data_items_end, %edi
    je   loop_exit               # if (ptr == endp) break
    ...                  # compare with (%edi) and update %ebx if greater.
    jmp  loop_start
  ...

More efficient would be a do{}while loop structure like compilers use, especially since you know the array contains more than 1 element so you don't need to check for the case where the loop body should run 0 times. Notice that there's no unconditional jmp that has to execute every time, in addition to a cmp/jcc.

_start:
    mov  $data_items, %edi       # int *ptr = &data_items[0]
    mov  (%edi), %ebx            # current max

loop_start:                    # do{
    add  $4, %edi                # ptr++;  (4 byte elements)
  ## maybe update max:
    mov  (%edi), %eax            # tmp = *ptr;
    cmp  %ebx, %eax
    cmovg %eax, %ebx             # max = (tmp > max) ? tmp : max;
  ## end of loop body

    cmp  $data_items_end, %edi
    jne  loop_start            # }while(ptr != endp)
## end of loop, but nothing jumps here so no label is needed.

    mov  $1, %eax
    int  $0x80             # SYS_exit(%ebx)

I used cmp/cmovg (conditional move) instead of branching just because it's fewer instructions to type and no branching within the loop, making it easier to see the loop structure.


Other examples of looping and pointers:

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847