4

I've created a procedure to selection sort a word vector but there's a problem: the sorting is totally wrong.

My vector: VET_2 DW 2, 7, 0, 1, 4, 8, 9, 3, 6, 5

; Selection Sort
SELECTION_SORT PROC
    ; AX = j & aux      CX = i
    ; BX = offset/min   DX = data and others
    PUSH 0                              ; to initialize i
    MOV SI, [OFFSET VET_2]
    ; ----- start for(int i = 0; i < n-1; i++) -----
    SLC_LOOP_FORA:                      ; outer loop
        CALL RESET_REGIST               ; reset some AX, BX, CX & DX
        CALL RESET_VAR                  ; used to reset AUX

        POP CX                          ; initialize i
        CMP CX, 18                      ; check if it's smaller than n-1 (20-2=18)
        JGE SLC_FIM                     ; if bigger, goes to the end 

        MOV BX, CX                      ; offset receive i, the position of the smaller
        ; ----- start j = i+1 -----
        MOV AX, CX                      ; AX = j.
        ADD AX, 2                       ; j = i+1
        ; ----- end j = i+1 -----

        ; ----- start for(int j = i+1; j < n; j++) -----
        SLC_LOOP_DENTRO:                ; inner loop
            MOV DX, [SI+BX]             ; move the smaller element to DX
            MOV BX, AX                  ; offset receives j

            CMP DX, [SI+BX]             ; compare if VET_2[min]<=VET_2[j]
            JL SLC_DENTRO_PULAR         ; if lesser, ignore the code below

            MOV BX, AX                  ; offset receive j, position of the smaller element

            SLC_DENTRO_PULAR:
                ADD AX, 2               ; inc 2 in j
                CMP AX, 20              ; compare j (ax) with n
            JL SLC_LOOP_DENTRO          ; if it's smaller, repeat inner loop
        ; ----- end for(int j = n+1; j < n; j++) -----

        CMP CX, BX                      ; compare i with the position of the smaller element
        JE SLC_LOOP_FORA                ; if equals, repeat outer loop, otherwise do the swap

        PUSH BX                         ; position of the smaller element
        PUSH [SI+BX]                    ; put vet[min] top of the stack

        ; ----- start aux = vet[i] -----
        MOV BX, CX                      ; offset (BX) receives i
        MOV DX, [SI+BX]                 ; DX receives vet_2[i]
        MOV AUX, DX                     ; AUX receives DX
        ; ----- end aux = vet[i] -----

        ; ----- start vet[i] = vet[min] -----
        POP AX                          ; AX receives the top of the stack (vet_2[min])
        MOV [SI+BX], AX                 ; vet_2[i] receives DX (smaller element)
        ; ----- end vet[i] = vet[min] -----

        ; ----- start vet[min] = aux -----
        POP BX                          ; offset (BX) receives the position of the smaller element from the stack
        MOV DX, AUX                     ; DX receives AUX
        MOV [SI+BX], DX                 ; vet_2[min] receives DX
        ; ----- end vet[min] = aux -----
        ADD CX, 2                       ; INC 2 on i
        PUSH CX                         ; put in the stack
        JMP SLC_LOOP_FORA               repeat outer loop
    ; ----- end for(int i = 0; i < n-1; i++) -----
    SLC_FIM:                            ; end the procedure
        RET
SELECTION_SORT ENDP

Before call selection sort procedure: 2 7 0 1 4 8 9 3 6 5

After call selection sort procedure: 5 2 7 0 1 4 8 9 3 6

Where is the error? Can somebody help me?

Fifoernik
  • 9,779
  • 1
  • 21
  • 27
  • 2
    `POP CX` inside a loop looks suspicious to me, because you `push 0` outside the loop, but there's a `JE SLC_LOOP_FORA` that can repeat that without pushing anything. So you're going to consume some stack space. `xor cx,cx` or `mov cx,0` would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like `[bp-4]` after making a stack frame. (You can't use `[SP+4]` or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this. – Peter Cordes Nov 15 '18 at 16:54
  • 1
    Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong. – Peter Cordes Nov 15 '18 at 17:28

3 Answers3

2

Problem

When you don't need to swap 2 elements, you still need to raise the lower bound in CX. The cmp cx, bx je SLC_LOOP_FORA neglects this. Moreover it leaves you with an unbalanced stack (popping more than was pushed).

Solution:

The problem (included the stack un-balance) is easily corrected by introducing an extra label:

    PUSH 0                 ; to initialize i
    MOV SI, OFFSET VET_2
SLC_LOOP_FORA:             ; outer loop

    ...

    CMP CX, BX             ; compare i with the position of the smaller element
    JE NO_SWAP             ; if equals, repeat outer loop, otherwise do the swap

    ...

NO_SWAP:
    ADD CX, 2              ; INC 2 on i
    PUSH CX                ; put in the stack
    JMP SLC_LOOP_FORA      ; repeat outer loop

Consider

; AX = j & aux      CX = i
; BX = offset/min   DX = data and others
PUSH 0                              ; to initialize i
MOV SI, [OFFSET VET_2]

If you were willing to re-assign registers, you could enormously simplify this program.

With register assignments like

; AX = j & aux      DI = i
; SI = offset/min   DX = data and others
PUSH 0                              ; to initialize i
MOV BX, OFFSET VET_2

the swap code e.g. would become just these 3 instructions:

mov  ax, [bx+si]
xchg ax, [bx+di]
mov  [bx+si], ax
Sep Roland
  • 33,889
  • 7
  • 43
  • 76
1
MOV SI, [OFFSET VET_2]

I've never seen an assembler that would do it this way! The above sets the SI register equal to the contents of the first array element, so SI=2. Not really useful.

Acceptable instructions to get the address of your vector are:

mov si, offset VET_2

or

lea si, [VET_2]

or

lea si, VET_2
Fifoernik
  • 9,779
  • 1
  • 21
  • 27
  • 2
    According to Ross' answer on [Confusing brackets in MASM32](https://stackoverflow.com/a/25130189) `mov si, [OFFSET VET_2]` is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you use `ds:[stuff]`. (Seems insane and terrible, and I wouldn't *recommend* ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.) – Peter Cordes Nov 17 '18 at 14:58
  • +1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea... – Sep Roland Nov 25 '18 at 22:04
-3
selection_sort_i64:
    ; /*
    ; Input:
    ; RDI = long long * array
    ; RSI = length
    ;
    ; Pseudo-C code: 
    ;
    ; for (int i = 0; i < n - 1; ++i) {
    ;    min = i
    ;    for (int j = i + 1; j < n; ++j) {
    ;       if a[j] < a[min]; min = j
    ;   swap(i, min)
    ;*/

    I64_SIZE equ 8
    LG_I64_SIZE equ 3

    cmp rsi, 1
    jle .end            ; Just end it if length <= 1
    xchg rsi, rdi       ; rsi => left pointer
    mov rcx, rdi        ; rcx => length

    ; RDX will be the boundary for i: RSI + (N-1)*sizeof(int64)
    lea rdx, [rsi + rcx * LL_SIZE - LL_SIZE]

    ; /*
    ; Let's explain what's happening here.
    ; RSI will be &a[i], RDX will be it's right boundary
    ; RDI will be &a[j] (&a[i + 1]) and will loop n-i times
    ; RCX will be n-i and will be the counter for the inner loop
    ; RAX will track our minimum in the remaining of the array
    ; RBX will 
    ; */
.outer_loop:
    cmp rsi, rdx
    jge .end            ; while (i < n - 1) {
    mov rax, [rsi]      ;   min = a[i]
    mov rbx, rsi        ;   min_i = i   
    push rax

    mov rdi, rsi
    add rdi, I64_SIZE   ;   j = i + 1 // rdi => right pointer
    dec rcx             ;   // rcx = n - i, inner loop counter
    push rcx
.inner_loop:
        cmp rax, [rdi]              ; if (min > a[j])
        jle .min_less_or_equal
        mov rbx, rdi                ;   min_i = j
        mov rax, [rdi]              ;   min = a[j]
.min_less_or_equal:
        add rdi, I64_SIZE           ; j += 1
        loop .inner_loop
    pop rcx             ; // restore inner loop counter
    pop rax             ; // restore a[i]

    cmp rsi, rbx        ; // swap if minimum found
    jz .no_min          ;   if (i != min_i) 
    mov rdi, [rbx]      ;       temp = a[min]
    mov [rbx], rax      ;       a[min] = a[i]
    mov [rsi], rdi      ;       a[i] = temp
.no_min:
    add rsi, I64_SIZE   ;   i += 1
    jmp .outer_loop     ;   } // end outer loop
.end:
    ret

Hope this works. Thanks.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
Atiq Israk
  • 29
  • 8
  • 2
    The OP is looking for help debugging *their* code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes and `I64_SIZE`. – Peter Cordes Nov 16 '18 at 15:25
  • However, in x86-64 you *should* be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. The `loop` instruction is slow; only use it if optimizing for code-size over speed. [Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?](/q/35742570) – Peter Cordes Nov 16 '18 at 15:28