Keeping track of two different array indices in insertion sort with pointer increments vs. indexed addressing modes?

Question

So to teach myself loops and conditionals, I was trying to implement the Insertion Sort algorithm from Wikipedia:

i ← 1
while i < length(A)
    j ← i
    while j > 0 and A[j-1] > A[j]
        swap A[j] and A[j-1]
        j ← j - 1
    end while
    i ← i + 1
end while

I think I have translated the code to assembly well, yet it does not function. I think the reason for the problem probably is with indexing (specifically the part A[j or j - something]), and yet I am unable to fix it. Here is my code:

.386 
.model flat,stdcall
.stack 4096

ExitProcess proto,dwExitCode:dword

.data
arr DWORD 4,5,1,7


.code
main proc
mov esi, OFFSET arr
mov eax, 1 ; eax is i, i <- 1

L2:

    mov ebx, eax ; ebx is j, j <- i

    ; while j > 0
    INNERWHILE:
    cmp ebx, 0
    jle L3  

    ; and A[j-1] > A[j]
    mov ecx, [esi + 4*ebx] ; A[j]
    sub esi, 4
    cmp [esi + 4*ebx], ecx ;A[j-1] > A[j]
    jle L3

    ; swap A[j] and A[j-1]
    mov edx, [esi + 4*ebx] ; edx = A[j-1]
    add esi, 4
    mov ecx, [esi + 4*ebx] ; ecx = A[j]

    ; Begin Swap
    mov [esi + 4*ebx], edx ; A[j-1] <- A[j]
    sub esi, 4
    mov [esi + 4*ebx], ecx  ; A[j] <-  A[j-1]
    dec ebx
    jmp INNERWHILE
L3:
inc eax
cmp eax, LENGTHOF arr
jl L2
invoke ExitProcess,0
main ENDP
END main

Any help is appreciated, my level is a beginner in x86 so any tips are welcome.

This is the point where I am confused: High level language assume array starts from 0 and moves on. However A[j] != [esi] as j may not be = 0, therefore, I am totally stuck in the index based addressing part.

`sbb esi, 4` doesn't make sense. You don't want to subtract an extra 1 if CF was set. You probably mean `sub`, not subtract-with-borrow? Also, why modify the array base as well as using indexed addressing modes? — Peter Cordes, Jul 06 '18 at 00:17
@PeterCordes I meant `SUB` sorry. This is the point where I am confused. High level language assumes array starts from 0 and moves on. However A[j] != [esi] as j may not be = 0, therefore, I am totally stuck in the index based addressing part. — Jishan, Jul 06 '18 at 00:24

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

It looks like you're making things difficult for yourself trying to use esi as A+i and ebx as j.

The easy thing it to use three different registers total for A, i, and j, instead of trying to optimize away A+i into a single register.

So A[i] is [esi + edi*4] and A[j] is [esi + ebx*4].

That would be the direct translation of that pseudo-code into asm.

There are optimizations you could do later, once you get that working, like maybe optimize A+i and A+j into registers so you can use [reg] addressing modes instead of [reg+idx*4] addressing modes.

And you can maybe not keep A in a register at all, and only use it as a memory source operand for j>0, instead doing cmp edx, OFFSET arr, or cmp edx, [esp+0] if you pretend the base address isn't a compile time constant and push it on the stack.

And then j = i becomes a mov edx, esi.

You might want to translate the pseudocode to C and see what optimizing compilers do. (Write a sort function that takes a pointer as a function arg, so the compiler can't do constant propagation of a constant array and just emit code that stores the constant sorted result :P) http://gcc.godbolt.org/ is handy for that, and see also How to remove "noise" from GCC/clang assembly output?.

Wow, thanks a ton, I already reversed all my previous forays as I really want to learn reverse engineering, hence the x86 self-teaching. — Jishan, Jul 06 '18 at 00:40
@Jeet.Deir: I updated your question to be more specific and explain exactly where you were getting stuck (and thus a better fit for Stack Overflow). If this answer fully answers your question, you can "accept" it by clicking the checkmark to let everyone know it's solved. — Peter Cordes, Jul 06 '18 at 00:47
The question edit is prudent! Just checked the answer! Thanks again! — Jishan, Jul 06 '18 at 00:52

Keeping track of two different array indices in insertion sort with pointer increments vs. indexed addressing modes?

1 Answers1