1

So to teach myself loops and conditionals, I was trying to implement the Insertion Sort algorithm from Wikipedia:

i ← 1
while i < length(A)
    j ← i
    while j > 0 and A[j-1] > A[j]
        swap A[j] and A[j-1]
        j ← j - 1
    end while
    i ← i + 1
end while

I think I have translated the code to assembly well, yet it does not function. I think the reason for the problem probably is with indexing (specifically the part A[j or j - something]), and yet I am unable to fix it. Here is my code:

.386 
.model flat,stdcall
.stack 4096

ExitProcess proto,dwExitCode:dword

.data
arr DWORD 4,5,1,7


.code
main proc
mov esi, OFFSET arr
mov eax, 1 ; eax is i, i <- 1

L2:

    mov ebx, eax ; ebx is j, j <- i

    ; while j > 0
    INNERWHILE:
    cmp ebx, 0
    jle L3  

    ; and A[j-1] > A[j]
    mov ecx, [esi + 4*ebx] ; A[j]
    sub esi, 4
    cmp [esi + 4*ebx], ecx ;A[j-1] > A[j]
    jle L3

    ; swap A[j] and A[j-1]
    mov edx, [esi + 4*ebx] ; edx = A[j-1]
    add esi, 4
    mov ecx, [esi + 4*ebx] ; ecx = A[j]

    ; Begin Swap
    mov [esi + 4*ebx], edx ; A[j-1] <- A[j]
    sub esi, 4
    mov [esi + 4*ebx], ecx  ; A[j] <-  A[j-1]
    dec ebx
    jmp INNERWHILE
L3:
inc eax
cmp eax, LENGTHOF arr
jl L2
invoke ExitProcess,0
main ENDP
END main

Any help is appreciated, my level is a beginner in x86 so any tips are welcome.


This is the point where I am confused: High level language assume array starts from 0 and moves on. However A[j] != [esi] as j may not be = 0, therefore, I am totally stuck in the index based addressing part.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
Jishan
  • 1,654
  • 4
  • 28
  • 62
  • `sbb esi, 4` doesn't make sense. You don't want to subtract an extra 1 if CF was set. You probably mean `sub`, not subtract-with-borrow? Also, why modify the array base as well as using indexed addressing modes? – Peter Cordes Jul 06 '18 at 00:17
  • @PeterCordes I meant `SUB` sorry. This is the point where I am confused. High level language assumes array starts from 0 and moves on. However A[j] != [esi] as j may not be = 0, therefore, I am totally stuck in the index based addressing part. – Jishan Jul 06 '18 at 00:24

1 Answers1

1

It looks like you're making things difficult for yourself trying to use esi as A+i and ebx as j.

The easy thing it to use three different registers total for A, i, and j, instead of trying to optimize away A+i into a single register.

So A[i] is [esi + edi*4] and A[j] is [esi + ebx*4].

That would be the direct translation of that pseudo-code into asm.


There are optimizations you could do later, once you get that working, like maybe optimize A+i and A+j into registers so you can use [reg] addressing modes instead of [reg+idx*4] addressing modes.

And you can maybe not keep A in a register at all, and only use it as a memory source operand for j>0, instead doing cmp edx, OFFSET arr, or cmp edx, [esp+0] if you pretend the base address isn't a compile time constant and push it on the stack.

And then j = i becomes a mov edx, esi.

You might want to translate the pseudocode to C and see what optimizing compilers do. (Write a sort function that takes a pointer as a function arg, so the compiler can't do constant propagation of a constant array and just emit code that stores the constant sorted result :P) http://gcc.godbolt.org/ is handy for that, and see also How to remove "noise" from GCC/clang assembly output?.

Community
  • 1
  • 1
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Wow, thanks a ton, I already reversed all my previous forays as I really want to learn reverse engineering, hence the x86 self-teaching. – Jishan Jul 06 '18 at 00:40
  • @Jeet.Deir: I updated your question to be more specific and explain exactly where you were getting stuck (and thus a better fit for Stack Overflow). If this answer fully answers your question, you can "accept" it by clicking the checkmark to let everyone know it's solved. – Peter Cordes Jul 06 '18 at 00:47
  • The question edit is prudent! Just checked the answer! Thanks again! – Jishan Jul 06 '18 at 00:52