0

I'm new to C and Assembly programming, and I want to implement merge sort algorithm in C. A section of my C code should call Assembly function which will perform the merge action and return a sorted array to my C program. I have been able to write my entire merge sort algorithm in C and wanted to convert merge action to Assembly code but I don't know how to write it. Most importantly, how to link my Assembly file (.asm) to my C file (.c) so that when I run my C program it will execute the Assembly code.


My C code with Merge sort algorithm:

#include <stdio.h>

extern int * merge(int *left, int *right, unsigned long nLeft, unsigned long nRight, int *array);
//void merge(int const left[], int const right[], long nLeft, long nRight, int array[]) {
//    int i = 0, j = 0, k = 0;
//    while (i < nLeft && j < nRight) {
//        if (left[i] <= right[j]) {
//            array[k] = left[i];
//            i++;
//        } else {
//            array[k] = right[j];
//            j++;
//        }
//        k++;
//    }
//    while (i < nLeft) {
//        array[k] = left[i];
//        i++;
//        k++;
//    }
//    while (j < nRight) {
//        array[k] = right[j];
//        j++;
//        k++;
//    }
//}

void printArray(int arr[], int size) {
    for (int i = 0; i < size; ++i) {
        printf("%d ", arr[i]);
    }
}

void mergeSort(int arr[], unsigned long arrSize) {
    if (arrSize < 2) {
        return;
    }

    int* arrayPointer;

    unsigned long mid = arrSize / 2;
    int left[mid];
    int right[arrSize - mid];
    for (int i = 0; i < mid; i++)
    left[i] = arr[i];

    for (int i = (int) mid; i < arrSize; i++)
        right[i - mid] = arr[i];

    mergeSort(left, sizeof(left) / sizeof(int));
    mergeSort(right, sizeof(right) / sizeof(int));
    arrayPointer = merge(left, right, sizeof(left) / sizeof(int), sizeof(right) / sizeof(int), arr);
    printf("Pointer %p\n", arrayPointer);
}

int main() {
    int array[] = {3, 8, 0, 2, 1, 7, 6, 4, 9, 5};
    printf("Before merge sort: \n");
    printArray(array, sizeof(array) / sizeof(int));
    mergeSort(array, sizeof(array) / sizeof(int));
    printf("\nAfter merge sort: \n");
    printArray(array, sizeof(array) / sizeof(int));
    return 0;
}

the commented part in the program above is what I want to implement in Assembly language (another file).

My Assembly code (in SystemV x86-64 ABI instruction):

; this is the merge function in assembly
section .text
  global merge

merge:
.both:  cmp i, rdx
        jns .and
.and:   cmp j, rcx
        jns .checkif
.checkif: cmp [rdi + i * 4], [rsi + j * 4] //error
        jns .doif
        js .doelse
.doif:  mov [r8 + k * 4], [rdi + i * 4]//error
        inc i
        inc k
        jmp .both
.doelse: mov [r8 + k * 4], [rsi + j * 4]//error
        inc j
        inc k
        jmp .both

.doleft: cmp i, rdx
        jns .doleftwhile
.doleftwhile: mov [r8 + k * 4], [rdi + i * 4]//error
        inc i
        inc k
        jmp .doleft

.doright: cmp j, rcx,
        jns .dorightwhile
.dorightwhile: mov [r8 + k * 4], [rsi + j * 4] // error
        inc j
        inc k
        jmp .doright

        mov rax, [rdx + rcx]
        ret

section .data
i dd 0
j dd 0
k dd 0

I updated the assembly code and I'm getting the error below:

asm.s:10: error: invalid effective address: impossible segment base multiplier
asm.s:13: error: invalid effective address: impossible segment base multiplier
asm.s:17: error: invalid effective address: impossible segment base multiplier
asm.s:24: error: invalid effective address: impossible segment base multiplier
asm.s:31: error: invalid effective address: impossible segment base multiplier
make: *** [Makefile:25: asm.o] Error 1

Please how do I save each result (element) into a specific index from my code?
PS: sorry for the multiple jumps in the code.

Bille Ibinabo
  • 73
  • 2
  • 9
  • Assemble your assembly code (e.g. `nasm -felf64/win64 merge.asm -o merge.o/obj`) and include the object file in the compilation (e.g. `gcc ... merge.o ...`). You already made `merge` global so it's visible to other compilation units and you already declared `merge` in the C code, so... it's done! You only have to implement it now ;) – Margaret Bloom Nov 05 '22 at 13:14
  • Stack Overflow isn't great for "help me get started" questions. You'll need to make some substantial progress with writing the assembly code; then you can ask specific questions here and have a chance to get good answers. So far it doesn't appear you have really written any. If you are not already quite familiar with assembly language, then I would suggest starting with some shorter, easier exercises; a merge sort function, even though only a few lines of C, becomes a substantial project in assembly. – Nate Eldredge Nov 05 '22 at 14:28
  • @NateEldredge I want to use GNU Assembler in Intel mode. – Bille Ibinabo Nov 05 '22 at 14:39
  • Then you can use `gcc -S -masm=intel -Og foo.c -o foo.S` to make an example for you. See also [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116). In the standard calling convention, RAX/EAX/AX/AL is the return value register, with width depending on the type being returned. RAX for pointers, e.g. `mov rax, rsp` to return the stack pointer if you want to do that for some bizarre reason. – Peter Cordes Nov 05 '22 at 22:24
  • See [What registers are preserved through a linux x86-64 function call](https://stackoverflow.com/q/18024672) / [What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64](https://stackoverflow.com/q/2535989) – Peter Cordes Nov 05 '22 at 22:24
  • Building is trivial: `gcc -Wall foo.c merge.S -o foo`. You're using GAS `.intel_syntax`, so GCC knows how to assemble it for you. (I'd recommend NASM, though; it has more helpful and accurate error messages when you try to write something that's impossible in machine code, or that violates asm syntax rules.) – Peter Cordes Nov 05 '22 at 22:26
  • Your current attempt has many problems trying to use things that are impossible in x86 machine code. Remember, NASM is an assembler, not a compiler. [Why isn't movl from memory to memory allowed?](https://stackoverflow.com/q/33794169) / [Why mov/cmp instead of cmp with two memory operands?](https://stackoverflow.com/q/68655052). – Peter Cordes Nov 10 '22 at 23:36
  • Also syntax for what you mean: [Basic use of immediates vs. square brackets in YASM/NASM x86 assembly](https://stackoverflow.com/q/10362511) - I think you mean `cmp [i], edx`. `cmp i, rdx` is comparing the address, but cmp only works with an immediate as the right hand operand. And of course `inc i` can't work, `inc dword ptr [i]` would. **But use registers instead of static storage, that's what they're for!** – Peter Cordes Nov 10 '22 at 23:37
  • x86 doesn't have memory-indirect addressing, so you can't get it to load a value from `.data` and use that to index another array all in one instruction. And as discussed earlier, plain `i` is the address, so `[rdi + i * 4]` is telling it to scale the address and add that to RDI, but NASM doesn't know how to do that; the linker doesn't support it. (And it's not what you want. That problem will go away when you use registers for your indices. Along with the size mismatch where you used `i: dd 0` to reserve 4 bytes, but you were using 8-byte registers like RDX to compare with it.) – Peter Cordes Nov 10 '22 at 23:39
  • @PeterCordes Please, how do I assign a value in an arbitrary index of an array to an index in another array with assembly e.g. `A[i] = B[j]`? – Bille Ibinabo Nov 12 '22 at 03:49
  • Like I said, write it in C and have a C compiler make an example for you: [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) . Load into a temporary register, then store. If you have `B` and `j` in registers, you might do `mov eax, [rdi + rcx*4]` for the load, if `B` is an array of 4-byte objects like `int`. – Peter Cordes Nov 12 '22 at 04:01

0 Answers0