x86 - How to insert values into an array using for loops in Inline Assembly

Question

One of my exercises I've been assigned is to set the values of an array from 0 to 9 and print the sum using as much inline assembly as possible. I have little experience with inline assembly and have exhausted my research on trying to find a solution.

This is the code I have so far. It compiles with no errors, however, when I try to run the application it just crashes. I know it is incomplete but I'm not sure if my logic is correct.

#include <stdio.h>

int main(int argc, char **argv)
{
    int *numbers;
    int index;

    __asm
    {
        // Set index value to 0
        mov index, 0
        // Jump to check
        jmp $CHECK_index
        // Increment the value of index by 1
        $INCREMENT_index:
            inc index
        // Check if index >= 9
        $CHECK_index:
            cmp index, 9
            jge $END_LOOP
            // Move index value to eax register
            mov eax, index
            // Move value of eax register to array
            mov numbers[TYPE numbers * eax], eax
            // Clean the stack
            add esp, 4

            jmp $INCREMENT_index

        $END_LOOP:
            add esp, 4
    }

    return 0;
}

there is no memory where numbers point to. call malloc or initialize numbers like this: int numbers[10]; — Quassel Kasper, Jan 28 '18 at 21:23
Related: [To Pass The Content Of The Pointer In An Inline Assembly Function On Visual Studio](https://stackoverflow.com/q/33767025) is about a function that takes a pointer arg, so you don't have the option to just use an array instead. — Peter Cordes, Jan 29 '22 at 22:56

Peter Cordes · Answer 1 · 2018-01-29T04:48:08.160

Use int numbers[10]; to give your asm an array like it's expecting, not a pointer.

And generally don't mess with esp inside inline asm. Or if you do, make sure esp has the same value at the end of the inline-asm block as it did at the start. You might have gotten away with having those add esp,4 insns in there for no reason, but only if the compiler threw away the old esp value with a leave or mov esp,ebp as part of tearing down a stack frame. Remove all your add esp,4, they shouldn't be there. (See the bottom of this answer for a simplified loop that only does what's necessary.)

You're clobbering stack memory next to your pointer value, so you probably crash when the function tries to return. (Use a debugger to see which instruction faults). You've overwritten the return address with a small integer, so code-fetch from an unmapped address causes a page-fault, if I've analyzed this correctly.

In C, arrays and pointers use the same syntax for [], but they are not the same. With a pointer, the compiler needs to get the pointer value into a register and then index relative to that. (And in inline asm, you have to do this yourself, but your code doesn't.) With an array, the indexing is relative to the array base address, and the compiler always knows where to find the array (automatic storage on the stack, or in static storage).

I'm simplifying a bit: a struct can contain an array, in which case it's a proper array type which doesn't "decay" to a pointer. (Related: What kind of C11 data type is an array according to the AMD64 ABI). So foo->arr[9] would be a deref of an actual array that doesn't have static or automatic storage so the compiler doesn't necessarily "already" have the base address for free.

Note that even a function arg declared as int foo(int arr[10]) is really a pointer, not an array. sizeof(arr) is 4 (with 32-bit pointers on x86), unlike if you declare it as a local variable inside the function.

This difference is important in assembly. mov numbers[TYPE numbers * eax], eax only does what you want if numbers is an array type, not a pointer type. Your asm is equivalent to (&numbers)[index] = (int*)index;, not numbers[index] = index;. This is how you're overwriting other stuff on the stack near where the pointer-value is stored.

In MSVC inline-asm, local variable names are assembled as [ebp+constant], so when numbers is an array, its elements are on the stack starting at numbers. But when numbers is a pointer, the pointer is on the stack at that location. You'd have to
mov edx, numbers / mov [edx + eax*TYPE numbers], eax to do what you want, if you used malloc or new to point numbers at some dynamically-allocated storage.

i.e. MSVC does not magically make asm syntax work like C pointer syntax, and couldn't efficiently do so because it would take an extra register (which your code might be using for something). You (unintentionally) wrote asm that overwrites the pointer value on the stack, and then overwrite another 9 DWORDs above that. That's something you can do with inline asm, so your code compiled with no warnings.

If you left numbers uninitialized, then (with proper pointer dereferencing) your code would almost certainly crash, for the same reason it would with compiler-generated code for int *numbers; numbers[0] = 0;. So yes, Paul's C++ new answer is partly correct and fixes that C bug, but doesn't fix the asm (lack of) pointer-dereference bug. If that makes it not crash, that's because the compiler is reserving more stack space before calling new, and it happens to be enough for you to scribble over stack memory without clobbering the return address, or something.

I tried looking at the asm from MSVC CL19 on the Godbolt compiler explorer, but that compiler version (with default options) only reserves a couple more DWORDs with int *numbers = new int[10];, not enough space for your code to avoid clobbering the return address when writing memory above &numbers. Presumably whatever compiler / version / options you're using emits different code which reserves more stack space so it avoids crashing, because you accepted that answer.

See source + asm on the Godbolt compiler explorer, for int numbers[10]; vs. int *numbers = new int[10]; vs. int *numbers;, all with no optimization options so they don't optimize anything away. The code from the inline-asm block is the same in all cases, except for the numeric constants like _numbers$ = -12 that the compiler uses as offsets from ebp to address local vars:

;; from the  int *numbers = new int[10];  version:

_numbers$ = -12                               ; size = 4
$T1 = -8                                                ; size = 4
_index$ = -4                                            ; size = 4

        mov      DWORD PTR _index$[ebp], 0
$$CHECK_index$3:
        cmp      DWORD PTR _index$[ebp], 9
        jge      SHORT $$END_LOOP$4
        mov      eax, DWORD PTR _index$[ebp]
        mov      DWORD PTR _numbers$[ebp+eax*4], eax   ; this is [ebp-12 + eax*4]
        inc      DWORD PTR _index$[ebp]
        jmp      SHORT $$CHECK_index$3
$$END_LOOP$4:

You might think you're already writing in asm, but looking at the compiler's actual asm output can help you find mistakes in using the asm syntax itself. (Or see what code the compiler generates before / after your code). Note that MSVC's "asm output" doesn't always match the machine code it puts into object files, unlike with gcc or clang. To be really sure, disassemble the object file or executable. (But then you mostly lose symbolic names, so it can be helpful to look at both.)

BTW, using inline asm is not the easiest way to learn asm in the first place. MSVC inline asm is sort of ok (unlike GNU C inline asm syntax, where you need to understand asm and compilers to properly describe your asm to the compiler), but not great, and has serious warts. Writing whole functions in pure asm and calling them from C is what I'd recommend for learning.

I'd also highly recommend just reading optimized compiler output for tiny functions, to see how to do various things in asm. See Matt Godbolt's CppCon2017 talk: “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”.

BTW, here's how I'd write your function (if I had to use MSVC inline asm (https://gcc.gnu.org/wiki/DontUseInlineAsm), and I didn't want to unroll or optimize with SSE2 or AVX2 SIMD...).

I keep the array index in eax, never spilling it to memory. Also, I restructure the loop into a do{}while() loop, because that's more natural, efficient, and idiomatic in asm. See Why are loops always compiled like this?.

void clean_version(void)
{
    int numbers[10];

    __asm
    {
        // index lives in eax
        xor eax,eax        // index = 0

        // The loop always runs at least once, so no check is needed before falling into the first iteration
        $store_loop:         //  do {
            // store index into the array
            mov numbers[TYPE numbers * eax], eax

            // Increment the value of index by 1
            inc   eax
            cmp   eax, 9      // } while(index<=9);
            jle   $store_loop
    }
}

Notice that the only store is into the array, and there are no loads. There are many fewer instructions in the loop. In this case (unlike usual), MSVC's limited asm syntax didn't actually impose any overhead for getting data into / out of the asm block, but it's still no better than what you'd get from optimized compiler output for a pure C loop. (Of course the loop would optimize away unless the array was volatile, if your function returns without doing anything with it.)

If you wanted to have a C variable holding index at the end of the loop, mov index, eax outside the loop. So logically index is live in eax inside the loop, and is only stored to memory afterwards. MSVC syntax provides a hacky way to return one value to C without storing it to memory where the compiler will have to reload it: leave the value in eax in an asm block at the end of a non-void function with no return statement. Apparently MSVC "understands" this and makes it work even when inlining such a function. But that only works for one single scalar value.

With optimization enabled, mov numbers[4*eax], eax may compile to mov [esp+constant + 4*eax], eax, i.e. relative to ESP instead of EBP. Or maybe not, IDK if MSVC always makes a stack frame in functions that use inline asm. Or if numbers was a static array, it would just be an absolute address (i.e. a link-time constant), so in the asm it would still just be the actual symbol name _numbers. (Because Windows prepends a leading _ to C names.)

Paul Carroll · Accepted Answer · 2018-01-29T02:37:54.830

1

It's been a LONG time (20+ years) since I wrote assembly, but isn't it the case that you need to allocate the numbers array? (can also move the inc to a cleaner spot)

#include <stdio.h>

int main(int argc, char **argv)
{
    int *numbers = new int[10];   // <--- Missing allocate
    int index;

    __asm
    {
        // Set index value to 0
        mov index, 0

        // Check if index >= 9
        $CHECK_index:
            cmp index, 9
            jge $END_LOOP

            // Move index value to eax register
            mov eax, index

            // Move value of eax register to array
            mov numbers[TYPE numbers * eax], eax

            // Increment the value of index by 1
            inc index           // <---- inc is cleaner here
            jmp $CHECK_index

        $END_LOOP:
    }

    return 0;
}

NOTE: Not sure why you need to move the stack pointer either (esp), but happy to admit that something I've forgotten in 20 years!

edited Jan 29 '18 at 02:37

answered Jan 28 '18 at 21:24

Paul Carroll

1,523
13
15

Thanks for your reply :) As for the first instance of the stack pointer, I have been trying to reference code from previous exercises so there may be extra unnecessary instructions. But your solution works! – Toby Cook Jan 28 '18 at 21:34
hahaha nice! i had to brush off some cobwebs there :) i'd recommend getting rid of the esp related instructions too... if memory serves even more then they came from an example where you had temporary stack variables or were possibly dealing with arguments being passed (which is not the case here) – Paul Carroll Jan 28 '18 at 21:37
@TobyCook: The stack-pointer increment inside the loop is totally bogus. If this program doesn't crash, it's only because it was compiled without optimization and ends with a stack-frame cleanup like `mov esp, ebp` / `pop ebp` to throw away the mess you made of the stack pointer, and there were no esp-relative memory accesses (to local variables) inside or after the asm. – Peter Cordes Jan 29 '18 at 01:01
Does this really work with `numbers` being an actual pointer to some `new` memory, rather than being an array like `int numbers[10]`? Obviously `mov numbers[4*eax], eax` doesn't actually assemble directly, because it's not a symbol, it's a local variable name. So either this compiles and runs and scribbles all over the stack near where the pointer is stored, or MSVC does some magic. Also, keeping `index` in memory instead of a register seems totally pointless, but at least there's no correctness problem there. – Peter Cordes Jan 29 '18 at 01:04
1

@PeterCordes I have only started learning assembly so I apologize if I'm not an expert, just as my original post states. I have since cleaned my code up and it works just fine, but no need to come down so hard on me. I'm only a beginner. – Toby Cook Jan 29 '18 at 02:18
1

@TobyCook don't take it too much to heart mate. Unfortunately the CS industry (especially SO) thrives on denigrating comments rather than support and mentorship... I prefer the latter, but don't be surprised when you encounter the former – Paul Carroll Jan 29 '18 at 02:36
@TobyCook: Sorry if that sounded like a personal attack. It's normal to write totally bogus code when you're first learning something! I thought it was useful to point out exactly what your code was doing, and why it might actually run without crashing even though I think it's not doing what you want (and would crash or cause problems in other contexts). Anyway, I was partly addressing Paul with the 2nd comment, because he suggested using `new` instead of making it an array. (I think `int numbers[10]` will lead to your `numbers` assembling to `mov [ebp+constant + eax*4]`and being correct) – Peter Cordes Jan 29 '18 at 02:37
1

@TobyCook: Anyway, I think it *is* important to come down hard on wrong code that might confuse or mislead future readers. Remember, it's the *code* that's being "attacked", not the person. I worded it as "does this really work" because I don't use MSVC or it's inline asm syntax; if I knew for sure it was wrong I would have just edited. Sorry again for giving offense with my choice of wording; hope this helps you sort out exactly what the code is doing. (BTW, try looking at disassembly of the function, including compiler-generated asm and disassembly of what your inline-asm turned into.) – Peter Cordes Jan 29 '18 at 02:45
@Toby and Paul: I got curious about what exactly MSVC does with pointers vs. arrays as inputs to inline asm, and it turns out that this answer doesn't "work", or if it does avoid crashing, it's by chance and the stores don't actually go into the array you allocated with `new`. See my answer for more details about pointers vs. arrays, which I think is the main thing you were missing. – Peter Cordes Jan 29 '18 at 04:29
1

wow @PeterCordes that's quite an amazing write up mate, well done! i'll upvote as i definitely consider that a brilliant response worthy of the accepted answer... great work! :) – Paul Carroll Jan 29 '18 at 04:36

x86 - How to insert values into an array using for loops in Inline Assembly

2 Answers2