Use int numbers[10]; to give your asm an array like it's expecting, not a pointer.
And generally don't mess with esp inside inline asm. Or if you do, make sure esp has the same value at the end of the inline-asm block as it did at the start. You might have gotten away with having those add esp,4 instructions in there for no reason, but only if the compiler threw away the old esp value with a leave or mov esp,ebp as part of tearing down a stack frame. Remove all your add esp,4 instructions; they shouldn't be there. (See the bottom of this answer for a simplified loop that only does what's necessary.)
You're clobbering stack memory next to your pointer value, so you probably crash when the function tries to return. (Use a debugger to see which instruction faults). You've overwritten the return address with a small integer, so code-fetch from an unmapped address causes a page-fault, if I've analyzed this correctly.
In C, arrays and pointers use the same syntax for [], but they are not the same. With a pointer, the compiler needs to get the pointer value into a register and then index relative to that. (And in inline asm, you have to do this yourself, but your code doesn't.) With an array, the indexing is relative to the array base address, and the compiler always knows where to find the array (automatic storage on the stack, or in static storage).
I'm simplifying a bit: a struct can contain an array, in which case it's a proper array type which doesn't "decay" to a pointer. (Related: What kind of C11 data type is an array according to the AMD64 ABI). So foo->arr[9] would be a deref of an actual array that doesn't have static or automatic storage, so the compiler doesn't necessarily "already" have the base address for free.
Note that even a function arg declared as int foo(int arr[10]) is really a pointer, not an array. sizeof(arr) is 4 (with 32-bit pointers on x86), unlike a local array declared inside the function, where sizeof gives the size of the whole array.
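To make the array-vs-pointer distinction concrete, here is a minimal sketch in portable C. The helper names (local_array_bytes, pointer_bytes, member_array_bytes, check_sizes) and struct holder are hypothetical, made up for illustration. It only shows what sizeof reports for each kind of object; a function parameter declared as int arr[10] would behave like the pointer case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: the member is a real array type, not a pointer. */
struct holder { int arr[10]; };

/* A local array: sizeof sees the whole array. */
size_t local_array_bytes(void)
{
    int numbers[10];
    return sizeof numbers;          /* 10 * sizeof(int) */
}

/* A pointer to the same storage: sizeof sees only the pointer object. */
size_t pointer_bytes(void)
{
    int numbers[10];
    int *p = numbers;               /* array decays to a pointer here */
    return sizeof p;                /* sizeof(int *), 4 on 32-bit x86 */
}

/* An array inside a struct reached through a pointer is still an array. */
size_t member_array_bytes(const struct holder *h)
{
    return sizeof h->arr;           /* 10 * sizeof(int) */
}

/* Small self-check of the three cases above. */
void check_sizes(void)
{
    struct holder h = {{0}};
    assert(local_array_bytes() == 10 * sizeof(int));
    assert(pointer_bytes() == sizeof(int *));
    assert(member_array_bytes(&h) == 10 * sizeof(int));
}
```

Calling check_sizes() should pass on any ordinary C compiler; the exact byte counts depend on the target's int and pointer sizes.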
This difference is important in assembly. mov numbers[TYPE numbers * eax], eax only does what you want if numbers is an array type, not a pointer type. Your asm is equivalent to (&numbers)[index] = (int*)index; and not to numbers[index] = index; which is how you're overwriting other stuff on the stack near where the pointer-value is stored.
In MSVC inline-asm, local variable names are assembled as [ebp+constant], so when numbers is an array, its elements are on the stack starting at numbers. But when numbers is a pointer, the pointer is on the stack at that location. You'd have to mov edx, numbers / mov [edx + eax*TYPE numbers], eax to do what you want, if you used malloc or new to point numbers at some dynamically-allocated storage.
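The difference between the two addressing calculations can be sketched in portable C using plain integer arithmetic on addresses, so nothing is actually overwritten. The function names here (address_written_by_asm, address_intended, demo_addresses) are hypothetical, and the parameter slot stands in for the stack location where the pointer variable lives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the buggy asm computes: the address of the pointer variable itself,
   plus index * 4. The pointer's value is never loaded. */
uintptr_t address_written_by_asm(int **slot, size_t index)
{
    return (uintptr_t)slot + index * sizeof(int);
}

/* What numbers[index] means in C: load the pointer value first
   (the "mov edx, numbers" step), then index relative to it. */
uintptr_t address_intended(int **slot, size_t index)
{
    return (uintptr_t)*slot + index * sizeof(int);
}

/* Small self-check: only addresses are computed, nothing is stored. */
void demo_addresses(void)
{
    int storage[10];
    int *numbers = storage;

    /* index 0 in the buggy version lands on the pointer variable itself */
    assert(address_written_by_asm(&numbers, 0) == (uintptr_t)&numbers);
    /* the intended version lands inside the pointed-to storage */
    assert(address_intended(&numbers, 3) == (uintptr_t)&storage[3]);
}
```

With an actual array instead of a pointer there is no separate slot holding a pointer value, which is why the single-instruction form works in that case.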
i.e. MSVC does not magically make asm syntax work like C pointer syntax, and couldn't efficiently do so because it would take an extra register (which your code might be using for something). You (unintentionally) wrote asm that overwrites the pointer value on the stack, and then overwrites another 9 DWORDs above that. That's something you can do with inline asm, so your code compiled with no warnings.
If you left numbers uninitialized, then (with proper pointer dereferencing) your code would almost certainly crash, for the same reason it would with compiler-generated code for int *numbers; numbers[0] = 0;. So yes, Paul's C++ new answer is partly correct and fixes that C bug, but doesn't fix the asm (lack of) pointer-dereference bug. If that makes it not crash, that's because the compiler is reserving more stack space before calling new, and it happens to be enough for you to scribble over stack memory without clobbering the return address, or something.
I tried looking at the asm from MSVC CL19 on the Godbolt compiler explorer, but that compiler version (with default options) only reserves a couple more DWORDs with int *numbers = new int[10];, not enough space for your code to avoid clobbering the return address when writing memory above &numbers. Presumably whatever compiler / version / options you're using emits different code which reserves more stack space so it avoids crashing, because you accepted that answer.
See source + asm on the Godbolt compiler explorer, for int numbers[10]; vs. int *numbers = new int[10]; vs. int *numbers;, all with no optimization options so they don't optimize anything away. The code from the inline-asm block is the same in all cases, except for the numeric constants like _numbers$ = -12 that the compiler uses as offsets from ebp to address local vars:
;; from the int *numbers = new int[10]; version:
_numbers$ = -12 ; size = 4
$T1 = -8 ; size = 4
_index$ = -4 ; size = 4
mov DWORD PTR _index$[ebp], 0
$$CHECK_index$3:
cmp DWORD PTR _index$[ebp], 9
jge SHORT $$END_LOOP$4
mov eax, DWORD PTR _index$[ebp]
mov DWORD PTR _numbers$[ebp+eax*4], eax ; this is [ebp-12 + eax*4]
inc DWORD PTR _index$[ebp]
jmp SHORT $$CHECK_index$3
$$END_LOOP$4:
You might think you're already writing in asm, but looking at the compiler's actual asm output can help you find mistakes in using the asm syntax itself. (Or see what code the compiler generates before / after your code). Note that MSVC's "asm output" doesn't always match the machine code it puts into object files, unlike with gcc or clang. To be really sure, disassemble the object file or executable. (But then you mostly lose symbolic names, so it can be helpful to look at both.)
BTW, using inline asm is not the easiest way to learn asm in the first place. MSVC inline asm is sort of ok (unlike GNU C inline asm syntax, where you need to understand asm and compilers to properly describe your asm to the compiler), but not great, and has serious warts. Writing whole functions in pure asm and calling them from C is what I'd recommend for learning.
I'd also highly recommend just reading optimized compiler output for tiny functions, to see how to do various things in asm. See Matt Godbolt's CppCon2017 talk: “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”.
BTW, here's how I'd write your function (if I had to use MSVC inline asm (https://gcc.gnu.org/wiki/DontUseInlineAsm), and I didn't want to unroll or optimize with SSE2 or AVX2 SIMD...).
I keep the array index in eax, never spilling it to memory. Also, I restructure the loop into a do{}while() loop, because that's more natural, efficient, and idiomatic in asm. See Why are loops always compiled like this?.
void clean_version(void)
{
    int numbers[10];
    __asm
    {
        // index lives in eax
        xor eax,eax          // index = 0
        // The loop always runs at least once, so no check is needed before falling into the first iteration
    $store_loop:             // do {
        // store index into the array
        mov numbers[TYPE numbers * eax], eax
        // Increment the value of index by 1
        inc eax
        cmp eax, 9           // } while(index<=9);
        jle $store_loop
    }
}
Notice that the only store is into the array, and there are no loads. There are many fewer instructions in the loop. In this case (unlike usual), MSVC's limited asm syntax didn't actually impose any overhead for getting data into / out of the asm block, but it's still no better than what you'd get from optimized compiler output for a pure C loop. (Of course the loop would optimize away unless the array was volatile, if your function returns without doing anything with it.)
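For comparison, a pure C loop with the same do{}while shape might look like the sketch below. The function name fill_with_indices and its parameters are hypothetical; it takes the destination as an argument so the stores are visible to the caller and can't simply be optimized away.

```c
#include <assert.h>
#include <stddef.h>

/* Store 0, 1, ..., count-1 into numbers[0..count-1].
   Like the asm version, the loop body runs before the first check,
   so count must be at least 1. */
void fill_with_indices(int *numbers, size_t count)
{
    assert(count > 0);              /* do-while always runs once */

    size_t index = 0;               /* a compiler can keep this in a register */
    do {
        numbers[index] = (int)index; /* the only store in the loop */
        index++;
    } while (index < count);
}
```

Calling it as int numbers[10]; fill_with_indices(numbers, 10); fills the same ten elements as the asm loop above. Reading the optimized compiler output for this function is a good way to compare against hand-written asm.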
If you wanted to have a C variable holding index at the end of the loop, use mov index, eax after the loop. So logically index is live in eax inside the loop, and is only stored to memory afterwards. MSVC syntax provides a hacky way to return one value to C without storing it to memory where the compiler will have to reload it: leave the value in eax in an asm block at the end of a non-void function with no return statement. Apparently MSVC "understands" this and makes it work even when inlining such a function. But that only works for one single scalar value.
With optimization enabled, mov numbers[4*eax], eax may compile to mov [esp+constant + 4*eax], eax, i.e. relative to ESP instead of EBP. Or maybe not, IDK if MSVC always makes a stack frame in functions that use inline asm. Or if numbers was a static array, it would just be an absolute address (i.e. a link-time constant), so in the asm it would still just be the actual symbol name _numbers. (Because 32-bit Windows prepends a leading _ to C names.)