
I'm trying to assemble a file with nasm and link it with golink. The file is very simple and contains only one SSE instruction.

When I assemble and link the file without the SSE instruction, the executable runs properly, but when the file contains the SSE instruction the program crashes, so I assume the problem is somewhere in how I use that instruction.

Here is the code - hello.asm:

extern malloc
global main
section .text
main:
push rbp            ; Save the stack
mov rbp, rsp
push rax            ; Save the registers

push 1024;
call malloc
and  rax,0xFFFFFFFFFFFFF000

movntdq [rax], xmm5;

pop rax
mov rsp,rbp
pop rbp

mov rax,0
ret

The code is assembled with nasm (yasm gives the same problem):

nasm -f win64 hello.asm -o hello.obj

and linked with:

golink.exe /console /entry main hello.obj MSVCRT.dll kernel32.dll

The output is hello.exe, which crashes every time it runs.

What's wrong here?

Thanks in advance!

AK87
  • `and rax` and using rax as the destination address looks suspicious. Try `_aligned_malloc` instead. – Alex F Oct 19 '14 at 15:09
  • Just tried, didn't change anything... – AK87 Oct 19 '14 at 15:14
  • Run it under the debugger and see where it fails, examine state and determine why. – Chris Stratton Oct 19 '14 at 15:28
  • Here is another idea. How about you write the code in C with intrinsics (`_mm_malloc`, `_mm_stream_ps`) and then look at the assembly it produces? – Z boson Oct 19 '14 at 17:33
  • Are you using Intel or AT&T syntax? – User.1 Oct 19 '14 at 18:54
  • Your first and biggest problem is the way you pass the parameter to `malloc`. In 64-bit land, we don't pass parameters on the stack but in registers. Since you are using Windows, the parameter for `malloc` should be in rcx. – Gunner Oct 20 '14 at 03:30
  • Thanks Gunner !! That was the problem !! – AK87 Oct 20 '14 at 07:37
  • Rounding down the return value of `malloc` isn't safe: if it wasn't aligned, you'll be storing to memory before the start of the allocation. Use `aligned_alloc` or `_mm_malloc` / `_mm_free` or similar functions, or round *up*. And BTW, you were effectively doing `malloc(argc)` because your `main` didn't modify RCX. – Peter Cordes Jun 04 '18 at 01:21
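
Putting the comments from Gunner and Peter Cordes together, a corrected hello.asm might look roughly like the sketch below. It has not been tested and rests on a few assumptions that are not in the original post: the size is passed in RCX with 32 bytes of shadow space reserved (Win64 calling convention), the allocation is padded by 15 bytes so the pointer can be rounded *up* to the 16-byte boundary that `movntdq` requires (instead of masking down to a page boundary), the use of RDX for the aligned pointer is arbitrary, and the `sfence` and `free` calls are added for tidiness. The value in xmm5 is still whatever happens to be there, as in the question.

```nasm
; hello.asm - hedged sketch of a corrected version (Win64, NASM syntax)
extern malloc
extern free
global main

section .text
main:
    push rbp                ; save the caller's frame pointer (RSP is now 16-byte aligned)
    mov  rbp, rsp
    sub  rsp, 32            ; shadow space the Win64 ABI requires before a call

    mov  ecx, 1024 + 15     ; first argument goes in RCX; pad so we can round up (assumption)
    call malloc             ; returns the pointer in RAX
    test rax, rax
    jz   .done              ; allocation failed, skip the store

    mov  rdx, rax           ; keep the original pointer in RAX for free
    add  rdx, 15
    and  rdx, -16           ; round UP to a 16-byte boundary, stays inside the block

    movntdq [rdx], xmm5     ; non-temporal store, needs a 16-byte aligned address
    sfence                  ; order the non-temporal store before later stores

    mov  rcx, rax           ; free must get the pointer malloc returned, not the aligned one
    call free

.done:
    xor  eax, eax           ; return 0
    mov  rsp, rbp           ; drop the shadow space
    pop  rbp
    ret
```

This should assemble and link with the same nasm and golink commands shown in the question. Note that rounding up only works because the block was over-allocated; if page alignment is really needed, `_aligned_malloc` (as Alex F suggested) is the safer route.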
