I'm using a recent i7 CPU that supports AVX and AVX2, and VirtualBox supposedly supports both as well. Given that, why does the following code hang?

vmovdqa    ymm0, qqword[testmem]

testmem is defined elsewhere as

align 32
testmem:   rb   128

If I use

movdqa    xmm0, dqword[testmem]

It works fine.

FASM 1.72, windows 10, i7-7700hq, virtualbox 5.2.6

EDIT: It's a UEFI application (so obviously running in 64 bit mode) that works fine except if the above instruction appears.

EDIT: I tried adding

  mov     rcx, 0
  xgetbv
  or      rax, 0007h
  xsetbv

at the start of the code, but it didn't help.

poby
  • Define what you mean by "hang". Also, in what context are you running it? – Jester Feb 05 '18 at 01:48
  • Is AVX2 actually enabled in virtualbox? – harold Feb 05 '18 at 01:50
  • "hang" means the program runs up to that point and then neither displays subsequent prints to the screen nor completes. It's a UEFI application that works fine except if I try to use the above AVX instruction. – poby Feb 05 '18 at 01:51
  • My understanding is that the current version of VirtualBox has AVX2 enabled by default. I also tried enabling it on the command line, but that didn't help. – poby Feb 05 '18 at 01:52
  • No idea what you have available under UEFI. Make sure AVX is enabled properly, in particular the `XCR0` register as it will `#UD` if _XCR0[2:1] != 11b or CR4.OSXSAVE[bit 18]=0_. – Jester Feb 05 '18 at 02:01
  • what's the guest OS? [Virtualbox supports AVX2 since 5.0 beta 3](https://stackoverflow.com/a/30299294/995714) so I think 5.2.6 won't have trouble running it – phuclv Feb 05 '18 at 03:03
  • @LưuVĩnhPhúc There is no OS, it's a UEFI application so running before any OS is started. SSE is enabled already or the `movdqa` wouldn't work. I have tried enabling AVX (see above edit) but still doesn't work. – poby Feb 05 '18 at 03:14

1 Answer

OK, I found the answer. I know this is a rather esoteric question, but in case it helps someone else, here is how to enable AVX:

mov rax, cr4
or eax, 0x40000              ; set bit 18 (CR4.OSXSAVE)
mov cr4, rax

xor     rcx, rcx             ; ecx = 0 selects XCR0
xgetbv                       ; edx:eax = XCR0
or      rax, 6               ; set bit 1 (SSE state) and bit 2 (AVX state)
xsetbv

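What I was missing was setting bit 18 of the CR4 register (OSXSAVE), which is required before AVX can be enabled. While that bit is clear, `xgetbv` and `xsetbv` themselves raise `#UD`, so XCR0 cannot be programmed at all.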
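For anyone who wants a slightly more defensive version, here is a minimal sketch that checks CPUID before touching CR4 or XCR0. It assumes ring 0 in 64-bit mode (as in a UEFI application); the label `enable_avx` and the return convention (eax = 0 on success, 1 if AVX is not exposed) are my own choices for illustration, not part of any API.

```asm
; Sketch only: must run in ring 0, 64-bit mode.
; enable_avx is a hypothetical helper name.
enable_avx:
        push    rbx                     ; cpuid clobbers rbx (callee-saved)
        mov     eax, 1
        cpuid                           ; leaf 1: feature flags in ecx/edx
        pop     rbx
        and     ecx, 0x14000000         ; bit 26 = XSAVE, bit 28 = AVX
        cmp     ecx, 0x14000000
        jne     .no_avx                 ; CPU or VM does not expose AVX

        mov     rax, cr4
        or      rax, 0x40000            ; CR4.OSXSAVE (bit 18)
        mov     cr4, rax

        xor     ecx, ecx                ; ecx = 0 selects XCR0
        xgetbv                          ; edx:eax = XCR0
        or      eax, 7                  ; x87 (bit 0), SSE (bit 1), AVX (bit 2)
        xsetbv
        xor     eax, eax                ; 0 = AVX enabled
        ret
.no_avx:
        mov     eax, 1                  ; 1 = AVX not available
        ret
```

A caller would `call enable_avx` once at startup and only use `vmovdqa ymm0, ...` if eax came back as 0. The CPUID check also tells you whether the VM is actually exposing AVX to the guest, which was one of the open questions in the comments.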

poby
  • See also https://stackoverflow.com/questions/31563078/how-do-i-enable-sse-for-my-freestanding-bootable-code for links to wiki.osdev.org. – Peter Cordes Feb 05 '18 at 05:20
  • "this is a rather esoteric question": all questions about assembly language are somewhat esoteric to outsiders of low-level hacking. However, this particular behavior is documented in the Intel SDM. Basically, what you see is an exception because the opt-in procedure for AVX was not followed. – Grigory Rechistov Feb 05 '18 at 08:56
  • Don't use `xor rcx, rcx`. [Use `xor ecx, ecx`](https://stackoverflow.com/q/33666617/995714) instead. – phuclv Jul 28 '18 at 07:23