7

For testing purposes, I am writing short assembly snippets for Intel's Xeon Phi using ICC's inline assembler. I now want to use masked vector instructions, but I cannot get the inline assembler to accept them.

For code like this:

vmovapd  -64(%%r14, %%r10), %%zmm0{%%k1} 

I get the error message

/tmp/icpc5115IWas_.s: Assembler messages:
/tmp/icpc5115IWas_.s:563: Error: junk `%k1' after register

I tried many different combinations, but none of them worked. The compiler version is intel64/13.1up03 under Linux, using GAS (AT&T) syntax.

Edit: The code above actually works with basic (non-extended) inline assembly. So this:

__asm__("vmovapd  -64(%r14, %r10), %zmm0{%k1} ")

works, while the following does not:

__asm__("vmovapd  -64(%[src], %%r10), %%zmm0{%%k1} "
    :
    : [src]"r"(src)
    :)

I guessed it had something to do with the need for a double % before register names in extended mode. But a single % before the k does not work either.

Grigory Rechistov
user116429

2 Answers

6

I asked the same question in the Intel Developer Zone (http://software.intel.com/en-us/forums/topic/499145#comment-1776563). The answer is that, to use the mask registers on the Xeon Phi in extended inline assembler, you have to put double curly braces around the mask register modifier:

vmovapd     %%zmm30,         (%%r15,    %%r10){{%%k1}}
user116429
  • Regular curly braces in GNU C inline asm are for syntax dialect alternatives, like `add {%0, %1 | %1, %0}` to write code that works with either AT&T or Intel, so you can compile it with or without `-masm=intel`. – Peter Cordes Mar 19 '18 at 23:45
  • Also, the recommended way is to escape the `{` as `%{`, e.g. `"... %{%%k1%} \n"` – Peter Cordes Feb 25 '21 at 04:25
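To make the two comments above concrete, here is a minimal sketch for GCC or clang (not the ICC 13.1 setup from the question, which the answer says needs `{{%%k1}}`). The function names `masked_copy8` and `add_either_dialect` are hypothetical. The masked path assumes an x86-64 target built with AVX-512F enabled (for example `-mavx512f`) and uses AVX-512 rather than the Xeon Phi's KNI encoding; without that, the code falls back to a plain C loop so it still compiles and runs.

```c
/* Hypothetical helper: copy src[i] to dst[i] for each of the 8 lanes
 * whose bit is set in mask. Other lanes of dst are left untouched. */
static void masked_copy8(double *dst, const double *src, unsigned mask)
{
#if defined(__x86_64__) && defined(__AVX512F__)
    /* In extended asm, a literal '{' or '}' is written as %{ or %},
     * and a literal register prefix '%' is written as %%. */
    __asm__ volatile(
        "kmovw   %[m], %%k1             \n\t" /* load the lane mask into k1 */
        "vmovupd (%[s]), %%zmm0%{%%k1%} \n\t" /* masked load of selected lanes */
        "vmovupd %%zmm0, (%[d])%{%%k1%} \n\t" /* masked store of the same lanes */
        :
        : [m] "r"(mask), [s] "r"(src), [d] "r"(dst)
        : "zmm0", "k1", "memory");
#else
    /* Fallback when AVX-512F is not enabled at compile time. */
    for (int i = 0; i < 8; i++)
        if (mask & (1u << i))
            dst[i] = src[i];
#endif
}

/* Unescaped braces select between dialects: the part before '|' is used
 * for AT&T syntax, the part after it with -masm=intel. */
static int add_either_dialect(int a, int b)
{
#if defined(__x86_64__)
    __asm__("add {%[b], %[a] | %[a], %[b]}"
            : [a] "+r"(a)
            : [b] "r"(b)
            : "cc");
#else
    a += b; /* portable fallback for other targets */
#endif
    return a;
}
```

Calling `masked_copy8(dst, src, 0x05)` copies only lanes 0 and 2. The unaligned `vmovupd` is used here instead of the question's `vmovapd` so that the sketch does not depend on 64-byte-aligned buffers.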
0

I think you need to use the masked variant of the instruction: VMASKMOVPD

pburka
  • VMASKMOVPD is only for AVX, not for KNI. It was not included there because KNI has universal vector lane masking instead. – user116429 Jan 11 '14 at 08:37
  • I don't understand what you mean. vmovapd and vmaskmovpd are both AVX512 instructions. I don't know what KNI is in this context; the only Intel use of this TLA that I'm familiar with is Kernel NIC Interface. – pburka Jan 11 '14 at 23:11
  • KNI are Knights Corner New Instructions, the vector instruction set of the Xeon Phi. AVX512 is pretty similar, and both instruction sets will probably converge in the future. – user116429 Jan 13 '14 at 11:28