Intel VEX prefix, L bit value does not behave according to docs

Question

The Intel instruction set reference gives the following encoding for the VEX form of addsd:

VEX.NDS.LIG.F2.0F.WIG 58 /r
VADDSD xmm1, xmm2, xmm3/m64

According to this, the L bit is ignored (LIG), so it can be either 0 or 1.

Machine code of vaddsd xmm0, xmm0, xmm0: 0xC4, 0xE1, 0x7B, 0x58, 0xC0

C4 - indicates 3-byte VEX prefix
E1 - R = 1; X = 1; B = 1 (stored inverted, so no register extension); m-mmmm = 00001 (implied 0F escape)
7B - W = 0; vvvv = 1111 (stored inverted, so xmm0); L = 0; pp = 11 (implied F2 prefix)
58 - opcode byte
C0 - mod-rm byte
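The field breakdown above can be checked mechanically. The following is a minimal sketch, not taken from the original post: the `Vex3` struct and `vex3_decode` helper are hypothetical names, and the decoder only splits the two payload bytes of a 3-byte (C4) VEX prefix into raw fields without validating the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Raw fields of a 3-byte (C4) VEX prefix. Hypothetical type for illustration. */
typedef struct {
    unsigned r, x, b;  /* bits as stored in the byte (inverted form)          */
    unsigned mmmmm;    /* opcode map: 1 = 0F, 2 = 0F 38, 3 = 0F 3A            */
    unsigned w;        /* W bit                                               */
    unsigned vvvv;     /* as stored; the register number is (~vvvv) & 15      */
    unsigned l;        /* 0 = scalar or 128-bit, 1 = 256-bit                  */
    unsigned pp;       /* implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2    */
} Vex3;

/* p must point at the C4 byte; reads p[0]..p[2]. */
static Vex3 vex3_decode(const uint8_t *p)
{
    Vex3 v;
    assert(p[0] == 0xC4);
    v.r     = (p[1] >> 7) & 1u;
    v.x     = (p[1] >> 6) & 1u;
    v.b     = (p[1] >> 5) & 1u;
    v.mmmmm =  p[1]       & 0x1Fu;
    v.w     = (p[2] >> 7) & 1u;
    v.vvvv  = (p[2] >> 3) & 0xFu;
    v.l     = (p[2] >> 2) & 1u;
    v.pp    =  p[2]       & 3u;
    return v;
}
```

Calling `vex3_decode` on the bytes `C4 E1 7B` should reproduce the fields listed above: map 1, W = 0, stored vvvv = 1111, L = 0, pp = 3.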

Let's test:

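#include <windows.h>
#include <string.h>

typedef unsigned char Byte;

void exec(const Byte* code, int size)
{
    // Reserve and commit one page that is readable, writable and executable.
    Byte* buf = (Byte*)VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    memcpy(buf, code, size);

    buf[size] = 0xC3;   // append RET so the call below returns

    ((void (*)())buf)();

    // MEM_RELEASE (with size 0) frees the whole region; MEM_DECOMMIT would leave it reserved.
    VirtualFree(buf, 0, MEM_RELEASE);
}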

void f()
{
    Byte code[] = { 0xC4, 0xE1, 0x7B, 0x58, 0xC0 };

    exec(code, sizeof(code));
}

This runs fine, and the Visual Studio disassembler also recognizes the instruction.

However, when I change the L bit to 1 (0x7B is replaced by 0x7F), the disassembler does not recognize the instruction and an Invalid Instruction exception is raised. Does this mean that the L bit must always be 0, despite what the Intel manual says?
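For reference, the change from 0x7B to 0x7F is exactly a flip of bit 2, the L field, in the third prefix byte. A small sketch of that manipulation, where `VEX3_L_MASK` and `vex3_set_l` are hypothetical names introduced only for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define VEX3_L_MASK 0x04u  /* L is bit 2 of the third byte of a C4 prefix */

/* Return the third VEX byte with its L field forced to l (0 or 1). */
static uint8_t vex3_set_l(uint8_t byte2, unsigned l)
{
    assert(l <= 1);
    return (uint8_t)((byte2 & ~VEX3_L_MASK) | (l ? VEX3_L_MASK : 0u));
}
```

With this helper, `vex3_set_l(0x7B, 1)` gives 0x7F, and all other fields (W, vvvv, pp) are left untouched.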

If you want to test for byte sequences being valid instructions, it's a lot easier to just put them in a `.asm` and assemble it, like `_start: db 0xC4, 0xE1, 0x7B, 0x58, 0xC0`. Then you just assemble and run it. More importantly, disassemblers will happily operate on your bytes, because they're in a part of your object file that's supposed to hold code. It looks like your method works fine; it's just overcomplicated. — Peter Cordes, Sep 05 '16 at 13:47
With Visual C++, you could also use `#pragma code_seg(".text")` and `unsigned char const __declspec(allocate(".text")) code[] = { 0xC4, 0xE1, 0x7B, 0x58, 0xC0, 0xC3 };` With GCC you can just use `unsigned char const __attribute__((section(".text"))) code[] = { 0xC4, 0xE1, 0x7B, 0x58, 0xC0, 0xC3 };`. — Ross Ridge, Sep 05 '16 at 16:04

Peter Cordes · Accepted Answer · 2016-09-05T14:59:16.853

It looks like LIG doesn't really mean the L bit is ignored; that part of the manual is wrong. In practice it's actually a synonym for .LZ or .128 and means L must be 0.

You're right that Intel's insn ref manual (volume 2 of the x86 manuals, Section 3.1.1.2, "Opcode Column in the Instruction Summary Table (Instructions with VEX prefix)") contradicts observed behaviour:

If VEX.LIG is present in the opcode column: The VEX.L value is ignored. This generally applies to VEX-encoded scalar SIMD floating-point instructions.

However, it also contradicts other documentation in the same manual. Intel's manuals do have occasional mistakes. :( I think you can report bugs on Intel's forum.

Presumably Intel changed their mind about ignoring the bit, and decided to keep the L=1 encoding of scalar opcodes reserved, but forgot to update the docs for what VEX.LIG means in the insn-encoding section.

They publish future-extensions updates to the insn set reference manual before they become official, probably before every detail of hardware design is finalized. (The current future-extensions supplemental pdf describes AVX512 instructions (found in KNL), and a few other extensions that aren't in the official manual yet, or available in any commercially-available silicon AFAIK.) (Links to Intel's docs page, and tons of other stuff, in the x86 tag wiki).

From Intel's insn ref manual, Figure 2-9 (VEX bit fields):

L: Vector Length
0: scalar or 128-bit vector
1: 256-bit vector

Section 2.3.6.2 explains the same thing.

Note that some BMI1/2 instructions use VEX encodings, also with L=0. It looks like the manual indicates this with .LZ: VEX.NDS.LZ.0F38.W0 F2 /r is ANDN r32a, r32b, r/m32.
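To see where L = 0 lands in the actual bytes for both cases, here is a hedged sketch of a prefix encoder. The `vex3_encode` function is a hypothetical helper, and it assumes no register extension is needed, so the R, X and B bits are always stored as 1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a 3-byte VEX prefix.
 *   map  : 1 = 0F, 2 = 0F 38, 3 = 0F 3A
 *   vreg : number of the register encoded in vvvv (stored inverted)
 *   l    : 0 = scalar or 128-bit, 1 = 256-bit
 *   pp   : 0 = none, 1 = 66, 2 = F3, 3 = F2
 * Assumption: R, X and B are stored as 1 (registers 0..7 only).
 */
static void vex3_encode(uint8_t out[3], unsigned map, unsigned w,
                        unsigned vreg, unsigned l, unsigned pp)
{
    assert(map >= 1 && map <= 3);
    assert(w <= 1 && vreg <= 15 && l <= 1 && pp <= 3);
    out[0] = 0xC4;
    out[1] = (uint8_t)(0xE0u | map);
    out[2] = (uint8_t)((w << 7) | ((~vreg & 0xFu) << 3) | (l << 2) | pp);
}
```

With map = 1, W = 0, vreg = 0, L = 0, pp = 3 this yields `C4 E1 7B`, the VADDSD prefix from the question. With map = 2, W = 0, vreg = 0, L = 0, pp = 0 it yields `C4 E2 78`, which is the prefix for the ANDN encoding above when vvvv names eax.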

You talked about running asm file. Do you mean that I need to export label _start and jump to that address from C program? — igntec, Sep 05 '16 at 14:31
@igntec: no, `_start` is the default entry point for the linker. So the code at `_start` is literally the first instruction that your program runs, no CRT startup code or anything. On Windows, I think you can `ret` from `_start`, because the OS puts `exit` as a return address on the stack. Set a breakpoint at `_start`, run your program, and single-step. It's probably easier to define a `main:` in asm, though. I'm just used to creating bare static binaries on Linux to test trivial things, so I tend to define `_start`. related: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html. — Peter Cordes, Sep 05 '16 at 14:39
@igntec: There's a whole [x86 tag wiki](http://stackoverflow.com/tags/x86/info) full of useful links that I've collected, if you want to look at more stuff :) — Peter Cordes, Sep 05 '16 at 14:58
