
Intel's instruction set reference gives this encoding for the VEX form of addsd (vaddsd):

VEX.NDS.LIG.F2.0F.WIG 58 /r
VADDSD xmm1, xmm2, xmm3/m64

As we can see, the L bit is ignored (LIG), so it can be either 0 or 1.

Machine code of vaddsd xmm0, xmm0, xmm0: 0xC4, 0xE1, 0x7B, 0x58, 0xC0

C4 - indicates 3-byte VEX prefix
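E1 - R = 1; X = 1; B = 1 (stored inverted, so no register extensions); m-mmmm = 00001 (implied 0F escape)
7B - W = 0; vvvv = 1111 (xmm0, stored inverted); L = 0; pp = 11 (implied F2 prefix)
58 - opcode byte
C0 - ModRM byte (mod = 11, reg = xmm0, r/m = xmm0)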
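
The field breakdown above can be checked mechanically. The following is a minimal sketch in C, not part of the original test; the `Vex3` struct and `decode_vex3` are hypothetical names introduced here. It splits the two payload bytes of a 3-byte VEX prefix into their fields, leaving R, X, B and vvvv in their raw (inverted) stored form, matching the listing above.

```c
#include <stdint.h>

/* Fields of a 3-byte VEX prefix (C4 xx xx). Hypothetical helper type.
   R, X, B and vvvv are kept exactly as stored, i.e. inverted. */
typedef struct {
    unsigned r, x, b;  /* raw bits; 1 means "no register extension" */
    unsigned mmmmm;    /* opcode map: 1 = 0F, 2 = 0F 38, 3 = 0F 3A */
    unsigned w;
    unsigned vvvv;     /* raw; the register number is (~vvvv) & 0xF */
    unsigned l;        /* vector length bit */
    unsigned pp;       /* implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2 */
} Vex3;

/* Decodes p[0..2]. Returns 0 on success, -1 if p[0] is not 0xC4. */
int decode_vex3(const uint8_t *p, Vex3 *out)
{
    if (p[0] != 0xC4)
        return -1;

    out->r     = (p[1] >> 7) & 1;
    out->x     = (p[1] >> 6) & 1;
    out->b     = (p[1] >> 5) & 1;
    out->mmmmm =  p[1]       & 0x1F;

    out->w     = (p[2] >> 7) & 1;
    out->vvvv  = (p[2] >> 3) & 0xF;
    out->l     = (p[2] >> 2) & 1;
    out->pp    =  p[2]       & 3;
    return 0;
}
```

For C4 E1 7B this yields R = X = B = 1, m-mmmm = 1, W = 0, vvvv = 1111, L = 0, pp = 11; for C4 E1 7F only L differs.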

Let's test:

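#include <windows.h>
#include <string.h>

typedef unsigned char Byte;

// Copies `size` bytes of machine code into an executable page, appends a RET, and calls it.
void exec(Byte* code, int size)
{
    Byte* buf = (Byte*)VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    memcpy(buf, code, size);

    buf[size] = 0xC3; // ret

    ((void (*)())buf)();

    // MEM_RELEASE (with size 0) frees the reservation too; MEM_DECOMMIT would leave it allocated.
    VirtualFree(buf, 0, MEM_RELEASE);
}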

void f()
{
    Byte code[] = { 0xC4, 0xE1, 0x7B, 0x58, 0xC0 };

    exec(code, sizeof(code));
}

This runs fine, and the Visual Studio disassembler recognizes the instruction.

However, when I change the L bit to 1 (0x7B is replaced by 0x7F), the disassembler does not recognize the instruction and an invalid-instruction exception is raised. Does this mean that the L bit must always be 0, despite what the Intel manual says?
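
For reference, the L bit is bit 2 (mask 0x04) of the last VEX prefix byte, which is why setting it turns 0x7B into 0x7F. A small sketch in C, with `vex_set_l` as a hypothetical helper name:

```c
#include <stdint.h>

/* Returns the last VEX prefix byte with its L bit (bit 2) set to `l`.
   All other fields (W, vvvv, pp) are left untouched. */
uint8_t vex_set_l(uint8_t last_prefix_byte, int l)
{
    const uint8_t L_MASK = 0x04;

    if (l)
        return (uint8_t)(last_prefix_byte | L_MASK);
    return (uint8_t)(last_prefix_byte & (uint8_t)~L_MASK);
}
```

Here `vex_set_l(0x7B, 1)` gives 0x7F and `vex_set_l(0x7F, 0)` gives back 0x7B.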

Michael Petch
igntec
  • If you want to test for byte sequences being valid instructions, it's a lot easier to just put them in a `.asm` and assemble it, like `_start: db 0xC4, 0xE1, 0x7B, 0x58, 0xC0`. Then you just assemble and run it. More importantly, disassemblers will happily operate on your bytes, because they're in a part of your object file that's supposed to hold code. It looks like your method works fine; it's just overcomplicated. – Peter Cordes Sep 05 '16 at 13:47
  • With Visual C++, you could also use `#pragma code_seg(".text")` and `unsigned char const __declspec(allocate(".text")) code[] = { 0xC4, 0xE1, 0x7B, 0x58, 0xC0, 0xC3 };` With GCC you can just use `unsigned char const __attribute__((section(".text"))) code[] = { 0xC4, 0xE1, 0x7B, 0x58, 0xC0, 0xC3 };`. – Ross Ridge Sep 05 '16 at 16:04
  • @PeterCordes : Do you know if the `illegal-instruction` tag was undergoing burnination. I notice that @tkausl removed it from all the questions. I know you have added it to questions in the past. – Michael Petch Sep 10 '16 at 11:09
  • @MichaelPetch: Wondered the same thing. IDK if it was adding much value, but it didn't seem like a problematic tag to me. – Peter Cordes Sep 10 '16 at 11:10
  • @PeterCordes I didn't think it was either, although it may have lacked a tag wiki description. I just happened to wonder why all of a sudden a pile of questions moved to recently active. – Michael Petch Sep 10 '16 at 11:12
  • @tkausl: Why did you remove the `illegal-instruction` tag from every question without first asking on meta about doing so? (Or did you? I searched but didn't find any hits). We should probably make a chat instead of using comments here. – Peter Cordes Sep 10 '16 at 23:15

1 Answer

It looks like LIG doesn't really mean the L bit is ignored; that part of the manual is wrong. In practice it's actually a synonym for .LZ or .128 and means L must be 0.

You're right that Intel's insn ref manual (Section 3.1.1.2, "Opcode Column in the Instruction Summary Table (Instructions with VEX prefix)", in volume 2 of the x86 manuals) contradicts observed behaviour:

If VEX.LIG is present in the opcode column: The VEX.L value is ignored. This generally applies to VEX-encoded scalar SIMD floating-point instructions.

However, it also contradicts other documentation in the same manual. Intel's manuals do have occasional mistakes. :( I think you can report bugs on Intel's forum.


Presumably Intel changed their mind about ignoring the bit, and decided to keep the L=1 encoding of scalar opcodes reserved, but forgot to update the docs for what VEX.LIG means in the insn-encoding section.

They publish future-extensions updates to the insn set reference manual before they become official, probably before every detail of hardware design is finalized. (The current future-extensions supplemental pdf describes AVX512 instructions (found in KNL), and a few other extensions that aren't in the official manual yet, or available in any commercially-available silicon AFAIK.) (Links to Intel's docs page, and tons of other stuff, in the tag wiki).


From Intel's insn ref manual, Figure 2-9 (VEX bit fields):

L: Vector Length

  0: scalar or 128-bit vector
  1: 256-bit vector

Section 2.3.6.2 explains the same thing.


Note that some BMI1/2 instructions use VEX encodings, also with L=0. It looks like they indicate it with .LZ: VEX.NDS.LZ.0F38.W0 F2 /r is ANDN r32a, r32b, r/m32.
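
To make the notation concrete, here is a rough sketch in C of packing a 3-byte VEX prefix from its fields. `encode_vex3` and its parameter names are made up for this example; the register-extension arguments and `vreg` are passed as plain (non-inverted) values, and the function inverts them as the encoding requires. With map 0F38, W = 0, L = 0 and no implied prefix it produces the prefix for an ANDN such as `andn eax, ebx, ecx` (vvvv holds the first source, ebx = 3); with map 0F and pp = F2 it reproduces the C4 E1 7B from the question.

```c
#include <stdint.h>

/* Packs a 3-byte VEX prefix into out[0..2]. Hypothetical helper.
   ext_r, ext_x, ext_b: register extension bits (0 = no extension)
   map:  1 = 0F, 2 = 0F 38, 3 = 0F 3A
   vreg: register number for the vvvv field (0..15)
   pp:   0 = none, 1 = 66, 2 = F3, 3 = F2 */
void encode_vex3(uint8_t out[3],
                 unsigned ext_r, unsigned ext_x, unsigned ext_b,
                 unsigned map, unsigned w, unsigned vreg,
                 unsigned l, unsigned pp)
{
    out[0] = 0xC4;

    /* R, X and B are stored inverted in the top three bits. */
    out[1] = (uint8_t)(((~ext_r & 1u) << 7) |
                       ((~ext_x & 1u) << 6) |
                       ((~ext_b & 1u) << 5) |
                       (map & 0x1Fu));

    /* vvvv is stored inverted as well; L is bit 2, pp the low two bits. */
    out[2] = (uint8_t)(((w & 1u) << 7) |
                       ((~vreg & 0xFu) << 3) |
                       ((l & 1u) << 2) |
                       (pp & 3u));
}
```

Calling it with `(0, 0, 0, 2, 0, 3, 0, 0)` gives C4 E2 60, which followed by the opcode F2 and ModRM C1 is `andn eax, ebx, ecx`. Calling it with `(0, 0, 0, 1, 0, 0, 0, 3)` gives C4 E1 7B, and changing `l` to 1 gives C4 E1 7F.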

Peter Cordes
  • You mentioned running an asm file. Do you mean that I need to export the label _start and jump to that address from a C program? – igntec Sep 05 '16 at 14:31
  • @igntec: no, `_start` is the default entry point for the linker. So the code at `_start` is literally the first instruction that your program runs, no CRT startup code or anything. On Windows, I think you can `ret` from `_start`, because the OS puts `exit` as a return address on the stack. Set a breakpoint at `_start`, run your program, and single-step. It's probably easier to define a `main:` in asm, though. I'm just used to creating bare static binaries on Linux to test trivial things, so I tend to define `_start`. related: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html. – Peter Cordes Sep 05 '16 at 14:39
  • Thank you for the answer and link. – igntec Sep 05 '16 at 14:52
  • @igntec: There's a whole [x86 tag wiki](http://stackoverflow.com/tags/x86/info) full of useful links that I've collected, if you want to look at more stuff :) – Peter Cordes Sep 05 '16 at 14:58