Instruction Decoding in x86 architecture

Question

I am working on a operating system project for my lab where I've to work with the instruction pointer and instruction opcode. Right now all I need to know is what type of instruction it is. For that I'm reading the data from the address pointed by instruction pointer. The first byte from this data gives me the instruction type. For example if first byte is 0xC6 it is a MOVB instruction. Now there are some cases when the first byte of instruction pointer is 0x0F. According to documentation 0x0F which means it is a two byte instruction. My problem is with this type of instruction. I'm not sure how to find out the instruction type for two byte instruction.

After that my 2nd priority is two find out the operands of the instruction. I've no knowledge of doing that from code. Any sample code will be appreciated

Third comes the need to find out the size of the instruction. As x86 is variable length, I want to know the size of each instructions. At first I planned to use a look up table where I'll maintain the instruction name and its size. But then I discovered that the same instruction can have variable length. For example when I used object dump on a .o file I found two instruction C6 00 62 which is for MOVB $0x62,(%EAX) & C6 85 2C FF FF FF 00 which is for MOVB $0x0,-0xD4(%EBP). Look here both instruction type is same(C6) but the are of different length.

So I'm in need of answers to those questions. It'll be highly appreciated if someone can give me some solutions.

What you are asking is quite complicated, and not easy to answer in the SO format. You can check how popular open-source disassemblers do it (`gdb`, `objdump`). And [read the docs](http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-2a-manual.html)! — rodrigo, Dec 02 '13 at 02:12
More context, please. Why does an OS project involve emulating an x86 CPU? — Seva Alekseyev, Dec 02 '13 at 04:32
@SevaAlekseyev My project involves enabling a process to backtrack to its earlier state. I am supposed to do it using mprotect(). The idea is to protect every page from write operation. So when there is a write operation, I'll catch it. Inside the handler function, I'll save the old data, extract the new data from the instruction, make the page unprotected, run the instruction again and then make the page protected again. Since inside the handler I don't know what value the operation was going to write, so all I can do is find out the IP, get the instruction operands and execute it again. — azizulhakim, Dec 02 '13 at 05:48
There might be other technique like binary instrumentation. But I've been so far with mprotect() and don't have enough time to consider new solution. — azizulhakim, Dec 02 '13 at 05:49
You should read Chapter 2 and the related appendixes of the Intel 64 and IA32 Architectures Software Developer's Manual, Volume 2. In addition (or alternatively), the AMD64 Architecture Programmer's Manual, Volume 3 has similar information. As Ira Baxter mentions, the ModR/M and SIB bytes will be important to your instruction decoding, as will the displacement and immediate bytes (which also vary in size). — Michael Burr, Dec 02 '13 at 06:11
You might find the sources to BeaEngine or the recently release Ollydbg 2 disassm engine of interest; http://www.beaengine.org/ http://www.ollydbg.de/Disasm201.zip — Necrolis, Dec 02 '13 at 07:50
Why not use the T flag and debug infrastructure instead? That's how you single-step on x86. — Seva Alekseyev, Dec 02 '13 at 14:59

score 9 · Accepted Answer · edited May 23 '17 at 12:15

9

Basically what you need is set of nested case statements, implementing a finite state machine scanner, where each level inspects some byte (typically left to right) of the opcode to determine what it does.

Your top level case statement will pretty much be 256 cases, one for each opcode byte; you'll find some of the opcodes (especially the so-called "prefix" bytes) cause the top level to loop (picking up multiple prefix bytes the precede main opcode byte). Sub cases will acquire structure according the opcode structure of the x86; you'll almost certainly end up with a MODRM and SIB addressing mode byte decoders/subroutines.

I've done this; the work is annoying because of details but not hard. You can get a pretty good solution in several hundred lines of code if you are careful. If you insist on doing the whole instruction set (vector registers and opcodes, esp. for haswell etc.) you're likely to end up with something bigger; Intel has been jamming instructions into every dark corner they can find.

You really need an opcode map; I'm pretty sure there is one in the Intel manuals. I've found this link to be pretty useful: http://www.ref.x86asm.net/coder32.html

EDIT Sept 2015: Here at SO I provide C code that implements this: https://stackoverflow.com/a/23843450/120163

edited May 23 '17 at 12:15

Community

1
1

answered Dec 02 '13 at 02:06

Ira Baxter

93,541
22
172
341

Giant switch statements are just evil and can seldom be excused for. :) – oakad Dec 02 '13 at 04:15
6

@user16653: Sometimes evil is just the right thing :) – Ira Baxter Dec 02 '13 at 05:51
Yes, actually I tried to have a look up table, but I'm having problem with TWOBYTE instructions. I don't know how they work. For example, I'm getting value 0x0000000F in instruction pointer and I really don't know what it means. It doesn't fill any category – azizulhakim Dec 02 '13 at 05:53
1

0x0F in the PC hasn't got anything to do with instruction decoding. It means the program jumped to 0x0F. Typically, an OS will make the first and last pages of the VM simply illegal, because this catches many stupid mistakes in assembly code moving small positive or negative constants into the PC. There's no way to back up a JMP into such a page; the user program has simply made a tremendous error and your OS should simply shut it down. – Ira Baxter Dec 02 '13 at 06:48
@IraBaxter Thanks for the answer. Are you certain of it? – azizulhakim Dec 02 '13 at 07:16
@IraBaxter Also to clarify, the instruction pointer is not pointing to adddress 0x0F. Rather the value of instruction pointer is 0x0F – azizulhakim Dec 02 '13 at 08:21
@azizulfahim: When you say "the value of the instruction pointer is 0x0f", I understand that to mean "EIP == 0x0000000F". Perhaps you are trying to tell me that "*(dword)EIP == 0x0000000F"? The x86 being little endian, that implies that "*(byte)EIP == 0x0F" so the opcode you are trying to look up is "0x0F". – Ira Baxter Dec 02 '13 at 11:09
2

... EIP-> 0F 00 .. The 0F is a LOCK or REPNE prefix byte. The whole thing looks sort of like "LOCK ADD modrm..." I didn't decode the modrm part; you get to do that. – Ira Baxter Dec 02 '13 at 16:02

oakad · Answer 2 · 2013-12-02T04:19:53.997

5

Additional approach is to actually build a proper parser for the assembly, using one of the many parser generator frameworks (such as an ubiquitous yacc). This may result in easier to maintain and mode readable implementation than using nested switch statements with large amount of cases.

There's also an intermediate approach, whereupon table based parser can be implemented "by hand". One example is here: https://github.com/libcpu/libcpu/blob/master/arch/x86/x86_decode.cpp

edited Dec 02 '13 at 04:19

answered Dec 02 '13 at 04:14

oakad

6,945
1
22
31

You are suggesting using a parser generator to process the instruction byte streams, with instruction bytes as tokens? That's interesting. But I'm pretty sure that the real CPUs don't use that kind of power; they're running from at best FSAs. Closer to the mark might be to use a lexer generator, which produces FSAs. – Ira Baxter Dec 02 '13 at 15:01
1

Actually the nice decoder implementation in my link seems to be the most straightforward way to do the decoder on FPGA at least (using a LUT block and a simple FSA). But when software is concerned, I'm probably spoilt by boost::spirit way to do parsers (which double as FSA's and do just fine, complexity wise, without lexing step, resulting in very straightforward and efficient software structure). Yet, judging from the fact, that boost::spirit can appear somewhat intimidating at the first glance, I decided to mention something more mundane, like yacc. – oakad Dec 03 '13 at 00:47

score 1 · Answer 3 · answered Dec 02 '13 at 04:29

1

kvm has a very sophisticated x86 emulator / decoder that may be reusable by your project.

answered Dec 02 '13 at 04:29

liuyu

1,279
11
25

In fact there are a lot of tools. But their source is too big for me right now. All I want is to understand the three points I made. From there I believe I'll be able to make a simple one. – azizulhakim Dec 02 '13 at 05:57
1

@azizulfahim, In your question, the CPU state has been reduced to a single instruction pointer. This is over-simplification of the problem in the first place. An x86 instruction decoder is, in its nature, a DFA taking binary strings as input. Its transition table is determined by the grammar. You cannot simplify the grammar and hoping the DFA still correctly handles binary strings from the full grammar. – liuyu Dec 02 '13 at 07:28

Instruction Decoding in x86 architecture

3 Answers3

Linked