5

I created a very simple assembly program that prints the letter 'a' in DOS. I opened it in a hex editor and the result was this:

Assembly code:

mov ah, 2 
mov dl, 'a' 
int 21h 

Hex code

B4 02 B2 61 CD 21

I want to understand how these bytes were generated. I'm not sure I'm right, but this is what I worked out:

B4 = mov ah 
02 = 2 
B2 = mov dl 
61 = 'a' 
CD = int 
21 = 21h

I understand where the 02, 61 and 21 come from, but what about B4, B2 and CD?

user3500017
  • The official docs: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html?iid=tech_vt_tech+64-32_manuals If you are only interested in instruction opcodes (the bytes each instruction is encoded into), start with this document on that site: "Manual Volume 2A: Instruction Set Reference, A-M". The site linked in an answer below (http://ref.x86asm.net/coder32.html) is a very good summary of the Intel PDFs, but read the Intel docs to learn the exact behavior of instructions. – pasztorpisti Apr 05 '14 at 20:59
  • BTW, if you are thinking about playing around with assembly, and maybe with disassembling/reverse engineering, then try the best disassembler/debugger of all time. It has a free version (5.0) that knows much less than the newer versions, but even this old free version can kick the ass of any other solution: https://www.hex-rays.com/products/ida/support/download_freeware.shtml It can come in handy for analyzing stuff. – pasztorpisti Apr 05 '14 at 21:08
  • `B4 02 B2 61 CD 21` is a mixture of opcodes, immediate values, etc., so how can we tell whether a given byte is an opcode or data? I am confused. Are there any indicators? – yeln Aug 23 '22 at 18:52

2 Answers

7

Here's a nice reference: http://ref.x86asm.net/coder32.html

As you can see:

  • CD is the opcode for int
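  • B0+reg is the opcode for mov reg, imm8, where reg is the 3-bit number of the 8-bit destination register. As you can see from the register table, ah = 100b (4), so B0+4 = B4, and dl = 010b (2), so B0+2 = B2. Note that B2 encodes mov dl, not mov dx: the 16-bit mov dx, 'a' would assemble to BA 61 00 instead.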
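
The B0+reg arithmetic can be checked with a short Python sketch. The register numbers are the standard encodings of the 8-bit registers; the function name `encode_mov_r8_imm8` is made up for illustration, and the sketch covers only this one instruction form plus the int 21h bytes from the question.

```python
# Standard 3-bit numbers of the 8-bit registers, as listed in the reference table.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def encode_mov_r8_imm8(reg, imm):
    """Encode `mov reg, imm` for an 8-bit register (hypothetical helper).

    The opcode byte is B0 plus the register number, followed by the immediate byte.
    """
    if not 0 <= imm <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    return bytes([0xB0 + REG8[reg], imm])


program = (
    encode_mov_r8_imm8("ah", 2)            # B4 02
    + encode_mov_r8_imm8("dl", ord("a"))   # B2 61
    + bytes([0xCD, 0x21])                  # int 21h: opcode CD, then the interrupt number
)

# Print the bytes the same way a hex editor shows them.
print(" ".join(f"{b:02X}" for b in program))
```

The printed bytes should match the hex dump in the question, which is a quick way to confirm that the register number is simply added to the base opcode B0.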
pNre
  • Since there are 16 general purpose registers, it would require 4 bits to encode them, wouldn't it? – Sourav Kannantha B Apr 01 '21 at 11:47
  • In the reference you provided, they give opcodes and the equivalent hex values, but I don't understand how to combine them into a valid instruction. For example, 0x817 corresponds to cmp, and 0x5 corresponds to the register ebp. With this information, how can I encode `cmp dword [ebp-4] 2`? I have been searching the internet for this for around 2 hours. Help me out! – Sourav Kannantha B Apr 01 '21 at 12:02
  • @SouravKannanthaB hi, how did you solve it? – yeln Aug 24 '22 at 03:30
2

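These are x86 assembly instructions:

  • B4: mov ah, imm8 moves the following byte into the register ah
  • B2: mov dl, imm8 moves the following byte into the register dl (the low byte of dx)
  • CD: int imm8 raises the software interrupt whose number is the following byte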
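
To see how a decoder tells opcode bytes from operand bytes, here is a minimal Python sketch that handles only the two instruction forms in this program (B0+reg and CD). The names `REG8` and `disassemble` are illustrative, not part of any real tool, and an actual x86 decoder has to deal with many more opcodes, prefixes and ModR/M bytes.

```python
# 8-bit register names indexed by their 3-bit register number.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def disassemble(code):
    """Decode a byte string containing only `mov r8, imm8` and `int imm8`."""
    listing = []
    i = 0
    while i < len(code):
        opcode = code[i]  # the first byte of each instruction is the opcode
        if 0xB0 <= opcode <= 0xB7:
            # B0+reg: the low 3 bits select the register, one immediate byte follows
            listing.append(f"mov {REG8[opcode - 0xB0]}, {code[i + 1]:02X}h")
            i += 2
        elif opcode == 0xCD:
            # CD: one byte follows, holding the interrupt number
            listing.append(f"int {code[i + 1]:02X}h")
            i += 2
        else:
            raise ValueError(f"opcode {opcode:02X} is not handled by this sketch")
    return listing


listing = disassemble(bytes.fromhex("B4 02 B2 61 CD 21"))
for line in listing:
    print(line)
```

Nothing in the bytes themselves marks them as opcode or data. Decoding starts at a known position, the opcode there says how many operand bytes follow, and the next opcode begins right after them.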

I recommend reading this x86 assembly guide: http://www.cs.virginia.edu/~evans/cs216/guides/x86.html

invictus1306