13

What is the size of each asm instruction? How many bytes does every instruction take? 8 bytes? Four for the opcode and four for the argument? What happens when you have one opcode and 2 arguments, as in mov, for example? Do instructions have a fixed size in memory or do they vary? Does EIP have anything to do with this, or is its value always incremented by one, totally independent of what kind of instruction it is passing by?

I ask this because, while reading http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames, I stumbled across the claim that a call instruction is equivalent to a push followed by a jmp.

call MYFUNCTION
mov my_var, eax

being the same as

push [eip + 2];
jmp MYFUNCTION;
mov my_var, eax

When we're pushing [eip + 2] on the stack, to what value are we pointing then? To the line right next to "jmp MYFUNCTION", mov my_var eax, right?

ps: MSVC++ flags an error on the first line, as it says eip is undefined. It works for eax, esp, ebp, etc. What am I doing wrong?

starblue
devoured elysium
  • This question has nothing to do with c++, so tag removed. – Kirill V. Lyadvinsky Sep 09 '09 at 14:07
  • It should be push eip + 2, not push [eip + 2]. – Daniel Brückner Sep 09 '09 at 14:32
  • It must be "push eip + 2", but this is not valid - call is theoretically equivalent to this, but you cannot do this simple transformation. – Daniel Brückner Sep 09 '09 at 15:44
  • I know it is push eip + 2, but what I'm saying is that both push eip and push eip + 2 don't work. If I got what you said, you mean we can't push eip on the stack? From what I see we can't even copy its value to eax, for example. What is the reason? – devoured elysium Sep 09 '09 at 15:47
  • You can (see http://coding.derkeiler.com/Archive/Assembler/comp.lang.asm.x86/2009-03/msg00135.html for examples) but I don't know the reason for the design decision to have no mov EAX, EIP instruction or something similar. – Daniel Brückner Sep 09 '09 at 16:20
  • When you define instructions that read IP, you then have to define exactly what the IP's value will be when every one of those instructions is executed (the first opcode byte? the first byte of the next instruction? Something in between, in a more complex instruction?) and then be prepared to live with that definition forever, for future compatibility. Bear in mind that the *actual* IP isn't well defined in this respect, due to pipelining. For this reason, wise architects eschew such capabilities in the instruction set, except for specific purposes (e.g. call, and PC-relative addressing). – greggo Dec 19 '12 at 19:50

2 Answers

18

The size of a machine instruction depends on the processor architecture. There are architectures with fixed-size instructions, but you are obviously referring to IA-32 and Intel 64, whose instruction lengths vary widely. The instruction pointer is of course always advanced by the length of the instruction just processed (unless the instruction itself redirects it, as a jump or call does).
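To make that concrete, here is a small Python sketch that walks over a few hand-assembled 32-bit encodings and shows how the instruction pointer advances. The load address 0x00401000, the call displacement, and the names `CODE`, `walk` and `ROWS` are assumptions chosen for illustration only; the encodings are the standard ones for these instructions in 32-bit mode.

```python
# Hand-assembled 32-bit x86 encodings. The load address and the call
# displacement are arbitrary assumptions picked for this illustration.
BASE = 0x00401000

CODE = [
    ("nop",                      bytes([0x90])),
    ("mov eax, ebx",             bytes([0x89, 0xD8])),
    ("mov eax, 0x12345678",      bytes([0xB8, 0x78, 0x56, 0x34, 0x12])),
    ("call rel32 (+0x10)",       bytes([0xE8, 0x10, 0x00, 0x00, 0x00])),
    ("mov dword ptr [ebp-4], 1", bytes([0xC7, 0x45, 0xFC, 0x01, 0x00, 0x00, 0x00])),
]


def walk(code, base):
    """Return (address, length, next_eip, mnemonic) for each instruction."""
    rows = []
    eip = base
    for mnemonic, encoding in code:
        # EIP moves forward by the encoded length, not by a fixed step.
        next_eip = eip + len(encoding)
        rows.append((eip, len(encoding), next_eip, mnemonic))
        eip = next_eip
    return rows


ROWS = walk(CODE, BASE)
for addr, length, next_eip, mnemonic in ROWS:
    print(f"{addr:08X}  len={length}  next EIP={next_eip:08X}  {mnemonic}")

# The return address that call pushes is the address of the instruction
# after the call, i.e. the call's own address plus its length (5 here).
RETURN_ADDRESS = ROWS[3][2]
print(f"call pushes {RETURN_ADDRESS:08X}")
```

In this layout the call sits at 0x00401008 and is 5 bytes long, so the value it pushes is 0x0040100D, the address of the mov that follows it. The offset to add therefore depends on how the call itself is encoded.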

You can download the IA-32 and Intel 64 manuals from Intel - they contain almost everything you can know about the architecture. You can find an opcode map and instruction set format in Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B: Instruction Set Reference, N-Z on pages 623 to 768.
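As a rough sketch of how a decoder can tell where each instruction ends, the Python below computes lengths for a deliberately tiny subset of 32-bit opcodes. It is not a real disassembler: prefixes, two-byte opcodes, 16-bit addressing and most of the opcode map are left out, and the function names (`modrm_length`, `instruction_length`, `split_instructions`) are made up for this example. For 0xC7 it assumes the `/0` form (mov r/m32, imm32).

```python
def modrm_length(code, offset):
    """Bytes taken by ModRM, optional SIB and displacement (32-bit addressing)."""
    modrm = code[offset]
    mod, rm = modrm >> 6, modrm & 0b111
    length = 1                           # the ModRM byte itself
    if mod != 0b11 and rm == 0b100:      # rm=100 means a SIB byte follows
        length += 1
        sib_base = code[offset + 1] & 0b111
        if mod == 0b00 and sib_base == 0b101:
            length += 4                  # SIB without base register: disp32
    if mod == 0b01:
        length += 1                      # disp8
    elif mod == 0b10 or (mod == 0b00 and rm == 0b101):
        length += 4                      # disp32
    return length


def instruction_length(code, offset=0):
    """Length of the instruction at `offset`, for a tiny opcode subset only."""
    opcode = code[offset]
    if opcode in (0x90, 0xC3) or 0x50 <= opcode <= 0x5F:
        return 1                         # nop, ret, push/pop r32
    if opcode == 0xEB:
        return 2                         # jmp rel8
    if 0xB8 <= opcode <= 0xBF or opcode in (0xE8, 0xE9):
        return 5                         # mov r32, imm32; call/jmp rel32
    if opcode in (0x89, 0x8B):
        return 1 + modrm_length(code, offset + 1)      # mov with ModRM
    if opcode == 0xC7:
        return 1 + modrm_length(code, offset + 1) + 4  # mov r/m32, imm32
    raise ValueError(f"opcode {opcode:#04x} is outside this sketch's subset")


def split_instructions(code):
    """Cut a byte string into instructions by decoding one length at a time."""
    offset, pieces = 0, []
    while offset < len(code):
        n = instruction_length(code, offset)
        pieces.append(code[offset:offset + n])
        offset += n
    return pieces


# nop; mov eax, ebx; mov eax, 0x12345678; mov dword ptr [ebp-4], 1; ret
STREAM = bytes.fromhex("9089D8B878563412C745FC01000000C3")
for piece in split_instructions(STREAM):
    print(len(piece), piece.hex())
```

Each step only needs the bytes already read to decide how many more belong to the same instruction, which is why no separator byte is needed between instructions.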

Daniel Brückner
  • If each instruction has a different length, how can we, or even the computer, know in memory what is an instruction and what is an argument? If every instruction had the same size, the computer would know that, let's say, each 4 bytes corresponded to a new instruction. But if their size varies, that is not true. At the end (or beginning!) of each instruction is there any special byte or something? Thanks – devoured elysium Sep 09 '09 at 14:08
  • Absolutely. The first byte includes enough information to determine if a second byte must be read, and the second byte allows determining if the instruction has a third byte, and so on. Just have a look at the instruction code tables. – Daniel Brückner Sep 09 '09 at 14:10
  • @devoured elysium, there are *special* bytes (opcodes) that indicate extended instructions. – Nick Dandoulakis Sep 09 '09 at 14:10
  • Incidentally IA-64 is actually Itanium; you may be thinking of the x86-64/x64/amd64/em64t extensions to IA-32. IA-64 is a fixed instruction width architecture (and a honking big one at 128 bits per group-of-instructions). – bobince Sep 09 '09 at 14:35
  • Thanks, I always write IA-64 instead of Intel 64. Fixed it. – Daniel Brückner Sep 09 '09 at 14:46
2

Machine code size depends on the processor's architecture.

For example, on IA-32 the instruction size varies from 1 byte up to the architectural maximum of 15 bytes.

Nick Dandoulakis