2

The first 2 bytes of a DOS executable are 0x4d and 0x5a. If these are executed, 0x4d implies 'dec ebp' and 0x5a is 'pop edx'.

'dec ebp' decrements the base pointer by 1 and 'pop edx' increments the value of esp by 4 (x86 assembly). Won't these operations leave the stack in an inconsistent state? And since the command-line arguments (if any) are stored relative to ebp, won't they also make the command-line arguments inaccessible?

I may be missing something obvious, if so please humour me...

Tarun R
  • 33
  • 2
  • 1
    The DOS program loader parses the header (doesn't execute it) to determine what type of program it is, the size of the program on disk, the amount of memory needed, the relocation table, the entry point, etc. Once the program is loaded and set up in memory, control is transferred to the entry point in the code. – Michael Petch Dec 16 '19 at 12:25
  • 1
    This probably made [Mark Zbikowski](https://en.wikipedia.org/wiki/Mark_Zbikowski) giggle a little :) – Margaret Bloom Dec 16 '19 at 12:49

1 Answer

7

Unlike COM-type executables - where execution starts at the first byte of the program image - EXE-type executables are not supposed to start with executable code. Instead, an EXE file begins with a header block, which contains the address of the actual program entry point, among other things.

Hence the bytes 'MZ' (or - supposedly equally valid - 'ZM') do not represent instructions. They are simply markers for identifying the format.
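To make this concrete, here is a minimal Python sketch of what "parsing the header instead of executing it" might look like. The field order follows the commonly documented 28-byte MZ header layout; the function name `parse_mz_header` and the returned dictionary keys are made up for illustration, and the entry-point calculation assumes the initial CS value does not wrap around.

```python
import struct

# 2-byte signature followed by thirteen little-endian 16-bit fields (28 bytes).
MZ_HEADER = struct.Struct("<2s13H")


def parse_mz_header(data: bytes) -> dict:
    """Read the fixed part of an MZ header (hypothetical helper)."""
    if len(data) < MZ_HEADER.size:
        raise ValueError("too short to hold an MZ header")
    (magic, cblp, cp, crlc, cparhdr, minalloc, maxalloc,
     ss, sp, csum, ip, cs, lfarlc, ovno) = MZ_HEADER.unpack_from(data)
    # The signature is only a format marker; it is never run as code.
    if magic not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    header_size = cparhdr * 16  # header length is stored in 16-byte paragraphs
    return {
        "magic": magic,
        "header_size": header_size,
        "relocations": crlc,
        "initial_cs": cs,
        "initial_ip": ip,
        # Assumption: CS:IP is relative to the load image and does not wrap.
        "entry_file_offset": header_size + cs * 16 + ip,
    }


# Synthetic header for demonstration: 2 paragraphs of header, entry at 0000:0010.
sample = MZ_HEADER.pack(b"MZ", 0, 1, 0, 2, 0, 0xFFFF, 0, 0x100, 0, 0x10, 0, 0x1C, 0)
info = parse_mz_header(sample)
```

For the synthetic header above, `info["header_size"]` is 32 and `info["entry_file_offset"]` is 48, i.e. execution would begin well past the 'MZ' bytes.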

There is a good overview in the Wikipedia article "DOS MZ executable".

Note: the DOS parts of executables are implicitly 16-bit real mode and should be disassembled as such, not as 32-bit code.
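As a rough illustration of why the disassembly mode matters, the toy decoder below handles only the one-byte `inc`/`dec`/`push`/`pop` register opcodes (0x40 to 0x5F) plus the 0x66 operand-size prefix. It is a sketch, not a real disassembler; the name `decode` and its `bits` parameter are invented for this example.

```python
REGS16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
OPS = {0x40: "inc", 0x48: "dec", 0x50: "push", 0x58: "pop"}


def decode(code: bytes, bits: int) -> list:
    """Decode a run of 0x40..0x5F opcodes in 16- or 32-bit mode (toy example)."""
    if bits not in (16, 32):
        raise ValueError("bits must be 16 or 32")
    out = []
    i = 0
    while i < len(code):
        size = bits
        byte = code[i]
        if byte == 0x66:  # operand-size prefix toggles 16 <-> 32 for one instruction
            size = 48 - bits
            i += 1
            if i >= len(code):
                raise ValueError("dangling operand-size prefix")
            byte = code[i]
        if not 0x40 <= byte <= 0x5F:
            raise ValueError(f"opcode {byte:#04x} not handled by this sketch")
        mnemonic = OPS[byte & 0xF8]  # upper five bits select the operation
        reg = REGS16[byte & 0x07]    # lower three bits select the register
        if size == 32:
            reg = "e" + reg
        out.append(f"{mnemonic} {reg}")
        i += 1
    return out


as_16bit = decode(b"MZ", 16)  # ['dec bp', 'pop dx']
as_32bit = decode(b"MZ", 32)  # ['dec ebp', 'pop edx']
```

So the 'dec ebp' / 'pop edx' reading from the question is what a 32-bit disassembler shows; in 16-bit real mode the same two bytes would read as 'dec bp' / 'pop dx' (if they were ever executed, which they are not).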

DarthGizka
  • 4,347
  • 1
  • 24
  • 36
  • A DOS program can have a mixture of 16- and 32-bit code; it just can't use 32-bit relocations. – Michael Petch Dec 16 '19 at 12:53
  • @Michael: address and operand size prefixes do not change the nature of the code (or the mode to which the disassembler needs to be set). If you perform a mode switch - which includes fiddling with the segment settings via `LOADALL` - then whatever comes after could be anything you want, and it wouldn't strictly speaking be a DOS part anymore. The code at external DOS entry points must in any case play fully by the 16-bit rules. – DarthGizka Dec 16 '19 at 13:25
  • see explanation in [Operand size prefix in 16-bit mode](https://stackoverflow.com/a/14660027/4156577) – DarthGizka Dec 16 '19 at 13:31
  • The initial code at entry is 16-bit, but nothing prevents the code from entering protected mode and using 32-bit code, so an EXE can be a mixture of 16-, 32- or even 64-bit code. LOADALL hasn't been available for ring 0, 1, 2, 3 usage since the 486 (it could only be used in SMM mode on a 486 and was deprecated after the 486). Getting into protected mode requires setting up a GDT with a 32-bit code segment, enabling the PM flag in CR0, and doing a FAR JMP (or equivalent) with a 32-bit CS selector (and setting up the other segments appropriately). – Michael Petch Dec 16 '19 at 13:36
  • This code (a Stack Overflow answer I wrote some time back): https://stackoverflow.com/a/54779187/3857942 enters 32-bit protected mode and contains a mixture of 16-bit and 32-bit code. If you were to disassemble that EXE entirely as 16-bit, you'd find that the disassembly doesn't look correct for the parts assembled as 32-bit instructions. – Michael Petch Dec 16 '19 at 13:40