I have this function:

BOOL WINAPI MyFunction(WORD a, WORD b, WORD *c, WORD *d)

When disassembling, I'm getting something like this:

PUSH EBP
MOV EBP, ESP
SUB ESP, C
...
LEAVE
RETN C

As far as I know, SUB ESP, C means that the function takes 12 bytes for all its arguments, right? Each argument is 4 bytes and there are 4 arguments, so shouldn't this function be disassembled with SUB ESP, 10?

Also, if I don't know about the C header of the function, how can I know the size of each parameter (not the size of all the parameters)?

frogatto
cdonts
  • "Also, if I don't know about the C header of the function, how can I know the size of each parameter (not the size of all the parameters)?" - What do you mean? Obviously the function signature and the declaration of each type are required. Otherwise you have a compilation error (and, if you're writing it in assembly, that rule still applies, there is just no error. You still need the documentation.) – Ed S. Jan 18 '14 at 23:11
  • I was asking how to determine the size of each parameter from disassembled code, if I don't know anything about the original C function. Sorry if I didn't explain that very well. – cdonts Jan 19 '14 at 00:18

3 Answers

No, the SUB instruction only tells you that the function needs 12 bytes for its local variables. Inferring the arguments requires looking at the code that calls this function. You'll see it setting up the stack before the CALL instruction.

In the specific case of a WINAPI function (aka __stdcall), the RET instruction does give you information, since that calling convention requires the function to clean up the stack before it returns. So a RET 0x0C tells you that the arguments take 12 bytes; that this equals the stack frame size is an accidental match. Twelve bytes usually means three arguments, but it depends on the argument types. A WORD-sized argument gets promoted to a 32-bit value, so four arguments would need 16 bytes and the signature you theorized is not a match.
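To make the arithmetic concrete, here is a minimal sketch in C, assuming 32-bit x86 where every stack argument is rounded up to a 4-byte slot. The helper names `slot_bytes` and `stdcall_ret_operand` are made up for illustration; they are not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 32-bit x86, where every stack argument occupies a multiple
   of 4 bytes, so a 2-byte WORD still takes a full 4-byte slot. */
#define STACK_SLOT_BYTES 4u

/* Hypothetical helper: stack bytes used by one argument of the given size. */
unsigned slot_bytes(size_t arg_size)
{
    return (unsigned)((arg_size + STACK_SLOT_BYTES - 1) / STACK_SLOT_BYTES
                      * STACK_SLOT_BYTES);
}

/* Hypothetical helper: the immediate a __stdcall callee would use in RET n,
   given the declared size in bytes of each argument. */
unsigned stdcall_ret_operand(const size_t *arg_sizes, size_t count)
{
    unsigned total = 0;

    assert(arg_sizes != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += slot_bytes(arg_sizes[i]);
    return total;
}
```

With sizes {2, 2, 4, 4} (two WORDs and two pointers) this yields 16, i.e. RET 10 in hex, while three 4-byte arguments yield 12, which is what the RETN C in the question corresponds to.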

Hans Passant

If the calling convention uses the stack (as it seems) to pass parameters, you can figure out how many parameters there are and what size they have.

For "how many", you can look at the operand of the RET instruction, if any (stdcall convention). This tells you how many bytes the parameters occupy. Of course, this figure alone is not of much use.

You have to read the function code and search for memory references of the form [EBP+n], where n is a positive offset from the value of EBP. Positive offsets address parameters (with a standard frame, the first one sits at [EBP+8], just above the saved EBP and the return address), and negative offsets address local variables (created with the SUB ESP,x instruction).
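As a rough sketch of that rule, the C helpers below classify an EBP-relative offset. They assume a standard PUSH EBP / MOV EBP, ESP frame on 32-bit x86 and 4-byte parameter slots (an 8-byte parameter would span two slots); all names are hypothetical.

```c
#include <assert.h>

/* Assumed frame layout after PUSH EBP / MOV EBP, ESP on 32-bit x86:
   [EBP+0] saved EBP, [EBP+4] return address, [EBP+8] first parameter,
   negative offsets are locals reserved by SUB ESP, x. */
typedef enum {
    SLOT_LOCAL,
    SLOT_SAVED_EBP,
    SLOT_RETURN_ADDRESS,
    SLOT_PARAMETER
} slot_kind;

/* Classify what a memory reference [EBP+offset] points at. */
slot_kind classify_ebp_offset(int offset)
{
    if (offset < 0) return SLOT_LOCAL;
    if (offset < 4) return SLOT_SAVED_EBP;
    if (offset < 8) return SLOT_RETURN_ADDRESS;
    return SLOT_PARAMETER;
}

/* 0-based index of the 4-byte parameter slot addressed by [EBP+offset],
   or -1 if the offset does not fall in the parameter area. */
int parameter_index(int offset)
{
    if (classify_ebp_offset(offset) != SLOT_PARAMETER) return -1;
    return (offset - 8) / 4;
}

/* Inverse mapping: EBP-relative offset of the parameter slot with this index. */
int parameter_offset(int index)
{
    assert(index >= 0);
    return 8 + 4 * index;
}
```

For example, a reference to [EBP+14] (hex) maps to slot index 3, so a function that touches it has at least four 4-byte parameter slots.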

Hopefully, you will be able to spot all the distinct parameters. If the function has been compiled with optimizations, this may be hard to figure out.

For size and type, more reverse engineering is needed. Look at the instructions that use the addressed parameters. If you find something like dword ptr [ebp+n], then that parameter is 32 bits long. word ptr [ebp+n] tells you that the parameter is 16 bits long, and byte ptr [ebp+n] means a byte-sized parameter.

For byte and word sized parameters, the most plausible options are char/unsigned char and short/unsigned short.

For double-word-sized parameters, the type may be int/unsigned int/long/unsigned long, but it may be a pointer as well. To differentiate a pointer from a plain integer, you will have to look further, to see if the dword read from the parameter is itself being used as a memory address to access memory (i.e. it's being dereferenced).
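To illustrate the difference from the C side, here is a small sketch with one function that uses its parameters by value and one that dereferences them. The `WORD` typedef is a stand-in for the Windows type, and the instruction shapes in the comments are only what a compiler typically emits without heavy optimization; actual output varies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t WORD;  /* stand-in for the Windows WORD type (16-bit unsigned) */

/* Parameters used by value: the code only reads the stack slots themselves,
   typically with something like MOVZX EAX, WORD PTR [EBP+8]. */
unsigned sum_values(WORD a, WORD b)
{
    return (unsigned)a + b;
}

/* Parameters used as pointers: the dword at [EBP+8] is typically loaded into
   a register (MOV ECX, [EBP+8]) and that register is then used as an address
   (MOVZX EAX, WORD PTR [ECX]). That second access is the dereference. */
unsigned sum_through_pointers(const WORD *c, const WORD *d)
{
    assert(c != NULL && d != NULL);
    return (unsigned)*c + *d;
}
```

Both functions read 4-byte stack slots, so the slot size alone does not separate them; only the follow-up memory access through the loaded value marks the second pair as pointers.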

To tell the signedness of a parameter, you have to search for a code fragment in which that parameter is compared against some other value and a conditional jump follows. The particular condition used in the jump will tell you whether the comparison was performed taking the sign into account or not. For example, the conditional jumps JA / JB / JAE / JBE indicate an unsigned comparison and hence an unsigned parameter. Conditional jumps such as JG / JL / JGE / JLE indicate that a signed parameter is involved in the comparison.
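The jump rule can be written down as a small lookup. This is only a sketch: the function name `jump_signedness` is invented here, it expects upper-case mnemonics, and it treats JE / JNE and anything else it does not recognize as saying nothing about signedness.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { CMP_UNKNOWN, CMP_SIGNED, CMP_UNSIGNED } cmp_kind;

/* Hypothetical helper: what a conditional jump after CMP suggests about
   the signedness of the compared operands. */
cmp_kind jump_signedness(const char *mnemonic)
{
    /* "Above"/"below" jumps test the carry flag: unsigned comparison. */
    static const char *const unsigned_jumps[] = {
        "JA", "JAE", "JB", "JBE", "JNA", "JNAE", "JNB", "JNBE"
    };
    /* "Greater"/"less" jumps test sign and overflow flags: signed comparison. */
    static const char *const signed_jumps[] = {
        "JG", "JGE", "JL", "JLE", "JNG", "JNGE", "JNL", "JNLE"
    };
    size_t i;

    assert(mnemonic != NULL);
    for (i = 0; i < sizeof unsigned_jumps / sizeof unsigned_jumps[0]; i++)
        if (strcmp(mnemonic, unsigned_jumps[i]) == 0)
            return CMP_UNSIGNED;
    for (i = 0; i < sizeof signed_jumps / sizeof signed_jumps[0]; i++)
        if (strcmp(mnemonic, signed_jumps[i]) == 0)
            return CMP_SIGNED;
    /* JE, JNE and friends work the same for signed and unsigned values. */
    return CMP_UNKNOWN;
}
```

A single comparison is weak evidence, so it helps to check several uses of the same parameter before settling on a type.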

mcleod_ideafix
  • Thanks, well explained! – cdonts Jan 19 '14 at 00:30
  • However it is not clear why the disassembly shows a "RET 0xC". If the C function header above is correct, it should be "RET 0x10" - I just tested that out! – Martin Rosenau Jan 19 '14 at 07:29
  • You should at least mention the existence of calling conventions that always use `ret (no argument)`, instead of having the callee use `ret imm16`. And of course of args in registers. In the general case, just [look at what the function uses without initializing first: those are its inputs](http://stackoverflow.com/questions/37531709/tracing-a-ncr-assembly-program-of-masm/37534836#37534836). – Peter Cordes May 31 '16 at 21:59

That depends on your ABI. In your case, it seems you're using Windows x86 (32 bit), which allows several C calling conventions. Some pass parameters in registers, others on the stack. If the parameters are passed on the stack, they will be above the frame pointer, so subtracting from the stack pointer is used to make space for local variables, not to read the function parameters.

EOF