
I'm currently reverse engineering a game and I've come across an issue where I need to call GetRawInputData, which expects pcbSize as one of its arguments.

Normally in C I would just write sizeof(pData) but I have no idea how to go about this in machine code.

Peter Cordes

3 Answers

sizeof is purely a construct of the C type system and (variable-length arrays aside) is completely resolved at compile time to a plain number; there's no such thing in machine code. You'll probably just find an immediate value in a push or mov corresponding to the size of pData.

For example, in a program of ours, the sequence

RAWINPUT raw;
UINT dwSize = sizeof(raw);
GetRawInputData((HRAWINPUT)lparam, RID_INPUT, &raw, &dwSize, sizeof(RAWINPUTHEADER));

gets translated by gcc 4.8 as

0x005f351d <+125>:   lea    eax,[ebp-0x48]                   // eax = &dwSize
0x005f3520 <+128>:   mov    DWORD PTR [esp+0xc],eax          // pcbSize = eax = &dwSize
0x005f3524 <+132>:   lea    eax,[ebp-0x38]                   // eax = &raw
0x005f3527 <+135>:   mov    DWORD PTR [ebp-0x48],0x28        // dwSize = sizeof(raw) i.e. 40
0x005f352e <+142>:   mov    DWORD PTR [esp+0x10],0x10        // cbSizeHeader = sizeof(RAWINPUTHEADER) i.e. 16
0x005f3536 <+150>:   mov    DWORD PTR [esp+0x8],eax          // pdata = eax = &raw
0x005f353a <+154>:   mov    DWORD PTR [esp+0x4],0x10000003   // uiCommand = RID_INPUT
0x005f3542 <+162>:   mov    DWORD PTR [esp],ecx              // hRawInput = lparam
0x005f3545 <+165>:   call   DWORD PTR ds:0x20967fc           // call GetRawInputData
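
To see where the two immediates above (0x28 and 0x10) come from, here is a minimal sketch in C. The Mock* types are hypothetical stand-ins that mirror the 32-bit layout of the raw-input structures with fixed-width integers; they are not the real <windows.h> definitions, and the sizes assume the usual 4-byte alignment for 32-bit integers. The point is only that sizeof folds to a constant the compiler can check before the program ever runs.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical mirror of RAWINPUTHEADER as laid out in a 32-bit process. */
typedef struct {
    uint32_t dwType;
    uint32_t dwSize;
    uint32_t hDevice;   /* stand-in for a 4-byte HANDLE */
    uint32_t wParam;    /* stand-in for a 4-byte WPARAM */
} MockRawInputHeader;

/* Hypothetical mirror of RAWMOUSE: 2 bytes + 2 padding + 5 * 4 bytes. */
typedef struct {
    uint16_t usFlags;
    union {
        uint32_t ulButtons;
        struct { uint16_t usButtonFlags; uint16_t usButtonData; } s;
    } u;
    uint32_t ulRawButtons;
    int32_t  lLastX;
    int32_t  lLastY;
    uint32_t ulExtraInformation;
} MockRawMouse;

/* Hypothetical mirror of RAWKEYBOARD: 4 * 2 bytes + 2 * 4 bytes. */
typedef struct {
    uint16_t MakeCode, Flags, Reserved, VKey;
    uint32_t Message;
    uint32_t ExtraInformation;
} MockRawKeyboard;

/* Hypothetical mirror of RAWHID: 9 bytes of members, padded to 12. */
typedef struct {
    uint32_t dwSizeHid;
    uint32_t dwCount;
    uint8_t  bRawData[1];
} MockRawHid;

/* Hypothetical mirror of RAWINPUT: header plus the largest union member. */
typedef struct {
    MockRawInputHeader header;
    union {
        MockRawMouse    mouse;
        MockRawKeyboard keyboard;
        MockRawHid      hid;
    } data;
} MockRawInput;

/* Checked by the compiler: no code is generated for these. */
static_assert(sizeof(MockRawInputHeader) == 0x10, "header is 16 bytes");
static_assert(sizeof(MockRawInput) == 0x28, "whole structure is 40 bytes");

/* Each function compiles down to "return <immediate>". */
unsigned mock_header_size(void)   { return (unsigned)sizeof(MockRawInputHeader); }
unsigned mock_rawinput_size(void) { return (unsigned)sizeof(MockRawInput); }
```

In a 64-bit process the handle and WPARAM fields are 8 bytes each, so the immediates you find in a 64-bit binary will be larger.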
Matteo Italia
  • If that's true how does sizeof go about returning a size for data that might have its size change during runtime? – bcvdgfdag fewafdsaf Sep 13 '18 at 08:43
  • If it's a regular C array, its size doesn't change at runtime. If instead it's a VLA, that's an entirely different situation. – Matteo Italia Sep 13 '18 at 08:44
  • So then pdata's size will always be 13? I found that near a call to GetRawInputData but the MSDN documentation makes it sound like the value will change. Why does it even have that as an argument if it's always going to be 13? – bcvdgfdag fewafdsaf Sep 13 '18 at 08:46
  • @bcvdgfdagfewafdsaf: because they want to have leeway to change the structure size/layout in the future. On their side they can then check the size that has been passed by the program: if it's the old size, they know it's the old version of the structure and fill it in accordingly; if it's the new size, they populate it with all the new info. This is a common pattern throughout the Win32 APIs (although generally you'll find a `cbSize` member directly in the structure or something like that). BTW, I find it unlikely that you found 13 as the size: `RAWINPUT` is way bigger (and 13 is not aligned enough). – Matteo Italia Sep 13 '18 at 08:53
  • you're right, I was looking at the wrong argument in my disassembler. – bcvdgfdag fewafdsaf Sep 13 '18 at 09:01
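
The size-based versioning described in the comments above can be sketched roughly as follows. All names here (InfoV1, InfoV2, get_info) are hypothetical and only illustrate the idea; this is not how any particular Win32 function is implemented.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old" and "new" versions of a structure; V1 is a prefix of V2. */
typedef struct { uint32_t id; uint32_t flags; } InfoV1;
typedef struct { uint32_t id; uint32_t flags; uint32_t extra; } InfoV2;

static_assert(sizeof(InfoV1) < sizeof(InfoV2), "the versions must differ in size");

/* The callee uses the caller-supplied size to tell which version the caller
   was compiled against. Returns 0 on success, -1 for an unrecognised size. */
int get_info(void *out, size_t cb)
{
    const InfoV2 full = { 1u, 2u, 3u };   /* made-up data for the example */

    if (cb == sizeof(InfoV1) || cb == sizeof(InfoV2)) {
        memcpy(out, &full, cb);           /* copy only as much as the caller has room for */
        return 0;
    }
    return -1;
}
```

A caller built against the old header passes sizeof(InfoV1) and a caller built against the new one passes sizeof(InfoV2); in both binaries that argument is just an immediate.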

There is no equivalent. sizeof is a compile-time construct; it is translated to just a number in the assembly. I.e. a sizeof expression will show up as something like 48 or 1024. You have to compute the size manually, or find it in the disassembled code, if you need it.

dev_null

In assembly source, you can have the assembler calculate assemble-time constants like

msg: db "hello world", 10            ; 10 = ASCII newline
msglen equ $-msg

Then when you write mov edx, msglen, it assembles to mov edx, imm32 with the constant substituted in. See How does $ work in NASM, exactly? for some examples.

But in the final machine code, assemble-time constants have all become immediates or data constants (e.g. ptr_and_length: dq msg, msglen in the data or rodata section assembles into an address and a qword integer which is just there in the object file, not calculated at runtime from anything).


Assemble-time constants can also be used as repeat counts in macros or other directives. For example, times power imul eax, ecx assembles to a block of that many imul instructions, where power is an integer constant defined with EQU; NASM's %rep n / ... / %endrep works similarly.

They can also be used in assemble-time expressions, so the size itself isn't literally present in the object file, just the result of some calculation based on it (e.g. mov edx, msglen+2 or mov ecx, arrbytes/4, the latter maybe as a bound for a loop that counts by dwords instead of bytes).
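
For comparison, a rough C counterpart of the NASM idioms above might look like this. The names MSGLEN, ARRBYTES, arr and the helper functions are made up for the illustration; the string matches the msg example, and the array is an assumed stand-in for whatever arrbytes measures.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Same bytes as `msg: db "hello world", 10`, plus the NUL that C appends. */
static const char msg[] = "hello world\n";

/* Counterpart of `msglen equ $-msg`; the -1 drops the terminating NUL. */
enum { MSGLEN = sizeof msg - 1 };

/* An assumed 8-element dword array, so its byte size plays the role of arrbytes. */
static const uint32_t arr[8] = { 0 };
enum { ARRBYTES = sizeof arr };

/* Both values are fixed before the program runs. */
static_assert(MSGLEN == 12, "11 characters plus the newline");
static_assert(ARRBYTES / 4 == 8, "loop bound when counting by dwords");

/* Accessors; each size function compiles down to "return <immediate>". */
const char *message(void)       { return msg; }
size_t      message_len(void)   { return MSGLEN; }
size_t      arr_dword_count(void) { return ARRBYTES / 4; }

/* Sums the array, using the constant-folded dword count as the loop bound. */
uint32_t sum_arr(void)
{
    uint32_t total = 0;
    for (size_t i = 0; i < ARRBYTES / 4; i++)
        total += arr[i];
    return total;
}
```

As with the assembler version, only the results of these expressions end up in the object file.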

Peter Cordes