Trying to convert C to Assembly

Question

I have a struct in C:

struct struct1 {
    uint16_t a;
    uint32_t b;
    char * c;
    struct struct2* d;
};

How can I define the same struct in Nasm? I've tried this:

struc struct1
  .a resw
  .b resdw
  .c ???? ; what should be here?
  .d ???? ; what should be here?   
endstruc

How can I do that?

`.c `, `.d ` (which will depend on the architecture). — David C. Rankin, Jan 08 '17 at 08:33

Ped7g · Accepted Answer · 2017-01-08T09:52:00.840

Is this an exam question, or a real-world C struct?

In the real world the compiler may pad the struct to align its members, so the offset of .b may then be 4 or 8 (or more, depending on the compile-time padding setup), not 2.

When doing C<->asm for real, the first step is to use a "padding/packing" pragma or compile-time switch, so that the struct always compiles to the same layout in the C binaries.
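As a minimal sketch of the pragma route, assuming a compiler that honours `#pragma pack` (GCC, Clang and MSVC do) and C11 for `static_assert`: the name `struct1_packed` is made up for this illustration, and the checks are written relative to the pointer size so they hold on both 32b and 64b targets.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct struct2;       /* opaque here; only a pointer to it is stored */

/* Force 1-byte packing for this one definition, then restore the default. */
#pragma pack(push, 1)
struct struct1_packed {   /* hypothetical name, same members as struct1 */
    uint16_t a;
    uint32_t b;
    char *c;
    struct struct2 *d;
};
#pragma pack(pop)

/* With no padding, each member starts where the previous one ends. */
static_assert(offsetof(struct struct1_packed, a) == 0, "a at offset 0");
static_assert(offsetof(struct struct1_packed, b) == 2, "b directly after a");
static_assert(offsetof(struct struct1_packed, c) == 6, "c directly after b");
static_assert(offsetof(struct struct1_packed, d) == 6 + sizeof(char *),
              "d directly after c");
static_assert(sizeof(struct struct1_packed)
                  == 6 + sizeof(char *) + sizeof(struct struct2 *),
              "no tail padding");
```

If any of these assertions fails, the build stops, which is the point: the assembly side can then rely on fixed offsets.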

Then probably pad/align by hand. For example, I would put "a" last and "c" and "d" at the beginning, so the order in memory would be "c, d, b, a". I would find that "enough" aligned even for a 64b target in "packed" mode: the resulting offsets would be [0, 8, 16, 20] and the size would be 22 bytes. (Edit: I would add another word at the end just to pad it to 24B, if I knew I would use many of them in an array.)
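The reordering can be sketched in C like this. The name `struct1_reordered` and the `pad` member are assumptions made for the illustration, and the assertions are expressed through pointer sizes, so on a 64b target they correspond to offsets 0, 8, 16, 20 and a 24B size.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct struct2;       /* opaque here; only a pointer to it is stored */

/* Largest members first: every member lands on its natural alignment,
   so the compiler has no reason to insert padding of its own. */
struct struct1_reordered {   /* hypothetical name */
    char *c;
    struct struct2 *d;
    uint32_t b;
    uint16_t a;
    uint16_t pad;            /* explicit tail padding, written by hand */
};

static_assert(offsetof(struct struct1_reordered, d) == sizeof(char *),
              "d directly after c");
static_assert(offsetof(struct struct1_reordered, b)
                  == sizeof(char *) + sizeof(struct struct2 *),
              "b directly after d");
static_assert(offsetof(struct struct1_reordered, a)
                  == offsetof(struct struct1_reordered, b) + sizeof(uint32_t),
              "a directly after b");
static_assert(sizeof(struct struct1_reordered)
                  == offsetof(struct struct1_reordered, pad) + sizeof(uint16_t),
              "hand-written pad is the last thing in the struct");
```

Because the padding is spelled out as a member, the layout should come out the same with or without a packing switch.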

Finally, what are c and d in memory? Pointers. From the use of the word "nasm" I sense an x86 target platform, and from "uint32_t" I sense it will not be 16b real mode, so they are either 32 or 64 bits (depending on your target platform). 32 bits is 4 bytes, 64 bits is 8 bytes.

BTW, you can always write a short C source exercising access to the struct, and check the output of the compiler.

For example I put this into http://godbolt.org/:

#include <cstdint>

struct struct1 {
    uint16_t a;
    uint32_t b;
    char * c;
    void * d;
};

std::size_t testFunction(struct1 *in) {
    std::size_t r = in->a;
    r += in->b;
    r += uintptr_t(in->c);
    r += uintptr_t(in->d);
    return r;
}

And got this out (clang 3.9.0 -O3 -m32 -std=c++11):

testFunction(struct1*):              # @testFunction(struct1*)
        mov     ecx, dword ptr [esp + 4]   ; ecx = "in" pointer
        movzx   eax, word ptr [ecx]        ; +0 for "a"
        add     eax, dword ptr [ecx + 4]   ; +4 for "b"
        add     eax, dword ptr [ecx + 8]   ; +8 for "c"
        add     eax, dword ptr [ecx + 12]  ; +12 for "d"
        ret     ; size of struct is 16B

And with 64b target:

testFunction(struct1*):              # @testFunction(struct1*)
        mov     rax, qword ptr [rdi]
        movzx   ecx, ax
        shr     rax, 32
        add     rax, rcx
        add     rax, qword ptr [rdi + 8]
        add     rax, qword ptr [rdi + 16]
        ret

The offsets are now 0, 4, 8 and 16, and size is 24B.

And 64b target with added "-fpack-struct=1":

testFunction(struct1*):              # @testFunction(struct1*)
        movzx   ecx, word ptr [rdi]
        mov     eax, dword ptr [rdi + 2]
        add     rax, rcx
        add     rax, qword ptr [rdi + 6]
        add     rax, qword ptr [rdi + 14]
        ret

Offsets are 0, 2, 6 and 14 and size is 22B (and performance will be hurt by unaligned access to members b, c and d).

So for example for the 0, 4, 8, 16 case (64b aligned) your NASM struct should be:

struc struct1
  .a resd 1
  .b resd 1
  .c resq 1
  .d resq 1
endstruc
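To keep the C side and this NASM definition from drifting apart, the offsets can be checked from C. This is only a sketch: the `NASM_*` constants and the function `layout_matches_nasm` are hypothetical names, and the constants are copied by hand from the 64b struc above (0, 4, 8, 16, size 24).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct struct2;       /* opaque here; only a pointer to it is stored */

struct struct1 {
    uint16_t a;
    uint32_t b;
    char *c;
    struct struct2 *d;
};

/* Hand-copied from the 64b NASM struc: .a, .b, .c, .d and total size. */
enum { NASM_A = 0, NASM_B = 4, NASM_C = 8, NASM_D = 16, NASM_SIZE = 24 };

/* The first member of a struct is always at offset 0 in C. */
static_assert(offsetof(struct struct1, a) == NASM_A, "a must be at offset 0");

/* Returns 1 when the compiled C layout agrees with the NASM offsets, else 0. */
int layout_matches_nasm(void) {
    return offsetof(struct struct1, b) == NASM_B
        && offsetof(struct struct1, c) == NASM_C
        && offsetof(struct struct1, d) == NASM_D
        && sizeof(struct struct1) == NASM_SIZE;
}
```

On a default (unpacked) x86-64 build this should return 1. With a 32b target or a packing switch it returns 0, which signals that the struc definition needs different offsets.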

From your further comments... I think you maybe sort of miss what is "struc" in assembly. It's a gimmick, it's just other way of specifying address offsets. The example above can also be written as:

struc struct1
  .a resw 1
     resw 1   ; padding to make "b" start at offset 4
  .b resd 1
  .c resq 1
  .d resq 1
endstruc

Now you have your "resw" for "a" too. It doesn't matter for the ASM: for the code, only the values of the symbols .a and .b are important, and those values are 0 and 4 in both examples. How you reserve the space inside the struc definition doesn't affect the result, as long as you specify the correct number of bytes for each "variable" plus its padding.
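The same idea can be shown from the C side: a member access is just "base address + constant offset". A minimal sketch, where `load_b_by_offset` is a made-up helper and `memcpy` is used so the read is valid C whatever the alignment:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>   /* memcpy */

struct struct2;       /* opaque here; only a pointer to it is stored */

struct struct1 {
    uint16_t a;
    uint32_t b;
    char *c;
    struct struct2 *d;
};

/* b can never overlap a, whatever padding the compiler chooses. */
static_assert(offsetof(struct struct1, b) >= sizeof(uint16_t),
              "b starts after a");

/* Read member b the way assembly does: take the base address,
   add the constant offset of b, and load 4 bytes from there. */
uint32_t load_b_by_offset(const struct struct1 *in) {
    const unsigned char *base = (const unsigned char *)in;
    uint32_t value;
    memcpy(&value, base + offsetof(struct struct1, b), sizeof value);
    return value;
}
```

`offsetof(struct struct1, b)` plays the role of the NASM symbol `struct1.b`: both are plain numbers added to a base pointer.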

@ako25 first you must know what the struct **is**. As you can see, it can change a lot depending on how you compile it (target + packing). Then simply reserve enough bytes. I also added a "nasm" example, but that looks to me like the trivial part; check http://www.nasm.us/doc/nasmdoc3.html for the various `resX` directives, and actually you can do it with `resb` only, like `resb 8` for pointers. — Ped7g, Jan 08 '17 at 09:16
@ako25 Because why not? C++ compilers do this to improve performance, and they are not required to lay the members out back to back with no padding (unless you enforce it during compilation). http://stackoverflow.com/a/5398498/4271923 — Ped7g, Jan 08 '17 at 09:18
@ako25: In other words, your original question does not give enough information about how `struct1` is being compiled. In my answer you have three different binary outputs from the compiler, depending on target platform and packing setting. If you improve your question to show how it is compiled from C, then we can show you how to define it in NASM. Until then, this generic answer is the best I can do for you. — Ped7g, Jan 08 '17 at 09:21
*"I think you maybe sort of miss what is "struc" in assembly. It's a gimmick, it's just other way of specifying address offsets."* Yes, this is the real answer. — Cody Gray - on strike, Jan 08 '17 at 09:34
@ako25 oh, you are the one who deemed my comment about large asm projects as "incorrect", cute. :) Anyway, I think I'm finished with this answer for the moment (I did add struct sizes into the comments). If it's still not clear to you, try to ask what is puzzling you, but please try to be descriptive enough; your asm skills look to be not on par with mine, and thus it is difficult for me to guess what ASM jargon I can use to not confuse you too much and still explain the concept clearly enough. Then again, from a certain point it would probably be appropriate for you to head into some ASM tutorial first. — Ped7g, Jan 08 '17 at 09:49
