Say, I've defined a struct already in Assembly. How can I do this?

    struct some_struct_t *s1 = (struct some_struct_t *)some_buffer;

where

char some_buffer[1024];
malloc = malloc(1024);
memset(some_buffer, 0, 1024);

I can think of lea or mov, but how exactly?

Jojika
  • 4
    asm doesn't care about types, it's your job to use the proper instructions appropriate for your type. In case of structs that also includes using the correct offsets to the members. – Jester Feb 04 '17 at 14:14
  • @Jester, can you show me an example? – Jojika Feb 04 '17 at 14:48
  • 1
    An example for accessing a struct member? `mov eax, [edx+8]` if you have your pointer already in `edx` and you want to load a member of size 4 starting at offset 8. – Jester Feb 04 '17 at 14:52
  • @Jester, no, the example I'll use to get an answer for my question – Jojika Feb 04 '17 at 15:16
  • 1
    He just did. Why don't you take some C code with a structure, compile it to assembly and have a look at it. – David Hoelzer Feb 04 '17 at 16:17
  • @DavidHoelzer, no, he didn't. – Jojika Feb 04 '17 at 17:44
  • 4
    Your failure to understand doesn't mean that you weren't provided with an accurate answer. Assembly doesn't type data, and accessing structure elements, which is an artificial high-level construct, is accomplished by offsetting from the start of the construct by the size of the intervening elements. – David Hoelzer Feb 04 '17 at 18:23

1 Answer

3

Consider this C code (with implementation-defined behaviour)

void foo()
{
    struct struct1* s1 = (struct struct1*)0x01234567;  /* exp1 */
    struct struct2* s2 = (struct struct2*)0x01234567;  /* exp2 */
    struct struct3* s3 = (struct struct3*)0x01234567;  /* exp3 */
    float* f = (float*)0x01234567;                     /* exp4 */
    int* i = (int*)0x01234567;                         /* exp5 */
    char* c = (char*)0x01234567;                       /* exp6 */
}

Assuming s1 is at [rsp-08h] for the sake of clarity, then exp1 is assembled as

mov QWORD [rsp-08h], 1234567h

Assuming s2 is at [rsp-10h] for the sake of clarity, then exp2 is assembled as

mov QWORD [rsp-10h], 1234567h

Assuming s3 is at [rsp-18h] for the sake of clarity, then exp3 is assembled as

mov QWORD [rsp-18h], 1234567h

Assuming f is at [rsp-20h] for the sake of clarity, then exp4 is assembled as

mov QWORD [rsp-20h], 1234567h

Assuming i is at [rsp-28h] for the sake of clarity, then exp5 is assembled as

... haven't you got it yet?


There is no such thing as a type in assembly and consequently no such thing as a cast.
There is only data in assembly. That is why we invented typed high-level languages: not for the ifs or the fors, but for the type checking.

If you want to do struct some_struct_t *s1 = (struct some_struct_t *)some_buffer; then that translates as s1 = some_buffer.
That's just an assignment of values.
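
To make the "just an assignment" point concrete, here is a minimal C sketch. The member layout of some_struct_t is hypothetical (the question never shows it), and cast_keeps_address and load_length_by_offset are names made up for this illustration. It checks that the cast leaves the address unchanged and reads a member by adding its offset to the base address, which is what an instruction like `mov eax, [edx+8]` does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the question never shows the members of some_struct_t. */
struct some_struct_t {
    uint32_t id;      /* offset 0 */
    uint32_t flags;   /* offset 4 */
    uint32_t length;  /* offset 8 */
};

/* Returns 1 when the pointer produced by the cast holds exactly the buffer's address. */
int cast_keeps_address(void)
{
    /* Aligned for the struct so that the pointer conversion itself is well defined. */
    _Alignas(struct some_struct_t) unsigned char some_buffer[1024];
    memset(some_buffer, 0, sizeof some_buffer);

    struct some_struct_t *s1 = (struct some_struct_t *)some_buffer;
    return (void *)s1 == (void *)some_buffer;
}

/* Reads the 4-byte member at offset 8 from base, like `mov eax, [edx+8]`. */
uint32_t load_length_by_offset(const void *base)
{
    uint32_t value;
    const unsigned char *bytes = (const unsigned char *)base;

    assert(offsetof(struct some_struct_t, length) == 8);
    memcpy(&value, bytes + offsetof(struct some_struct_t, length), sizeof value);
    return value;
}
```

The type only tells the compiler which offset and access size to emit; at run time there is nothing but a base address plus a constant.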


Now, since some_buffer is an array with automatic storage, which on x86 translates as "it is on the stack", you may wonder what exactly the semantics of struct some_struct_t *s1 = (struct some_struct_t *)some_buffer; are, besides the (artificial) cast that, as you now know, only lives for the duration of the compilation process.

You surely know that some_buffer decays into a pointer to its first element, so the only hard part of translating that statement is figuring out the address of the first element.

Well, I can't tell you very much about this because I don't know where you placed the first item, but in general some_buffer is on the stack, so this, once adjusted, will do

;Compute the address of the first element
lea rax, [rsp+...]         ;or [rbp-...] if a frame pointer is available

;Store it in a local var
mov QWORD [rsp+...], rax   ;As above, also you can use any other scratch reg

Here the first ellipsis stands for the offset, relative to rsp, of the first item of the array.
The second one stands for the offset of the s1 pointer.

For a live example see here.
Note that in that example GCC is being overly zealous in using the red zone, but we can forgive it, as I had to disable all optimisation in order to get a sensible disassembly.


If you wonder how you can do the same for malloc then, if you still don't want to use godbolt.org, the solution is here

mov edi, 1024
call malloc
mov QWORD [rsp+...], rax
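
For comparison, this is roughly the C that such a malloc sequence corresponds to. It is only a sketch: the struct layout and the function name make_from_heap are hypothetical, not taken from the question. Note that the value returned by malloc is simply stored into s1, with no conversion code at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the question never shows the members of some_struct_t. */
struct some_struct_t {
    unsigned int id;
    unsigned int flags;
    unsigned int length;
};

/* Allocates 1024 zeroed bytes and views them as a some_struct_t.
   Returns NULL if the allocation fails; the caller must free() the result. */
struct some_struct_t *make_from_heap(void)
{
    assert(sizeof(struct some_struct_t) <= 1024);

    void *some_buffer = malloc(1024);      /* mov edi, 1024 / call malloc */
    if (some_buffer == NULL) {
        return NULL;
    }
    memset(some_buffer, 0, 1024);

    struct some_struct_t *s1 = some_buffer; /* mov QWORD [rsp+...], rax */
    return s1;
}
```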
Margaret Bloom
  • thanks. 1) is it qword in `mov QWORD [rsp-08h], 1234567h` and others because it's for an x64 system? for x32 it'll be `mov DWORD [rsp-08h], 1234567h`? 2) shouldn't `mov QWORD [rsp-08h], 1234567h` be without qword or anything else in nasm? – Jojika Feb 05 '17 at 01:02
  • You can't use `rsp` in x86 since it's a 64-bit register. You use `esp`, which is a 32-bit register. – David Hoelzer Feb 05 '17 at 01:36
  • Just an added note: You can disable the red-zone behavior. – David Hoelzer Feb 05 '17 at 01:37
  • @Jojika You can omit `QWORD` in this case. I used it just to make memory access more visible, it's just style. – Margaret Bloom Feb 05 '17 at 06:56