using offsetof in assembly / C

Question

I am trying to use offsetof in my assembly code

#define     offsetof(TYPE, MEMBER)   ((size_t) &((TYPE *)0)->MEMBER)

#define     DEFINE(sym, val)   asm volatile("\n->" #sym " %0 " #val : : "i" (val))

and say a structure is

struct mystruct {
  int a;
  int b;
  int c;
};

In my assembly code I have to simply do SUB sp, sp, #-

How do I declare the macro?

Note that `#include <stddef.h>` already provides `offsetof` in ISO C. (https://en.cppreference.com/w/c/types/offsetof) — Peter Cordes, Apr 30 '21 at 03:54
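As a minimal sketch of what that comment suggests, the standard header can replace the hand-written macro. The struct is the one from the question; the helper name offset_of_c is hypothetical and only there for illustration. Because offsetof yields an integer constant expression, it can also be checked at compile time:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

struct mystruct {
    int a;
    int b;
    int c;
};

/* offsetof is a constant expression, so the compiler can verify it. */
static_assert(offsetof(struct mystruct, a) == 0,
              "the first member sits at offset 0");

/* Hypothetical helper: byte offset of member c within struct mystruct. */
size_t offset_of_c(void)
{
    return offsetof(struct mystruct, c);
}
```

The exact value returned depends on the target ABI (the size and alignment of int), which is why it has to come from the compiler rather than be typed in by hand.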

jorgbrown · Answer 1 · 2021-04-30T04:42:14.687

You haven't said which processor you're generating code for... I assume it's a RISC machine because your subtract instruction has 3 operands, but you haven't said which one. I'm going to show you what this looks like for x86 because I know the answer is correct.

I also assume by your use of asm volatile that you're using a compiler that follows gcc's standard.

Anyway, let's say you had a struct like this:

struct mystruct {
  unsigned char a;
  const char *b;
  int c;
};

int a_or_c(mystruct *str) {
  int a_val = str->a;
  return a_val & 1 ? str->c : (a_val >> 1);
}

The compiler doesn't generate particularly good code for that - it's seven instructions and we can do much better because we know that one instruction can do both the test for "& 1" and the shift right.

To specify a register, you use r for "register", as I'm sure you know. To specify an in/out register you use +r. To specify a constant like a struct offset, you use i for "immediate". Then the assembler syntax would be, for example:

int a_or_c_asm(mystruct *str) {
  int a_val = str->a;
  asm("shr $1,%0\n\t"
      "cmovcl %c2(%1), %0"
      : "+r"(a_val)
      : "r"(str), "i"(offsetof(mystruct, c))
      : "memory"      // tell the compiler that our code reads from memory
     );

  return a_val;
}

The trick here is that you have to write %c2 rather than merely %2, so that the compiler substitutes the bare constant (16 here) rather than the immediate form ($16): in x86 AT&T syntax, a displacement inside an addressing mode is written without the $ prefix that immediate operands carry. A subtract instruction in x86 would look like this, for example:

  asm("subq %0, %%rsp\n\t"
      "... other instructions ..."
      : // no output operands
      : "i"(offsetof(mystruct, c)));
  // expands to   subq $16, %rsp   for x86-64

By your comment I assume you need ARM32 syntax. For that, your subtract instruction would look like this:

  asm("sub sp, %0\n\t"
      "... other instructions ..."
      : // no output operands
      : "i"(offsetof(mystruct, c)));
  // expands to  sub sp, #8    for ARM

(Obligatory Godbolt: compiles and assembles correctly for ARM.)
(Obligatory Godbolt: compiles and assembles correctly for x86.)

Please note that gcc assumes the stack pointer is the same at the end of any assembly block as it is at the start; the rest of your assembly block will have to include an instruction to restore sp to its original value.

For ARM32, my example would look like this - note the different syntax for using a register-plus-offset addressing mode in ARM32:

int a_or_c_asm(mystruct *str) {
  int a_val = str->a;
  asm("lsrs %0, #1\t\n"       // shift and set flags
      "ldrcs %0, [%1, %2]"    // load (predicated on Carry Set)
  : "+r"(a_val)
  : "r"(str), "i"(offsetof(mystruct, c))
  : "memory"  // we access memory that isn't a declared input.
  );
  return a_val;
}

Of course, this is all contrived to show how to pass a known constant value to gcc's inline asm syntax. A more common alternative would be to have a "m"(str->c) input operand that you use as ldrcs %0, %1, so that you don't have to use the offsetof macro at all.

Also, rather than using "memory", you could pass a dummy input operand to tell the compiler that the c field is an input, but you still actually form the addressing mode yourself; for more on this see How can I indicate that the memory *pointed* to by an inline ASM argument may be used?
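Here is a minimal sketch of the "m"-operand alternative mentioned above, written for x86-64 and assuming GCC or Clang with the default AT&T syntax. The function name a_or_c_m and the plain-C fallback for other targets are additions for illustration, not part of the original examples. Since str->c is declared as a memory input, the compiler knows the asm reads it, and neither offsetof nor a "memory" clobber is needed:

```c
#include <stddef.h>

struct mystruct {
    unsigned char a;
    const char *b;
    int c;
};

/* Hypothetical variant of a_or_c_asm that lets the compiler build the
 * addressing mode for str->c instead of passing an offset by hand. */
int a_or_c_m(const struct mystruct *str)
{
    int a_val = str->a;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__("shrl $1, %0\n\t"   /* a_val >>= 1; carry flag = old bit 0 */
            "cmovcl %1, %0"     /* if carry set, a_val = str->c */
            : "+r"(a_val)
            : "m"(str->c)       /* memory input: the compiler emits the address */
            : "cc");
#else
    /* Portable fallback with the same result on other targets. */
    a_val = (a_val & 1) ? str->c : (a_val >> 1);
#endif
    return a_val;
}
```

The trade-off is that you no longer control the exact addressing mode, which is usually what you want unless the offset itself is needed elsewhere in the asm.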

Your asm statements are missing anything to [tell the compiler that the *pointed-to memory* is also an input](https://stackoverflow.com/q/56432259). The optimizer can assume that a store to `str->c` is independent of your asm statement if you omit a `"memory"` clobber or a dummy input like `"m"(str->c)`. (At which point you might as well just use the `"m"` input instead of re-inventing your own addressing mode for it, unless you have some other use for the offsets and/or want a less-specific dummy input of the whole struct, not just that member.) — Peter Cordes, Apr 30 '21 at 03:24
Anyway, please make your examples safe, even if that makes the `offsetof()` redundant in this case. GNU C inline asm is hard enough for people to learn without unsafe examples floating around. I'll add "memory" clobbers to all the ones that need it, feel free to edit to use dummy inputs. — Peter Cordes, Apr 30 '21 at 03:26
Note that `int a_or_c(mystruct *str)` is only valid in C++, unless you use `typedef struct { ...} mystruct;`. This is a C question, there's a reason I added `struct` to the typename everywhere. (Including in [offsetof](https://en.cppreference.com/w/c/types/offsetof)). Otherwise good edits. (I didn't realize Godbolt had added a binary mode for non-x86 compilers; that's nice so you actually can verify that it assembles as well as compiles.) — Peter Cordes, Apr 30 '21 at 07:06

score 0 · Answer 2 · answered Oct 08 '13 at 10:23

I don't think this can work.

The offsetof operator is a compile-time thing, it's not evaluated by the preprocessor. That would be almost magical, since the preprocessor doesn't parse C, how could it compute structure offsets? Doing that requires a lot of machine-specific information, and is thus heavily into the compiler's area of responsibility. The preprocessor just massages text.

While typical documentation calls offsetof a macro, that doesn't mean it's evaluated by the preprocessor. It could just mean that it's a macro that evaluates into some compiler-specific magic.

For instance for gcc it can be defined like so:

#define offsetof(type, member)  __builtin_offsetof (type, member)

Here, __builtin_offsetof() is the magical compiler-specific function that really does the computation. Leaving a call to it where your assembler source needs a literal offset is of course not a solution.
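One workaround that follows from this is to let the compiler compute the offsets in a separate build step and hand the assembler plain numbers. The sketch below is only one way to do that: the function name format_offsets and the MYSTRUCT_* macro names are hypothetical, and running it as a host program only gives correct values when it is built for the same ABI as the assembly code (so it is not suitable as-is for cross-compilation).

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>    /* snprintf */
#include <string.h>

struct mystruct {
    int a;
    int b;
    int c;
};

/* Hypothetical generator: format "#define NAME value" lines that a
 * preprocessed assembly file could include. Returns the snprintf result
 * (number of characters needed, or a negative value on error). */
int format_offsets(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "#define MYSTRUCT_A %zu\n"
                    "#define MYSTRUCT_B %zu\n"
                    "#define MYSTRUCT_C %zu\n"
                    "#define MYSTRUCT_SIZE %zu\n",
                    offsetof(struct mystruct, a),
                    offsetof(struct mystruct, b),
                    offsetof(struct mystruct, c),
                    sizeof(struct mystruct));
}
```

A small main that writes this buffer to a header file would run before the assembly source is built; the assembly file, passed through the C preprocessor, can then use MYSTRUCT_C as an ordinary numeric constant.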

check UREGS_pc in the link http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/arm32/entry.S;h=774e7c6766e2d1838f0b06948f8dc8e9b1617c7e;hb=HEAD — , Oct 08 '13 at 14:24
