23

Given a struct, for instance:

struct A {
    char a;
    char b;
} __attribute__((packed));

I want the offset of b (in this example, 1) in the struct to be printed at compile time - I don't want to have to run the program and call something like printf("%zu", offsetof(struct A, b)); because printing is non-trivial on my platform. I want the offset to be printed by the compiler itself, something like:

> gcc main.c
The offset of b is 1

I've tried a few approaches using #pragma message and offsetof, with my closest being:

#define OFFSET offsetof(struct A, b)
#define XSTR(x) STR(x)
#define STR(x) #x

#pragma message "Offset: " XSTR(OFFSET)

which just prints:

> gcc main.c
main.c:12:9: note: #pragma message: Offset: __builtin_offsetof (struct A, b)

which does not print the numeric offset. It's possible to binary-search the offset at compile time by using _Static_assert - but my real structs are big and this can get a bit cumbersome.
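
For reference, the _Static_assert bisection mentioned above might look like the following sketch. It assumes a C11 compiler; the assertion messages are illustrative only, and each failing assertion tells you which half of the range to try next.

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Step 1: guess an upper bound. If this fails, the offset is 2 or more. */
_Static_assert(offsetof(struct A, b) < 2, "offsetof(struct A, b) >= 2");

/* Step 2: halve the remaining range [0, 2). If this fails, the offset is 0. */
_Static_assert(offsetof(struct A, b) >= 1, "offsetof(struct A, b) < 1");

/* Both assertions hold, so the offset is exactly 1. */
```

For a large struct, every member needs its own sequence of guesses, which is why this gets cumbersome.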

chux - Reinstate Monica
Daniel Kleinstein
  • I don't have a C compiler here, so I can't really tell if this works or not. I would try to declare a variable `struct A x` (perhaps in a section that will be discarded later), then try with `&x.b - &x`. But I think it will not work (also I suspect this is UB, because pointer subtracting should be done on pointers of "equal classes"; well this construct used to be scattered throughout the Linux kernel years ago; I don't know now). – rslemos Sep 16 '21 at 13:02
  • Can't you write a separate test program that prints the values that you need? – stark Sep 16 '21 at 13:17
  • With GCC and Clang, you can generate “assembly” containing the offset. E.g., put the line `__asm__("# offsetof(struct A, b) = %c0" : : "i" (offsetof(struct A, b)));` inside a function, compile to assembly (`-S` switch), and the generated assembly will contain a line like `## offsetof(struct A, b) = 1`. You can, of course, use `grep` to extract the line from the assembly. – Eric Postpischil Sep 16 '21 at 13:53
  • @stark: Writing a separate program is not generally a correct solution because, to get the correct offset, it should compile for the target platform (otherwise, the offset may differ), but as OP states, getting printed output from the target platform may be difficult. (Even with a packed structure, the offsets will differ if the intervening member types are not the same size in the target platform and the native platform.) – Eric Postpischil Sep 16 '21 at 13:55
  • About your preprocessor approach: I do not think this can be done with the preprocessor since the offset will not be known to the preprocessor, but only to the compiler. – nielsen Sep 16 '21 at 13:58
  • If you generate debugging info and generate ELF format object files (even if the final program image is in a different format) with DWARF format debugging information (or CTF or BTF format), you could use a utility such as `pahole` (from the "dwarves" package) to get the structure offsets. – Ian Abbott Sep 16 '21 at 13:58
  • Why can't you just check the linker map file to see how large the struct ended up? – Lundin Sep 16 '21 at 14:22
  • @nielsen You're on the button - my main takeaway from this is that I'm a bit surprised at how limited compile-time messaging is (especially given that functionality like `_Static_assert` exists). It seems hard to print not only offsets but other compiler information like `sizeof`. It seems the best solutions really are either hacky solutions like in the answers that try to leak information via compiler warnings, or by extracting the embedded information from compiled binaries. – Daniel Kleinstein Sep 16 '21 at 15:56
  • What is the purpose here, what are you going to do with this number in the build console output? – hyde Sep 17 '21 at 08:12
  • @hyde My specific case was some assembly code that needed to dereference a member of the struct with the struct's base address in a register, and I wanted a quick way to know the offset of the member – Daniel Kleinstein Sep 17 '21 at 08:26
  • @DanielKleinstein If it is not a tight, performance-critical loop, getting the offset at runtime from a global `const` variable would be an easy solution. – hyde Sep 17 '21 at 09:28
  • Does this answer your question? [How can I print the result of sizeof() at compile time in C?](https://stackoverflow.com/questions/20979565/how-can-i-print-the-result-of-sizeof-at-compile-time-in-c) It's basically the same, just with another way of making a compile-time constant. – Ruslan Sep 17 '21 at 14:02

5 Answers

17

I suspect the stated constraint “I want the offset to be printed by the compiler itself” is an XY problem and that we merely need the offset to be printed by the build tools on the system used for building, not specifically by the compiler.

In this case, GCC and Clang have the ability to include arbitrary text in their assembly output and to include various data operands in that text, including immediate values for structure offsets.

Inside any function, include these lines:

#if GenerateStructureOffsets
    __asm__("# offsetof(struct A, b) = %c0" : : "i" (offsetof(struct A, b)));
#endif

Then compile with the switches -DGenerateStructureOffsets and -S. The compiler will generate a file named SourceFileName.s, and you can use -o Name to give it a different name if desired.

Then grep "# offsetof" Name will find this line (the pattern also matches when the toolchain emits the comment with a doubled ##), showing something like:

    ## offsetof(struct A, b) = 1

Then you can use sed or other tools to extract the value.

In the __asm__, "i" says to generate an “immediate” operand. The (offsetof(struct A, b)) that follows that gives the value it should have. In the first quoted string, %c0 is replaced with the value of that operand.

The 0 indicates which operand to replace—if there were more than one listed later in the __asm__, they are numbered 0, 1, 2, 3, and so on. (There is also a mechanism for naming them instead of numbering them, not shown here.) Normally, %0 would be replaced by the form of immediate operand suitable for the target assembly language, such as $1 or #1. However, the c modifier says to use the bare constant, so the replacement text is just the value, in this case 1.
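
To illustrate the operand numbering, here is a minimal sketch with two immediate operands in one statement. The function name emit_layout_comments is hypothetical, and the sketch assumes a GCC or Clang target whose assembler treats # as a comment character.

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Hypothetical helper: emits one assembly comment carrying two constants.
   %c0 refers to the first "i" operand and %c1 to the second. */
void emit_layout_comments(void)
{
    __asm__("# sizeof(struct A) = %c0, offsetof(struct A, b) = %c1"
            : /* no outputs */
            : "i" (sizeof(struct A)), "i" (offsetof(struct A, b)));
}
```

Compiling this with -S and searching the output for offsetof should show both values on a single line, so one grep can collect several facts about the same struct.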

Eric Postpischil
  • I accepted Luca's answer because I think it more directly answers the question itself (and showcases a useful GCC compiling trick) - but this is a great solution and it's similar to the one I ended up using. – Daniel Kleinstein Sep 16 '21 at 16:00
12

Given this macro:

#define PRINT_OFFSETOF(A, B) char (*__daniel_kleinstein_is_cool)[sizeof(char[offsetof(A, B)])] = 1

Use it inside your main() function (or any other function):

#include <stddef.h>  /* offsetof */

struct Test {
  char x;
  long long y;
  int z;
};

int main(void) {
  PRINT_OFFSETOF(struct Test, z);
  return 0;
}

And you will get this warning:

warning: initialization of ‘char (*)[16]’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]

Hence, offsetof(struct Test, z) == 16.


NOTE: in case offsetof() returns 0 (e.g.: PRINT_OFFSETOF(struct Test, x)), the compiler warning will have char (*)[] instead of char (*)[16].

NOTE 2: I only tested this with GCC.

Luca Polito
  • `*__daniel_kleinstein_is_cool`... Shameless :) – ryyker Sep 16 '21 at 15:17
  • @ryyker Sometimes I forget that [here at StackOverflow we are not allowed to have fun](https://stackoverflow.blog/2010/01/04/stack-overflow-where-we-hate-fun). – Luca Polito Sep 16 '21 at 15:19
  • @ryyker, Luca is going for the "accepted answer" ;-) – wovano Sep 16 '21 at 15:28
  • For me, this doesn't quite work (to be fair I'm running an old version of gcc, 7.5.0) - it prints "`initialization makes pointer from integer without a cast`". On compiler explorer I can see that this does work for gcc >= 8.1 (dbush's solution seems to work for all versions). This is still a very nice solution, with the obvious drawback of generating a compiler warning. – Daniel Kleinstein Sep 16 '21 at 15:51
  • @DanielKleinstein I didn't know about the GCC version issue, I may try to fix it. By the way, I don't think it's possible to solve this problem (i.e.: make the compiler print the offsetof during compilation) without generating warnings. It's a shame, however, that GCC doesn't have a `#pragma GCC diagnostic note ...` to convert the warning to a note, in case you're using `-Werror`. – Luca Polito Sep 16 '21 at 15:56
  • @LucaPolito - the page you linked argues to the contrary. It is perfectly fine to have fun here, as long as it occurs in amounts proportionally less than other _less fun_ attributes of the site. :) And evidently your ploy worked! – ryyker Sep 16 '21 at 16:18
8

Surprisingly, it looks like __builtin_choose_expr works inside the __deprecated__ function attribute. The following program:

#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

#define printval_case(x, xstr, y, ...)  __builtin_choose_expr(x == y, xstr"="#y, __VA_ARGS__)
#define printval(x) do { \
    __attribute__((__deprecated__( \
        printval_case(x, #x, 0, \
        printval_case(x, #x, 1, \
        printval_case(x, #x, 2, \
        printval_case(x, #x, 3, \
        /* etc... */ \
        (void)0 )))) \
    ))) void printval() {} \
    printval(); \
} while (0)

int main() {
    printval(offsetof(struct A, a));
    printval(offsetof(struct A, b));
}

When compiled, gcc outputs:

<source>:23:30: warning: 'printval' is deprecated: offsetof(struct A, a)=0 [-Wdeprecated-declarations]
<source>:24:30: warning: 'printval' is deprecated: offsetof(struct A, b)=1 [-Wdeprecated-declarations]
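
The selection mechanism can be seen in isolation in the sketch below. SMALL_OFFSET_NAME and name_of_b_offset are hypothetical names used only for illustration, and the sketch assumes GCC or Clang compiling as C, since __builtin_choose_expr is a compiler extension.

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Hypothetical helper: picks one of three string literals at compile time.
   Each condition must be an integer constant expression. */
#define SMALL_OFFSET_NAME(x) \
    __builtin_choose_expr((x) == 0, "zero", \
    __builtin_choose_expr((x) == 1, "one", "two or more"))

/* Returns the literal chosen for member b; the choice is made by the
   compiler, so no comparison is left for run time. */
const char *name_of_b_offset(void)
{
    return SMALL_OFFSET_NAME(offsetof(struct A, b));
}
```

Because the chosen operand keeps its own type, the result is still a string literal, which is what lets the chain feed an attribute argument or an array initializer.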

In a similar fashion, you could embed the value into the executable (similar to how CMake detects compiler properties):

#include <stddef.h>
struct A {
    char a;
    char b;
} __attribute__((packed));
#define printval_case(x, xstr, y, ...)  __builtin_choose_expr(x == y, xstr"="#y, __VA_ARGS__)
#define embedval(x) do { \
    static const __attribute__((__used__)) char unused[] = \
        printval_case(x, #x, 0, \
        printval_case(x, #x, 1, \
        printval_case(x, #x, 2, \
        printval_case(x, #x, 3, \
        /* etc... */ \
        (void)0 )))); \
} while (0)
int main() {
    embedval(offsetof(struct A, a));
    embedval(offsetof(struct A, b));
}

then:

$ gcc file.c && strings ./a.out | grep offsetof
offsetof(struct A, b)=1
offsetof(struct A, a)=0
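
If the chain of cases gets too long, a different technique is to encode the decimal digits directly in a char array initializer. This is a sketch rather than part of the approach above: DIGIT and offset_tag_A_b are hypothetical names, and the three-digit width is an assumption that would need widening for larger structs.

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Hypothetical helper: the decimal digit of n at the given place (1, 10, 100). */
#define DIGIT(n, place) ((char)('0' + ((n) / (place)) % 10))

/* Tag string that the compiler keeps even though nothing references it.
   For this struct the bytes spell "A.b@001". */
static const char offset_tag_A_b[] __attribute__((__used__)) = {
    'A', '.', 'b', '@',
    DIGIT(offsetof(struct A, b), 100),
    DIGIT(offsetof(struct A, b), 10),
    DIGIT(offsetof(struct A, b), 1),
    '\0'
};
```

Running strings on the resulting object file and grepping for the A.b@ prefix should then recover the value, without listing every possible offset as a separate case.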
KamilCuk
  • Nice trick! I have to give `gcc` the `-fcompare-debug-second` option to avoid being spammed by (not so) helpful notes from the compiler. – nielsen Sep 16 '21 at 14:26
  • Yes. But the macro `printval_case(` is there for readability and could be removed and just inlined - then, it will be just a single one `in expansion of` message. – KamilCuk Sep 16 '21 at 15:08
2

Perhaps a new pre-processing step would be acceptable. This could then be done as a separate step that won't affect your production binary.

offsetdumper.sh

#!/bin/bash
#
# pre-process some source file(s), add a macro + main() and a file with rules
# describing the interesting symbols. Compile and run the result.

dumprulefile="$1"
shift

# Define your own macros, like OFFSET, in the "Here Document" below:
{
gcc -E "$@" && cat<<EOF
#define OFFSET(x,y) do { printf("%s::%s %zu\n", #x, #y, offsetof(x,y)); } while(0)
#include <stddef.h>
#include <stdio.h>
int main() {
EOF
cat "$dumprulefile"
echo '}'
} | g++ -x c - && ./a.out

rules

OFFSET(A,a);
OFFSET(A,b);

source.h

typedef struct {
    char a;
    char b;
} __attribute__((packed)) A;

Example:

$ ./offsetdumper.sh rules *.h
A::a 0
A::b 1

This is a bit fragile and won't work if your source.h includes a main function, so it may need some tinkering to fulfill your needs.

Ted Lyngmo
  • The problem with this approach is that it assumes you are compiling natively. From the OP's description it seems like he is cross-building for an embedded device, and you can't assume the structure layout will be the same across different targets. – plugwash Sep 17 '21 at 07:45
  • @plugwash If that's the case, it would need to be crosscompiled and executed on target. A bit cumbersome but often doable. – Ted Lyngmo Sep 17 '21 at 08:04
1

One possible way is to make the offset the size of an array and then pass the address of that array to a function expecting an incompatible pointer so it prints the type:

static int a[offsetof(struct A, b)];
static void foo1(int *p) { (void)p; }
static void foo2(void) { foo1(&a); }

This prints:

x1.c: In function ‘foo2’:
x1.c:13:1: warning: passing argument 1 of ‘foo1’ from incompatible pointer type [enabled by default]
 static void foo2(void) { foo1(&a); }
 ^
x1.c:12:13: note: expected ‘int *’ but argument is of type ‘int (*)[1]’
 static void foo1(int *p) { (void)p; }
             ^
dbush
  • A neat trick, but I guess it will break when the offset is 0. Also, I guess OP wants a message below warning level. – Ian Abbott Sep 16 '21 at 13:31
  • Interesting hack. :-) You could give the functions a speaking name, both to give a good example and so that the output says something about what is printed. – Dr. Hans-Peter Störr Sep 18 '21 at 07:31