
I am working on a project where I need to compute the address range of a large number of global variables in C (C++ is not an option), with clang. For symbols of complete types, this is easy to do in a standard-compliant way:

typedef struct range {
    void* begin;
    void* end;
} range;

extern int foo;
range foo_range = { &(&foo)[0], &(&foo)[1] };

This works because the C compiler statically knows the size of foo, so it can resolve &(&foo)[1] as the address of foo plus 4 bytes (assuming that sizeof(int) is 4, of course). It won't work for symbols of an incomplete type:

struct incomplete;
struct trailing_array {
    int count;
    int elements[];
};

extern int foo[];
extern struct incomplete bar;
extern struct trailing_array baz;

range foo_range = { &(&foo)[0], &(&foo)[1] };
// error: foo has incomplete type

range bar_range = { &(&bar)[0], &(&bar)[1] };
// error: bar has incomplete type

range baz_range = { &(&baz)[0], &(&baz)[1] };
// this one compiles, but the range excludes the elements array
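To make the last case concrete, here is a minimal sketch of why the range for baz comes up short: sizeof on a struct with a flexible array member covers only the header, never the trailing elements. The helpers header_size and full_size are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the question's struct trailing_array. */
struct trailing_array {
    int count;
    int elements[];   /* flexible array member (C99) */
};

/* Size the compiler reports: the header only, without any elements. */
size_t header_size(void) {
    return sizeof(struct trailing_array);
}

/* Size of an object that actually carries `count` elements.
   Hypothetical helper, not part of the question. */
size_t full_size(size_t count) {
    /* Guard against overflow in the size computation. */
    assert(count <= (SIZE_MAX - sizeof(struct trailing_array)) / sizeof(int));
    return sizeof(struct trailing_array) + count * sizeof(int);
}
```

The difference between full_size(n) and header_size() is exactly the part that &(&baz)[1] leaves out of the range.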

However, it's not a problem for me to describe these symbols some more. For instance, I can easily add metadata:

// foo.h
extern int foo[];
extern size_t foo_size;

// foo.c
int foo[] = {1,2,3};
size_t foo_size = sizeof(foo);

Except that this doesn't solve my problem for references outside of foo.c, because foo_size is not a compile-time constant, and therefore this wouldn't work:

range foo_range = { &foo, (void*)&foo + foo_size };
// error: foo_size not a compile-time constant
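For contrast, a sketch of what the metadata does allow: building the range at run time instead of in a static initializer. This is only an illustration of the limitation, since the question needs a constant initializer. make_range and foo_range_at_runtime are hypothetical names, and both "files" are folded into one translation unit here so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef struct range {
    void *begin;
    void *end;
} range;

/* What a consumer translation unit sees (as in foo.h). */
extern int foo[];
extern size_t foo_size;

/* Hypothetical helper: builds a range from a base address and a byte count.
   Uses char * arithmetic, which is standard C. */
static range make_range(void *base, size_t size) {
    assert(base != NULL);
    range r = { base, (char *)base + size };
    return r;
}

/* Computed at run time, so foo_size need not be a constant expression. */
range foo_range_at_runtime(void) {
    return make_range(foo, foo_size);
}

/* What foo.c provides; placed last so the code above only sees
   the incomplete array type. */
int foo[] = {1, 2, 3};
size_t foo_size = sizeof(foo);
```

The cost is that every range has to be filled in by code that runs before its first use, which is what a static initializer avoids.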

What would work, however, is getting the address of a symbol that ends right where my object ends. For instance, if I define foo with this assembly code:

_foo:
    .long 1
    .long 2
    .long 3
_foo_end:

Then, in my C code, I can have:

extern int foo[];
extern int foo_end;
range foo_range = { &foo, &foo_end };

and that effectively solves my problem.

However, while I have the flexibility to add symbols, I don't have the flexibility to rewrite every global declaration as a file-level assembly statement. So, my question is: what is the closest that I can get to that using clang?

  • I know that I can use sections (since the linker makes start and end symbols for sections), but one section per global variable would be way overkill.
  • I know that I can't just take the address of a variable immediately after the global whose range I want to get, because the compiler has been known to reorder globals in some cases.

I'm specifically using Apple's linker, but if you have a solution that works for GNU ld/gold or lld, I'll still take it and see if I can get it to work here too.

zneak
  • `int elements[]` is not an extension; it is a [flexible array member](https://en.wikipedia.org/wiki/Flexible_array_member) (introduced in C99). – David Ranieri Aug 02 '18 at 19:46
  • @KeineLust thanks for the info! – zneak Aug 02 '18 at 19:46
  • I think more information about what you're trying to do is gonna be necessary. – Nicholas Pipitone Aug 02 '18 at 19:57
  • I can't see any solution other than some linker magic (sections as you said). Otherwise the notion of "incomplete type" would not be needed. – Eugene Sh. Aug 02 '18 at 20:07
  • `(void*)&foo + foo_size` is a problem as it is attempting to do pointer math on a `void *`. `(char*)&foo + foo_size` would make more sense. – chux - Reinstate Monica Aug 02 '18 at 20:45
  • How close does `int foo[] = {1,2,3}; int *end = *(&foo + 1); ` meet your goal? – chux - Reinstate Monica Aug 02 '18 at 20:50
  • @EugeneSh., I think that this is ultimately what will have to happen. I'm fairly certain that I can't resolve this just from standard C. This is why I also specified the compiler and the linker, as I can afford to solve the problem only for a specific platform. – zneak Aug 02 '18 at 21:12
  • @chux, arithmetic on `void*` is well-defined in C. Regarding your second comment, this works only if `foo[]` has a known size, which is the case in the TU that defines it (if you have `int foo[] = {1,2,3}`), but not in TUs that only see `extern int foo[]`. – zneak Aug 02 '18 at 21:13
  • @zneak I disagree that arithmetic on `void*` is well-defined in C: [ref](https://stackoverflow.com/a/1864376/2410359). C11 §6.5.6 3 _Additive operators_ requires that "the left operand is a pointer to a complete object type and the right operand has integer type", yet §6.2.5 19 says "The `void` type comprises an empty set of values; it is an incomplete object type that cannot be completed." – chux - Reinstate Monica Aug 02 '18 at 21:23
  • @chux, you are right, this is a [C extension](https://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Pointer-Arith.html#Pointer-Arith) which I can use. – zneak Aug 02 '18 at 21:25

1 Answer


Hm, there's no real way to do it if you're defining it in another translation unit. If you want, you can include a types.h file with contents

struct incomplete {
    char data[SIZE];
};

Where SIZE is whatever integer you please, and do the same for each global variable. This will conflict with the real definitions later, however. Really, you'd have to go with

#define INCOMPLETE_SIZE 5

And then use that for range = { &bar, (void*)&bar + INCOMPLETE_SIZE }.

"Incomplete type" is just some standards terminology for properly describing how to parse

struct A {
    struct A* ptr;
};

As far as I know, they don't really get used otherwise.

I also don't recommend &(&foo)[0], &(&foo)[1] as a way to get a range of pointers; it's esoteric and hard to read. &foo, &foo + 1 is much preferable. The same idea extends to bar as &bar, (void*)&bar + SIZE, where SIZE is a constant you have to specify somewhere in the code, either by declaring the complete type and using sizeof or &foo + 1, or by defining SIZE with a #define.
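A minimal sketch of the #define approach, assuming the size constant lives in a shared header and the defining translation unit checks it against the real type. The value 8 and the layout of struct incomplete are made up for illustration, and both "files" are folded into one translation unit so the example is self-contained. It uses char* rather than void* arithmetic so that it does not depend on the GNU extension.

```c
#include <assert.h>
#include <stddef.h>

/* Would live in a shared header; the value is a made-up example. */
#define INCOMPLETE_SIZE 8

typedef struct range {
    void *begin;
    void *end;
} range;

/* What consumer translation units see: only the forward declaration. */
struct incomplete;
extern struct incomplete bar;

/* An address constant plus an integer constant is a valid static initializer. */
range bar_range = { &bar, (char *)&bar + INCOMPLETE_SIZE };

/* What the defining translation unit provides (hypothetical layout). */
struct incomplete {
    char data[8];
};

/* Fails the build if the constant drifts away from the real size. */
static_assert(sizeof(struct incomplete) == INCOMPLETE_SIZE,
              "INCOMPLETE_SIZE is out of sync with struct incomplete");

struct incomplete bar;
```

The static_assert is the part that keeps a hand-maintained constant honest. Without it, a change to the struct would silently produce a wrong range.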

Nicholas Pipitone
  • I don’t need to link that file with the complete type, as these are perfectly valid forward declarations that can have their definition in another translation unit. – zneak Aug 02 '18 at 20:00
  • If you have the definition in another translation unit, there's just no way for the compiler to know the size. You can use a constant, as shown in my `#define` example. There aren't really any other solutions. – Nicholas Pipitone Aug 02 '18 at 20:03
  • The problem you had with `{ &foo, (void*)&foo + foo_size }` is because you used `size_t foo_size`, and that `foo_size` is not a constant. Just do `{ &foo, (void*)&foo + sizeof(foo)}` or `{ &foo, &foo + 1 }`, it'll work just fine. – Nicholas Pipitone Aug 02 '18 at 20:13
  • No, it does not. Sizeof only works on complete types, and it never works on structures with trailing arrays. That was all stated in the question. – zneak Aug 02 '18 at 20:16
  • Well I was assuming `foo` was a complete type from `size_t foo_size = sizeof(foo);`, else the error would happen right there. (There seemed to be multiple usage of the word `foo`, it got a bit confusing) – Nicholas Pipitone Aug 02 '18 at 20:17
  • `foo` is complete in the translation unit that defines it, and incomplete everywhere else. I clarified the question. – zneak Aug 02 '18 at 21:14