13

I believe I've found a way to achieve something like the well-known "struct hack" in portable C89. I'm curious if this really strictly conforms to C89.

The main idea is: I allocate memory large enough to hold an initial struct and the elements of the array. The exact size is (K + N) * sizeof(array_base_type), where K is chosen so that K * sizeof(array_base_type) >= sizeof(the_struct) and N is the number of array elements.

First, I dereference the pointer that malloc() returned to store the_struct, then I use pointer arithmetic to obtain a pointer to the beginning of the array following the struct.

One line of code is worth more than a thousand words, so here is a minimal implementation:

typedef struct Header {
    size_t length;
    /* other members follow */
} Header;

typedef struct Value {
    int type;
    union {
        int intval;
        double fltval;
    } v;
} Value;

/* round up to nearest multiple of sizeof(Value) so that a Header struct fits in */
size_t n_hdr = (sizeof(Header) + sizeof(Value) - 1) / sizeof(Value);

size_t n_arr = 42; /* arbitrary array size here */
void *frame = malloc((n_hdr + n_arr) * sizeof(Value));

if (!frame)
    return NULL;

Header *hdr = frame;
Value *stack_bottom = (Value *)frame + n_hdr;

My main concern is that the last two assignments (using frame as both a pointer to Header and a pointer to Value) may violate the strict aliasing rule. I do not, however, dereference hdr as a pointer to Value - it's only pointer arithmetic that is performed on frame in order to access the first element of the value array, so I don't effectively access the same object using pointers of different types.

So, is this approach any better than the classic struct hack (which has been officially deemed UB), or is it UB too?

  • I don't know about the conformance of this solution, but it's not really new to me. ;) – Daniel Kamil Kozar Aug 19 '13 at 11:49
  • @DanielKamilKozar It may not be new :) It's not that advanced or anything... and I don't care either, I just need to get this work. I needed something like flexible struct members (from C99) in C89 but the struct hack is non-conformant. So I thought about it a bit and all I could come up with was this. –  Aug 19 '13 at 11:58
  • 2
    There is a good discussion of strict aliasing at http://dbp-consulting.com/tutorials/StrictAliasing.html . Basically, the compiler has to be sure that modifying 'stack_bottom' will not also modify 'frame' (and vice-versa). You should be able to do this with "(Value *) ((char *) frame + n_hdr)". – mkfs Aug 19 '13 at 14:09
  • @mkfs That's a very useful link, thanks! As to the code, did you mean `(Value *)((char *)frame + n_hdr * sizeof(Value))`? –  Aug 19 '13 at 14:44
  • @mkfs Furthermore, is it indeed "modifying `stack_bottom`" or "modifying the object pointed to by `stack_bottom`"? (The point is, the same objects are not pointed to by different pointers, only different ones.) –  Aug 19 '13 at 14:52
  • Ah yes, there should be a "n_hdr * sizeof(Value)" in there -- I did not entirely grasp what the initialization of stack_bottom was doing, at first. As for the compiler, yes it is concerned with modifications to the object pointed to by stack_bottom -- the actual location to which the type "Value" is applied. – mkfs Aug 20 '13 at 18:11
  • @mkfs After all, I'll just go with a `union { Header hdr; Value val; }` and a pointer to it will be the stack pointer. The first element is the frame header, the remaining ones are the values. –  Aug 20 '13 at 18:18
  • That should work nicely. And somehow, it feels like cheating ;) – mkfs Aug 20 '13 at 18:30

1 Answers1

5

The "obvious" (well... not exactly obvious, but it's what comes to my mind anyway :-) ) way to cause this to break is to use a vectorizing compiler that somehow decides it's OK to load, say, 64 Headers into a vector register from the 42-rounded-up-to-64+ area at hdr which comes from malloc which always allocates enough to vectorize. Storing the vector register back to memory might overwrite one of the Values.

I think this vectorizing compiler could point to the standard (well, if a compiler has fingers...) and claim conformance.

In practice, though, I'd expect this code to work. If you come across a vectorizing compiler, add even more space (do the rounding up with a machine-dependent macro that can insert a minimum) and charge on. :-)

torek
  • 448,244
  • 59
  • 642
  • 775
  • This is a nice example (+1). One more thing, wouldn't this cause @mkfs's suggestion (aliasing through a character type) to break as well? Also, could it be possible to prevent such a vectorizing compiler from vectorizing the hell out of my brain by aliasing `hdr` using a pointer to array of 1 (one) `Header` (i. e. `Header (*)[1]`) or something similar? –  Aug 19 '13 at 15:23
  • I don't know for sure, I've never actually worked with vectorizing C compilers. I believe they exist for some Cray and Convex systems but don't know how they choose when to do this sort of thing. – torek Aug 19 '13 at 15:39
  • Okay, thanks. I'm waiting for mkfs's response before accepting your answer. The explanation is appreciated. I think I'll go with my original solution or with the `char *` aliasing one, and I will try very hard to test my code using various compilers, at multiple optimization levels, trying out combinations of the `-fstrict-aliasing` and `-ftree-vectorize` flags. Also, a big red splash image in the README and a reference to it in the comments will be added, so that users of my library know that *my* code is broken, not theirs. Thanks once again. –  Aug 19 '13 at 15:46