
Is it conforming with the standard to pack two objects using the align of the second object to get the final size?

I'm using this approach for a doubly linked list, but extracting the relevant part:

#include <stdio.h>
#include <stdlib.h>
#include <stdalign.h>

struct node
{
    struct node *prev;
    struct node *next;
};

#define get_node(data, szof) ((struct node *)(((char *)data) + szof))

int main(void)
{
    size_t align = alignof(struct node);
    double *data;

    // Round size up to nearest multiple of alignof(struct node)
    size_t szof = (sizeof(*data) + (align - 1)) / align * align;
    // Pack `data` + `struct node`
    data = malloc(szof + sizeof(struct node));
    // Get node using a generic pointer to calculate the offset
    struct node *node = get_node(data, szof);

    *data = 3.14;
    node->prev = NULL;
    node->next = NULL;
    printf("%f\n", *data);
    free(data);
    return 0;
}

Here, data can be a pointer to any primitive or composite type.

Rachid K.
David Ranieri
  • IMO, on the question you linked to, in [this answer](https://stackoverflow.com/a/3846579/4756299) GCC is clearly violating the C standard's "common initial sequence" rule. But I'm not sure how what you're doing compares to that - you `malloc()` an oversize `double *`, assign a `double` to it, then treat some other part of the memory as a `node`. At best, it's legal and hard to read, understand, and maintain. At worst it's not legal. And given how GCC mucked up the linked common initial sequence that sure looks like legal code, it's playing with the fire of a compiler's crappy behavior – Andrew Henle Jan 26 '21 at 08:41
  • @AndrewHenle you are right, we are comparing different things, removed that part from the question, what makes me doubt is this quote: _all the compilers I can easily get at (viz. GCC and Clang) have converged on a different, narrower interpretation of the common initial sequence rule_ – David Ranieri Jan 26 '21 at 08:51
  • @AndrewHenle _that sure looks like legal code_ or _that sure looks like **illegal** code_? – David Ranieri Jan 26 '21 at 08:57
  • I do not understand - could you explain the code in `main`? You are doing `struct { double data; struct node node }` but with pointer arithmetic yourself, and you are asking if doing that is valid? – KamilCuk Jan 26 '21 at 09:06
  • @KamilCuk that is! This construction was made to avoid fragmenting the memory as much in a very very large collection with different types, instead of using a `void *data` and two `malloc`s (one for the data and one for the nodes), it calls `malloc` only once packing both objects together. – David Ranieri Jan 26 '21 at 09:15
  • @AndrewHenle Yes the common initial sequence rule seems to be poorly supported overall and it also requires a union of structs, which isn't the case here. But the 6.3.2.3/7 special rule about inspecting an object through a character pointer behaves, and it is consistent with 6.5/7 ("strict aliasing") doing a lvalue access of any object through a character pointer. However, I don't think it is well-defined to go beyond the size of the inspected object, details in my answer below. – Lundin Jan 26 '21 at 12:53

3 Answers


Is it conforming with the standard to pack two objects using the align of the second object to get the final size?

Sure, the code presented is valid.

There's not much to write here, since it's harder to prove something than to disprove it. The pointer values are properly aligned for the referenced types, and there are no uninitialized memory accesses. If you keep track of alignment yourself, you can write whole programs without ever using struct.

In real code, I advise making a structure and letting the compiler figure it out[1]. We have offsetof.

struct double_and_node {
     double data;
     struct node node;
};

void *pnt = malloc(sizeof(struct double_and_node));
double *data = (double *)((char *)pnt + offsetof(struct double_and_node, data));
struct node *node = (struct node *)((char *)pnt + offsetof(struct double_and_node, node));

I guess you could research container_of and see C11 6.3.2.3p7.

[1] But really, in that case, just use the structure directly:

struct double_and_node *pnt = malloc(sizeof(struct double_and_node));
double *data = &pnt->data;
struct node *node = &pnt->node;
KamilCuk

As an extension of the previous answers, something generic could be done with macros that apply typeof() and offsetof() to a structure defined on the fly, which concatenates a data type with the node structure:

#include <stdio.h>
#include <stddef.h>

struct node
{
    struct node *prev;
    struct node *next;
};

#define LINKED_TYPE_SIZE(data) \
            sizeof(struct { typeof(data) f; struct node node; })

#define LINKED_TYPE_NODE(datap) \
  (struct node *)((char *)(datap) + offsetof(struct { typeof(*(datap)) f; struct node node; }, node))

int main(void)
{

  double v1;

  printf("size of linked double = %zu\n", LINKED_TYPE_SIZE(v1));
  /* %p expects void *, so cast the pointers explicitly */
  printf("%p, %p\n", (void *)&v1, (void *)LINKED_TYPE_NODE(&v1));

  int v2;

  printf("size of linked int = %zu\n", LINKED_TYPE_SIZE(v2));
  printf("%p, %p\n", (void *)&v2, (void *)LINKED_TYPE_NODE(&v2));

  short int v3;

  printf("size of linked short int = %zu\n", LINKED_TYPE_SIZE(v3));
  printf("%p, %p\n", (void *)&v3, (void *)LINKED_TYPE_NODE(&v3));

  struct foo {
    int f1;
    char f2;
    int f3;
  } foo_struct;

  printf("size of linked foo = %zu\n", LINKED_TYPE_SIZE(foo_struct));
  printf("%p, %p\n", (void *)&foo_struct, (void *)LINKED_TYPE_NODE(&foo_struct));

  return 0;
}

Running the preceding program on an x86_64 Linux desktop gives the following:

$ gcc try.c -o try
$ ./try
size of linked double = 24
0x7ffdfbdf50f8, 0x7ffdfbdf5100
size of linked int = 24
0x7ffdfbdf50f4, 0x7ffdfbdf50fc
size of linked short int = 24
0x7ffdfbdf50f2, 0x7ffdfbdf50fa
size of linked foo = 32
0x7ffdfbdf5100, 0x7ffdfbdf5110

N.B.: As typeof is not standard before C23 (it is a compiler extension in earlier dialects), it is also possible to avoid it by passing the type of the data explicitly as a parameter to the macros:

#define LINKED_TYPE_SIZE(type) \
            sizeof(struct { type f; struct node node; })

#define LINKED_TYPE_NODE(type, datap) \
  (struct node *)((char *)(datap) + offsetof(struct { type f; struct node node; }, node))
Rachid K.
  • The question starts with "Is it conforming with the standard..." and the first thing you do is to reach for GCC non-standard extension `typeof`. I don't see how this answers the question. – Lundin Jan 26 '21 at 11:58

Well, this is complicated. The ((char *)data) + szof line is arguably invoking undefined behavior depending on alignof(struct node) vs sizeof(double), but it isn't very obvious.

First of all, let's assume that double* data is actually pointing at a double. We would then be allowed to inspect this object through a character type pointer, as per 6.3.2.3/7:

When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

So we may do ((char *)data) + szof while we stick inside the actual double. Otherwise, if we go out of bounds of that double, the above quoted special rule doesn't apply.

Rather, we are presumably left with the rules of pointer arithmetic, specified under the additive operators. However, these rules expect you to use the pointed-at type, double*, and not a char*. They don't really specify what happens when you inspect a double through a char* and go beyond sizeof(double) bytes.

So the ((char *)data) + szof going beyond sizeof(double) is questionable - I think it is undefined behavior no matter how you put it.

Then there's another aspect here... what if the char pointer is pointing at something without a type? The C standard doesn't specify what will happen then. And this is actually what the code is doing.

Because as it happens, data = malloc(szof + sizeof(struct node)); allocates a raw segment with neither a declared type nor an "effective type". The rules in 6.5 then state that

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value

And you don't lvalue access the actual memory until *data = 3.14;, at which point the memory gets the effective type double. This happens after the pointer arithmetic.

Lundin
  • Thanks for the nice and detailed explanation, just one question: _The ((char *)data) + szof line is arguably invoking undefined behavior depending on alignof(struct node) vs sizeof(double)_ This is my main concern and the reason for the question, in theory the calculation adjusts the offset so that the `struct node` is well aligned, on the other hand the one-past rule does not affect the `char *` type, right? Can you elaborate a bit more on why the possible undefined behavior? – David Ranieri Jan 26 '21 at 12:13
  • @DavidRanieri Strictly speaking, C does not define pointer arithmetic on objects that aren't arrays (where scalars are treated like arrays of size 1). The quoted rule about inspecting any type through a character pointer doesn't fit the normal rules of pointer arithmetic. Rather it just says "up to the size of the object". If you keep going beyond that, then it isn't specified anywhere what will happen. If you'd store all of this in a struct and then place that struct in union with a `char []` array though, then it would be another story. – Lundin Jan 26 '21 at 12:29