
In C, I want to place a char id at the very end of a struct so that I can discern the struct type from a pointer to the end of the struct (allocated dynamically). Obviously, the possibility of padding at the end makes this difficult. I thought of two approaches.

The first approach is to place an array of chars that extends all the way to the end of the struct so that (char*)ptr_to_end - 1 always points to a valid char. I think this should work as long as the compiler lays out both structs below identically. Otherwise, it should fail to compile:

typedef struct
{
    int foo;
    int bar;
    char type;
} MyStructDummy;

typedef struct
{
    int foo;
    int bar;
    char type[ sizeof( MyStructDummy ) - offsetof( MyStructDummy, type ) ];
} MyStruct;

_Static_assert(
    sizeof( MyStruct ) == sizeof( MyStructDummy ),
    "Could not ensure char at end of MyStruct"
);
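
A minimal sketch of how the first approach might be used once the assertion holds. The names MYSTRUCT_ID, MyStruct_new, id_from_end, and MyStruct_from_end are hypothetical and only serve the illustration; the id is written to the last element of type so that the byte just before the end pointer carries it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    int foo;
    int bar;
    char type;
} MyStructDummy;

typedef struct
{
    int foo;
    int bar;
    char type[ sizeof( MyStructDummy ) - offsetof( MyStructDummy, type ) ];
} MyStruct;

_Static_assert(
    sizeof( MyStruct ) == sizeof( MyStructDummy ),
    "Could not ensure char at end of MyStruct"
);

enum { MYSTRUCT_ID = 1 }; /* hypothetical id value */

/* Hypothetical helper: allocates a MyStruct and returns a pointer to its END. */
void *MyStruct_new( void )
{
    MyStruct *s = malloc( sizeof *s );
    if( !s )
        return NULL;
    s->foo = 0;
    s->bar = 0;
    s->type[ sizeof s->type - 1 ] = MYSTRUCT_ID; /* last byte of the struct */
    return s + 1;
}

/* Reads the id from the byte immediately before the end pointer. */
char id_from_end( const void *end_ptr )
{
    assert( end_ptr != NULL );
    return ( (const char *)end_ptr )[ -1 ];
}

/* Recovers the struct pointer once the id has identified the type. */
MyStruct *MyStruct_from_end( void *end_ptr )
{
    return (MyStruct *)end_ptr - 1;
}
```

A caller would check id_from_end( end_ptr ) first and only then convert with MyStruct_from_end, passing that result to free when done.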

The second approach is to use offsetof to always access the malloc-ed block as individual (member) variables and never as a complete struct. That way, we avoid ever imparting the struct's type as an effective type over the whole block or accidentally changing padding values:

typedef struct
{
    int foo;
    int bar;
    char type;
} MyStruct;

int *MyStruct_foo( void *end_ptr )
{
    return (int*)( (char*)end_ptr - sizeof( MyStruct ) + offsetof( MyStruct, foo ) );
}

int *MyStruct_bar( void *end_ptr )
{
    return (int*)( (char*)end_ptr - sizeof( MyStruct ) + offsetof( MyStruct, bar ) );
}

char *MyStruct_type( void *end_ptr )
{
    return (char*)end_ptr - 1;
}
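
A minimal sketch of how these accessors might be used together, assuming the block is exactly sizeof( MyStruct ) bytes and is only ever touched through the accessors. MyStruct_create and MyStruct_destroy are hypothetical helpers added for illustration. Note that the id then lives in the last byte of the block, not at offsetof( MyStruct, type ):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    int foo;
    int bar;
    char type;
} MyStruct;

int *MyStruct_foo( void *end_ptr )
{
    return (int *)( (char *)end_ptr - sizeof( MyStruct ) + offsetof( MyStruct, foo ) );
}

int *MyStruct_bar( void *end_ptr )
{
    return (int *)( (char *)end_ptr - sizeof( MyStruct ) + offsetof( MyStruct, bar ) );
}

char *MyStruct_type( void *end_ptr )
{
    assert( end_ptr != NULL );
    return (char *)end_ptr - 1;
}

/* Hypothetical helper: allocates one block and returns a pointer to its END. */
void *MyStruct_create( int foo, int bar, char type )
{
    char *block = malloc( sizeof( MyStruct ) );
    if( !block )
        return NULL;

    void *end_ptr = block + sizeof( MyStruct );
    *MyStruct_foo( end_ptr ) = foo;   /* accessed as an int, never as a MyStruct */
    *MyStruct_bar( end_ptr ) = bar;
    *MyStruct_type( end_ptr ) = type; /* last byte of the block */
    return end_ptr;
}

/* Hypothetical helper: steps back to the start of the block and frees it. */
void MyStruct_destroy( void *end_ptr )
{
    if( end_ptr )
        free( (char *)end_ptr - sizeof( MyStruct ) );
}
```

Here the struct definition serves only as a source of sizes and offsets; no object of type MyStruct is ever read or written.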

Is either of these approaches preferable to the other? Is there an existing C idiom that achieves what I want to achieve (I can't use a flexible array member because I want to maintain C++ compatibility)?

Thanks!

EDIT:

Karl asked how placing an id at the end of a struct could be useful. Consider this memory-conserving implementation of a dynamic array/vector:

//VecHdr is for vectors without an automatic element destructor function
//and whose capacity is < UINT_MAX
typedef struct
{
    alignas( max_align_t )
    unsigned int size;
    unsigned int cap;
    char type_and_flags; //At very end
} VecHdr; //Probable size: 16 bytes

//VecHdrEx is for vectors with an element destructor or whose capacity is >= UINT_MAX
typedef struct
{
    alignas( max_align_t )
    size_t size;
    size_t cap;
    void (*element_destructor_func)( void* );
    char type_and_flags; //At very end
} VecHdrEx; //Probable size: 32 bytes

//...

int *foo = vec_create( int );
//This macro returns a pointer to a malloced block of ints, preceded by a VecHdr

int *bar = vec_create_ex( int, my_element_destructor );
//This macro returns a pointer to malloced block of ints, preceded by a VecHdrEx

vec_push( foo, 12345 );
//vec_push knows that foo is preceded by a VecHdr by checking (char*)foo - 1
//foo's VecHdr may eventually be replaced with a VecHdrEx if we add enough elements

vec_push( bar, 12345 );
//vec_push knows that bar is preceded by a VecHdrEx by checking (char*)bar - 1
Jackson Allan
  • 4
    "*I want to place a `char` id at the very end of a struct*" - why? "*so that I can discern the struct type from a pointer to the end of the struct*" - why are you accessing the struct from the end and not from the front? – Remy Lebeau Dec 03 '21 at 04:23
  • @RemyLebeau The idea is to implement dynamic containers by placing a header *before* the pointer supplied to the user. See the first paragraph [here](https://stackoverflow.com/questions/70192823/placing-a-header-before-a-malloc-ed-block-pointer-arithmetic-and-undefined-beha). – Jackson Allan Dec 03 '21 at 04:31
  • 1
    @JDormer I think the memory usage should be the same in both approaches. Either that memory falls in the `char` array or is consumed by padding. – Jackson Allan Dec 03 '21 at 04:34
  • 1
    I still can't understand what *problem you are trying to solve* with this approach. Please show an example of code where you otherwise run into a problem due to not having the necessary information about a pointer, and justify why you could be in that situation. – Karl Knechtel Dec 03 '21 at 04:36
  • @KarlKnechtel I just updated the original question with a possible application. – Jackson Allan Dec 03 '21 at 05:06
  • 3
    I would place a constant-size main header block before the user-visible array, and another optional header block with additional data before the main header block. Its presence and size is determined by the type field in the main block. – n. m. could be an AI Dec 03 '21 at 05:44
  • 1
    offsetof seems sufficient. At that point it shouldn't matter where you put the char. You could also use an int (or two) to virtually guarantee alignment, although that's not guaranteed – Mad Physicist Dec 03 '21 at 06:26
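
The layout suggested in the comments (a constant-size main header directly before the user-visible array, plus an optional extended header before it) could be sketched roughly as follows. All names here (VecMainHdr, VecExtHdr, has_ext, vec_alloc, vec_main_hdr, vec_ext_hdr, vec_free) are hypothetical, and the sketch assumes C11 for alignof and max_align_t. It does not address the pointer-arithmetic concerns raised in the linked question:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

/* Main header: always present, always directly before the user data. */
typedef struct
{
    size_t size;
    size_t cap;
    unsigned char has_ext; /* nonzero if a VecExtHdr precedes this header */
} VecMainHdr;

/* Optional extended header: only present when has_ext is set. */
typedef struct
{
    void (*element_destructor_func)( void * );
} VecExtHdr;

/* Round n up to a multiple of alignof( max_align_t ) so that whatever
   follows the header stays suitably aligned for any element type. */
#define ALIGN_UP( n ) \
    ( ( (n) + alignof( max_align_t ) - 1 ) / alignof( max_align_t ) * alignof( max_align_t ) )
#define MAIN_HDR_SPACE ALIGN_UP( sizeof( VecMainHdr ) )
#define EXT_HDR_SPACE  ALIGN_UP( sizeof( VecExtHdr ) )

VecMainHdr *vec_main_hdr( void *data )
{
    assert( data != NULL );
    return (VecMainHdr *)( (char *)data - MAIN_HDR_SPACE );
}

/* Returns NULL when the vector has no extended header. */
VecExtHdr *vec_ext_hdr( void *data )
{
    VecMainHdr *hdr = vec_main_hdr( data );
    return hdr->has_ext ? (VecExtHdr *)( (char *)hdr - EXT_HDR_SPACE ) : NULL;
}

/* Overflow of cap * el_size is not checked in this sketch. */
void *vec_alloc( size_t cap, size_t el_size, void (*dtor)( void * ) )
{
    size_t ext_space = dtor ? EXT_HDR_SPACE : 0;
    char *block = malloc( ext_space + MAIN_HDR_SPACE + cap * el_size );
    if( !block )
        return NULL;

    void *data = block + ext_space + MAIN_HDR_SPACE;
    VecMainHdr *hdr = vec_main_hdr( data );
    hdr->size = 0;
    hdr->cap = cap;
    hdr->has_ext = dtor != NULL;
    if( dtor )
        vec_ext_hdr( data )->element_destructor_func = dtor;
    return data;
}

void vec_free( void *data )
{
    if( !data )
        return;
    size_t ext_space = vec_main_hdr( data )->has_ext ? EXT_HDR_SPACE : 0;
    free( (char *)data - MAIN_HDR_SPACE - ext_space );
}
```

Because the main header has a fixed size and position, no id byte has to sit at the very end of a struct, so trailing padding stops being a concern. The trade-off is that the main header cannot shrink to the smaller unsigned int fields of VecHdr.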

1 Answer


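There will only be padding at the end if the struct's size would otherwise not be a multiple of its alignment, which typically happens when the last member is smaller than the most strictly aligned member, like a small integer type.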

However, if you make the last member a flexible array member of character type, it will always be placed on top of any such padding bytes, because a flexible array member is not taken into account when the struct's size and padding are determined.

Example:

typedef struct
{
    int foo;
    int bar;
    char type[];
} MyStructDummy;

MyStructDummy* dummy = malloc (sizeof *dummy + 1);
printf("Size: %zu\n", sizeof(MyStructDummy));
printf("Address of struct:%p\n", (void*)dummy);
printf("Address of type:%p\n", (void*)dummy->type);

This gives something like:

Size: 8
Address of struct:0x4072a0
Address of type:0x4072a8

If we add an extra member to ensure that there's padding at the end:

typedef struct
{
    int foo;
    int bar;
    char causing_padding;
    char type[];
} MyStructDummy;

Then the very same code as above prints:

Size: 12
Address of struct:0x16f22a0
Address of type:0x16f22a9

So here the compiler did add padding, but it lets us use the byte at offset 9 for data, since the flexible array member starts inside the trailing padding. We end up allocating memory beyond the flexible array member. Now, we could instead allocate the flexible array member to cover all of the padding:

size_t trailing_padding = sizeof(MyStructDummy) - offsetof(MyStructDummy, type);
MyStructDummy* dummy = malloc (sizeof *dummy + trailing_padding);

This still leaves type at offset 9, but it now covers the 3 bytes of trailing padding. We could memset all of them with whatever code you wish to place there. This is well-defined and portable. Full example:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>

typedef struct
{
    int foo;
    int bar;
    char causing_padding;
    char type[];
} MyStructDummy;

int main (void)
{
  size_t trailing_padding = sizeof(MyStructDummy) - offsetof(MyStructDummy, type);
  MyStructDummy* dummy = malloc (sizeof *dummy + trailing_padding);
  if (dummy == NULL)
    return EXIT_FAILURE;
  memset(dummy->type, 42, trailing_padding); // write code 42 to all bytes
  
  printf("Size: %zu\n", sizeof(MyStructDummy));
  printf("Address of struct:%p\n", (void*)dummy); // %p expects a void*
  printf("Address of type:%p\n", (void*)dummy->type);

  unsigned char* endptr = (unsigned char*)dummy + sizeof(*dummy) - 1;
  printf("Value of last byte: %d\n", *endptr);

  free(dummy);
}

Output:

Size: 12
Address of struct:0xa842a0
Address of type:0xa842a9
Value of last byte: 42
Lundin