My current (simplified) buffer API looks like this:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t offset;
    size_t size;
    uint8_t *data;
} my_buffer;

// Writes an unsigned int 8 to the buffer
bool my_buffer_write_u8(my_buffer *buffer, uint8_t value) {
    if (buffer->offset >= buffer->size) return false;

    buffer->data[buffer->offset] = value;
    ++buffer->offset;
    return true;
}

However, after refreshing my knowledge of the strict aliasing rule in C, I'm not so sure about this use case:

    char string[32];

    my_buffer buffer;
    buffer.size = sizeof(string);
    buffer.data = string; // <-- I think this violates the strict aliasing rule
    buffer.offset = 0;

    // the function calls access buffer.data which is defined to be `uint8_t *` and not `char *`
    // in other words, I'm manipulating a `char *` through a `uint8_t *`:
    // even though uint8_t is almost always unsigned char, it is nevertheless not the same as unsigned char
    my_buffer_write_u8(&buffer, 'h');
    my_buffer_write_u8(&buffer, 'e');
    my_buffer_write_u8(&buffer, 'l');
    my_buffer_write_u8(&buffer, 'l');
    my_buffer_write_u8(&buffer, 'o');
    my_buffer_write_u8(&buffer, '\0');

I think I should be using void * in the buffer struct and an (unsigned char *) cast to access the underlying data:

typedef struct {
    size_t offset;
    size_t size;
    void *data;
} my_buffer;

// Writes an unsigned int 8 to the buffer
bool my_buffer_write_u8(my_buffer *buffer, uint8_t value) {
    if (buffer->offset >= buffer->size) return false;

    unsigned char *data = (unsigned char *)buffer->data;

    data[buffer->offset] = value;
    ++buffer->offset;

    return true;
}

I believe this is fine because char *, unsigned char * and signed char * are always allowed to alias other data types.

The same cannot be said about uint8_t * (according to the standard, that is).
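To illustrate the character-type exemption mentioned above, here is a minimal sketch that reads the bytes of an arbitrary object through `const unsigned char *`. The helper name `count_nonzero_bytes` is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: counts the nonzero bytes in an object's representation.
// Reading any object through `const unsigned char *` is permitted by the
// aliasing rules, whatever the object's declared type is.
size_t count_nonzero_bytes(const void *object, size_t size) {
    assert(object != NULL);
    const unsigned char *bytes = (const unsigned char *)object;
    size_t count = 0;
    for (size_t i = 0; i < size; ++i) {
        if (bytes[i] != 0) ++count;
    }
    return count;
}
```

Calling it as `count_nonzero_bytes(&value, sizeof value)` works for a `uint32_t`, a struct, or a `char` array alike, because the access always happens through a character type.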

If CHAR_BIT is 8, the adjusted code with void * should behave exactly like the uint8_t version.
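As a self-contained sketch of the adjusted version, the following makes the CHAR_BIT assumption explicit at compile time. The helper `my_buffer_init` is hypothetical and not part of the API shown above; it only bundles the three field assignments.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Assumption: 8-bit bytes. This is already implied whenever uint8_t exists,
// so the check mainly documents the assumption.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

typedef struct {
    size_t offset;
    size_t size;
    void *data;
} my_buffer;

// Hypothetical helper: binds the buffer to caller-owned storage of any type.
void my_buffer_init(my_buffer *buffer, void *storage, size_t size) {
    assert(buffer != NULL && storage != NULL);
    buffer->offset = 0;
    buffer->size = size;
    buffer->data = storage;
}

// Writes one byte through `unsigned char *`, which may alias any object.
bool my_buffer_write_u8(my_buffer *buffer, uint8_t value) {
    if (buffer->offset >= buffer->size) return false;

    unsigned char *data = (unsigned char *)buffer->data;
    data[buffer->offset] = value;
    ++buffer->offset;
    return true;
}
```

With this shape, `my_buffer_init(&buffer, string, sizeof string)` accepts a `char string[32]` without a cast, since any object pointer converts implicitly to `void *`.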

Now to the question: have I applied the rule of strict aliasing correctly?

Marco

1 Answer

It would be UB if uint8_t were a different type from unsigned char. Assuming uint8_t exists, that is very unlikely because

However, the standard does not explicitly require uint8_t to be the same type as unsigned char, so whether the two are identical is implementation-defined.

Consider applying the solution from the following thread to check that the aforementioned types are the same: How to assert two types are equal in c?

It is preferable to use char*/unsigned char* for accessing the data. However, if refactoring the code would be cumbersome, just add a check that uint8_t and unsigned char are the same type and reject compilation if they are not.
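One way such a compile-time check might look in C11 is sketched below. The macro name `IS_SAME_TYPE` is hypothetical, and the approach relies on `_Generic`, so it assumes a C11 or later compiler.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical macro: evaluates to 1 when T and U are compatible types, else 0.
// Pointer types are compared because the controlling expression of _Generic
// must be an expression, not a type name.
#define IS_SAME_TYPE(T, U) _Generic((T *)0, U *: 1, default: 0)

// Rejects compilation on implementations where uint8_t is not unsigned char.
static_assert(IS_SAME_TYPE(uint8_t, unsigned char),
              "uint8_t must be the same type as unsigned char");
```

Placed in a header that declares the buffer API, the assertion costs nothing at run time and turns the portability assumption into a build error on any implementation where it does not hold.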

tstanisl
  • Thank you for your answer. I'm aware that such a scenario is very unlikely. However, I don't want to get in the habit of writing code that has technically undefined behaviour. It's very hard to make assumptions in C. I guess the latter code (with char) is fine then? – Marco Oct 25 '21 at 00:25
  • @marco-a, it's preferable to use `char*`; however, if refactoring is cumbersome, you can just add an assertion that `uint8_t` and `unsigned char` are the same – tstanisl Oct 25 '21 at 07:32
  • Awesome. I'll take the time to refactor my code. – Marco Oct 25 '21 at 18:13
  • @marco-a: Just about every non-trivial program for a freestanding implementation relies upon behaviors that will generally be defined identically by nearly all freestanding implementations for a given platform, but upon which the Standard imposes no requirements. The only people who should care about things that are "technically" UB are those who want to bend over backward to accommodate Gratuitously Clever Compilers or Crazy Language-Abusing Nonsense Generators who interpret the phrase "non-portable or erroneous" as "non-portable, and therefore erroneous". – supercat Nov 02 '21 at 21:06