
Suppose I have two unsigned 32-bit numbers stored in a single array. The first number occupies positions [0; 3] and the second positions [4; 7]. I now wish to change the value of one of the numbers. Is the following code allowed, or is it problematic?

uint8_t array[8];
//...Fill it up...

uint32_t *ptr = NULL;
ptr = (uint32_t*)&array[0];
*ptr = 12345;

ptr = (uint32_t*)&array[4];
*ptr = 54321;
AmiguelS
  • 5
    yes, it is problematic, http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule – Giorgi Moniava Sep 08 '16 at 10:52
  • 2
    You violate effective type (aka strict aliasing) rule. This is a clear **don't**! Use marshalling with bitshifts/masking. – too honest for this site Sep 08 '16 at 11:12
  • 1
    you can do it the other way around though : have an array of `uint32_t`, which you fill up with the `uint32_t` values. You can then "read" that array with a `uint8_t*` (as long as you're not bothered by the data representation). – Sander De Dycker Sep 08 '16 at 11:55
  • @SanderDeDycker: That is also a bad idea, as it does not account for implementation-specifics. What problem do people have with shifting?? – too honest for this site Sep 08 '16 at 12:17
  • @Olaf : I don't have a problem with bit shifting (and would prefer it personally - although I'd use functions like `htonl` rather than doing so manually), but given the caveat I added about the value representation (which are the implementation specifics you're referring to) there's no problem with the approach I mentioned either. – Sander De Dycker Sep 08 '16 at 12:50
  • @SanderDeDycker Such code is almost always used for marshalling. A well defined data representation is important for this, so of course the data-representation **does** matter. Re `htonl` etc.: They are not available on most targets C is used for. They are not even part of the C standard. – too honest for this site Sep 08 '16 at 12:52
  • 1
    @Olaf : I don't know what the OP wants to use this for - for all I know, he just wants to do a hex dump. Or maybe the data representation is always the same in the scenarios where the code will be used. All this to say that there are valid uses for both the approach you mentioned, and the approach I mentioned. There's no reason to disparage either of them. – Sander De Dycker Sep 08 '16 at 12:57
  • @SanderDeDycker: As I wrote: marshalling (you might want to read about it!). As your approach has portability issues and relies on UB for one direction, it very well should not be used. Implemented with a common pattern the shift-variant has good chances the compiler optimizes the code well. It has no drawback like yours. Another point is that typical I/O functions take a `uint8_t *`, not an `uint32_t *` etc. So using the correct buffer type helps, too. – too honest for this site Sep 08 '16 at 13:05
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/122908/discussion-between-sander-de-dycker-and-olaf). – Sander De Dycker Sep 08 '16 at 13:08

1 Answer


You may not access a `uint8_t` array through a pointer to `uint32_t`. That violates the strict aliasing rule (the other way around would be OK, provided `uint8_t` is a character type).

Instead, you might want to use "type punning" to work around the C type system (this is permitted in C99 and later). For that, you use a union with members of the respective types:

union TypePunning {
  uint32_t the_ints[2];
  uint8_t the_bytes[2 * sizeof(uint32_t)];
};
// now e.g. write to the_bytes[1] and see the effect in the_ints[0].
// Beware of system endianness, though!
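To make the union idea concrete, here is a minimal sketch of how it could be used for the two numbers from the question. The helper names `set_number` and `get_byte` are made up for illustration and are not part of any library. Which byte of `the_bytes` ends up holding which part of the value depends on the endianness of the machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Both members share the same 8 bytes of storage. */
union TypePunning {
    uint32_t the_ints[2];
    uint8_t  the_bytes[2 * sizeof(uint32_t)];
};

/* Hypothetical helper: store `value` as number 0 or 1 via the uint32_t member. */
static void set_number(union TypePunning *p, size_t index, uint32_t value)
{
    assert(p != NULL && index < 2);
    p->the_ints[index] = value;
}

/* Hypothetical helper: read one raw byte (position 0..7) of the same storage. */
static uint8_t get_byte(const union TypePunning *p, size_t pos)
{
    assert(p != NULL && pos < sizeof p->the_bytes);
    return p->the_bytes[pos];
}
```

After `set_number(&u, 0, 12345)` on a `union TypePunning u`, a little-endian machine would have `0x39` in `the_bytes[0]` and `0x30` in `the_bytes[1]`, while a big-endian machine would have them in `the_bytes[3]` and `the_bytes[2]`.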
Daniel Jour
  • `uint8_t` is never a character type! It always is an unsigned integer type. And the other way is not ok. See 6.5p6, 1st sentence! And while not invoking UB, the union-approach also has its issues. – too honest for this site Sep 08 '16 at 13:06
  • @Olaf, better to say that `uint8_t` is an integer type and `char` is not. The standard has no broader category of "character types". In particular, `signed char` and `unsigned char` are not grouped with `char` to form such a category; the signed / unsigned types are grouped among the integer types. – John Bollinger Sep 08 '16 at 13:17
  • @JohnBollinger: 6.2.5p6, last sentence: "... The standard and extended unsigned integer types are collectively called _unsigned integer types_". `uint8_t` cannot be anything else than `unsigned char` which is a _standard unsigned integer type_. – too honest for this site Sep 08 '16 at 13:19
  • @Olaf, yes, but the standard has no collective term "character types". That's all I'm saying. It's nitpicky. – John Bollinger Sep 08 '16 at 13:23
  • 1
    @JohnBollinger: Oh, you were extending on my comment. Sorry, I didn't get that. But you are wrong - when it comes to nitpicking;-). See 6.2.5p15: "The three types char, signed char, and unsigned char are collectively called the **character types**." But it has no distinct character type like Pascal/Modula/etc. which cannot be used as integer without explicit conversion, if that's what you mean. – too honest for this site Sep 08 '16 at 13:26
  • @Olaf, well, darn. I *looked* for that, and somehow overlooked it. – John Bollinger Sep 08 '16 at 13:29
  • "You may not access an uint8_t array with a pointer to uint32_t." Is this true even though the information contained in the array is actually a 32bit number split into 4 pieces? Is this a compiler limitation? Because in terms of memory arrangement, since all the bytes of the array are contiguous, it seems to be a valid operation. – AmiguelS Sep 08 '16 at 16:06
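For reference, the shift-and-mask marshalling suggested in the comments sidesteps this question, because the array is only ever accessed as `uint8_t`. It also does not depend on `&array[4]` being suitably aligned for a `uint32_t`. A minimal sketch follows, assuming the buffer should hold the least significant byte first (little-endian layout); the names `put_u32_le` and `get_u32_le` are hypothetical helpers, not standard functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `value` into buf[0..3], least significant byte first. */
static void put_u32_le(uint8_t *buf, uint32_t value)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(value & 0xFFu);
    buf[1] = (uint8_t)((value >> 8) & 0xFFu);
    buf[2] = (uint8_t)((value >> 16) & 0xFFu);
    buf[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Hypothetical helper: read a value back from buf[0..3] in the same byte order. */
static uint32_t get_u32_le(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

With the array from the question, `put_u32_le(&array[0], 12345)` and `put_u32_le(&array[4], 54321)` would replace the two pointer writes, and the resulting byte layout is the same on every platform.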