1

I use the STB library to load images into memory. The specific function, stbi_load, returns an unsigned char* that points to the first element of an array of pixel data.

I'm tempted to use the new C++17 API for raw data, std::byte, which would allow me to be more expressive, and let me alias the raw data pixel by pixel, or color by color, by casting it to different data types (integers of different sizes).

Now I tried this:

std::unique_ptr<std::byte[], stbi_deleter>(stbi_load(...));

Of course it didn't work due to the lack of implicit conversion.

Then I tried that:

std::unique_ptr<std::byte[], stbi_deleter>(
  static_cast<std::byte*>(stbi_load(...))
);

Again, it still didn't work. I had to resort to reinterpret_cast instead, which made me question whether this conversion is legal. Can I legally convert an unsigned char* to std::byte* according to the strict aliasing rule? And can I then cast the data to another data type like std::uint32_t* and mutate it? Will that also break the aliasing rule?
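For reference, the version that compiles looks roughly like the sketch below. Here `fake_load`, `free_deleter`, `byte_buffer` and `load_as_bytes` are hypothetical stand-ins so the snippet is self-contained; the real code would call stbi_load and release the buffer with stbi_image_free through my stbi_deleter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for stbi_load: returns a malloc'ed unsigned char buffer
// filled with the values 0, 1, 2, ...
unsigned char* fake_load(std::size_t size) {
    auto* data = static_cast<unsigned char*>(std::malloc(size));
    assert(data != nullptr);
    for (std::size_t i = 0; i < size; ++i) {
        data[i] = static_cast<unsigned char>(i);
    }
    return data;
}

// Hypothetical stand-in for stbi_deleter: frees the buffer with std::free.
struct free_deleter {
    void operator()(std::byte* p) const noexcept {
        std::free(p);  // std::byte* converts implicitly to void*
    }
};

using byte_buffer = std::unique_ptr<std::byte[], free_deleter>;

// static_cast is rejected here because unsigned char* and std::byte* are
// unrelated pointer types; reinterpret_cast is the cast that compiles.
byte_buffer load_as_bytes(std::size_t size) {
    return byte_buffer(reinterpret_cast<std::byte*>(fake_load(size)));
}
```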

Guillaume Racicot
  • 3
    I have yet to see any utility for std::byte. Does anyone have a good example of where it might be useful? –  Sep 05 '18 at 22:53
  • > "which would allow me to be more expressive, and let me alias the raw data pixel by pixel." I'm pretty sure that's still UB. to_integer() is basically just a wrapper around memcpy. –  Sep 05 '18 at 23:15
  • Note an `unsigned char` array is already ideal for accessing the data pixel by pixel. (Not sure what you mean by "alias" there). The advantages of `std::byte` are making it less likely you'll accidentally use it in some types of arithmetic, and as you mentioned, expressing the nature of the data more obviously. – aschepler Sep 05 '18 at 23:26
  • @NeilButterworth I think the main point of it is to disable implicit conversions, in comparison to use of `unsigned char` – M.M Sep 05 '18 at 23:29
  • @NeilButterworth Its usefulness is thin but not absent; even just looking at streaming it (versus the gymnastics required when using the semantically-overloaded `char`) is a benefit in itself. Arguably. I'd say it's mostly a self-documentation thing tbh. – Lightness Races in Orbit Sep 05 '18 at 23:29
  • @M.M Most C++ and C code depends on completely harmless implicit integer conversions (unless we are Haskell programmers, of course). std::byte seems to involve so many problems with explicit conversions (vide this question) that I still doubt its utility. I would like to see an example of a medium to large scale C++ library or executable where std::byte was used and found to be worthwhile. –  Sep 05 '18 at 23:33
  • @NeilButterworth Personally I agree with you, although many people seem to be strongly opposed to implicit conversions – M.M Sep 05 '18 at 23:34

2 Answers

6

The strict aliasing rule never forbids any pointer conversions. It is about the type of an expression accessing an object.

std::byte may alias any other type; this is mentioned in the cppreference page you linked, as well as in the strict aliasing rule in the Standard (C++17 [basic.lval]/8.8). So it is fine to use reinterpret_cast<std::byte *> and then read or write the array of unsigned char.

If you use an expression of type uint32_t to read or write an array of unsigned char, that would violate the strict aliasing rule.
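A minimal sketch of the distinction, assuming 4-byte pixels stored in native byte order. The function names `read_pixel`, `write_pixel` and `set_first_byte` are hypothetical and only serve the illustration. Copying the bytes with std::memcpy is one common way to get a uint32_t value in and out of the buffer without accessing the unsigned char array through a uint32_t expression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads pixel number `pixel_index` by copying its bytes into a real
// std::uint32_t object, instead of dereferencing a std::uint32_t*.
std::uint32_t read_pixel(const unsigned char* data, std::size_t pixel_index) {
    assert(data != nullptr);
    std::uint32_t pixel;
    std::memcpy(&pixel, data + pixel_index * sizeof pixel, sizeof pixel);
    return pixel;
}

// Writes a pixel back the same way, byte by byte.
void write_pixel(unsigned char* data, std::size_t pixel_index, std::uint32_t pixel) {
    assert(data != nullptr);
    std::memcpy(data + pixel_index * sizeof pixel, &pixel, sizeof pixel);
}

// Accessing the buffer through std::byte is allowed by the aliasing rule.
void set_first_byte(unsigned char* data, std::byte value) {
    assert(data != nullptr);
    std::byte* bytes = reinterpret_cast<std::byte*>(data);
    *bytes = value;
}

// By contrast, this would be the violation described above:
//   *reinterpret_cast<std::uint32_t*>(data) = 0;  // uint32_t access to unsigned char array
```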

M.M
4

Can I legally convert an unsigned char* to std::byte* according to the strict aliasing rule?

Yes, that's specifically why std::byte "inherits" from unsigned char. But you have to go through reinterpret_cast<>, much like you have to when casting an arbitrary type to char* or unsigned char*.

And then can I cast the data to another datatype like std::uint32_t* and mutate it?

No. You cannot do anything with std::byte that you cannot with char* or unsigned char*.

The main thing std::byte seems to be useful for is to have functions that can have both string and raw-data overloads.

Also, it gets rid of the following annoyance:

char val = foo();
std::cout << (int)val << "\n";
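As a rough sketch of the difference: std::byte has no stream insertion operator, so printing one requires an explicit std::to_integer, and the value can never be silently printed as a character. The helper names `format_char` and `format_byte` are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Streaming a char inserts the character itself, not its numeric value.
std::string format_char(char c) {
    std::ostringstream out;
    out << c;
    return out.str();
}

// A std::byte cannot be streamed directly; the conversion to an integer
// has to be spelled out, so the output is always the numeric value.
std::string format_byte(std::byte b) {
    std::ostringstream out;
    out << std::to_integer<int>(b);
    return out.str();
}
```

With these helpers, `format_char('A')` yields the text "A", while `format_byte(std::byte{65})` yields "65".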
  • 2
    `std::byte` certainly does not inherit from `unsigned char`, if for no other reason than such a thing is impossible. The latter is the underlying type of the former, however, which is a scoped enum. – Lightness Races in Orbit Sep 05 '18 at 23:28
  • (Admittedly the distinction seems largely academic in the grand scheme of things) – Lightness Races in Orbit Sep 05 '18 at 23:28
  • 1
    It's not relevant that `unsigned char` is the underlying type of `std::byte`, since there is no general allowance for aliasing between an enumeration and its underlying type in either direction. Instead, there's a specific allowance for accessing any object via `std::byte`. – aschepler Sep 05 '18 at 23:32
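The relationship between the two types described in these comments can be checked at compile time. This is only a sketch using standard type traits; `to_uchar` is a hypothetical helper added for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// std::byte is a scoped enumeration, not a class deriving from anything.
static_assert(std::is_enum_v<std::byte>);
static_assert(!std::is_class_v<std::byte>);

// Its underlying type is unsigned char.
static_assert(std::is_same_v<std::underlying_type_t<std::byte>, unsigned char>);

// Being a scoped enum, it has no implicit conversion from or to unsigned char.
static_assert(!std::is_convertible_v<unsigned char, std::byte>);
static_assert(!std::is_convertible_v<std::byte, unsigned char>);

// Converting a value therefore needs an explicit cast.
constexpr unsigned char to_uchar(std::byte b) {
    return static_cast<unsigned char>(b);
}
```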