65

std::byte is a new type in C++17, defined as enum class byte : unsigned char {}. This makes it impossible to use without an explicit conversion. So I made an alias for a vector of this type to represent a byte array:

using Bytes = std::vector<std::byte>;

However, it cannot be used with old-style interfaces: functions that take a std::vector<unsigned char> parameter reject it, because the new type does not convert to the old one. For example, with the zipper library:

/resourcecache/pakfile.cpp: In member function 'utils::Bytes resourcecache::PakFile::readFile(const string&)':
/resourcecache/pakfile.cpp:48:52: error: no matching function for call to 'zipper::Unzipper::extractEntryToMemory(const string&, utils::Bytes&)'
     unzipper_->extractEntryToMemory(fileName, bytes);
                                                    ^
In file included from /resourcecache/pakfile.hpp:13:0,
                 from /resourcecache/pakfile.cpp:1:
/projects/linux/../../thirdparty/zipper/zipper/unzipper.h:31:10: note: candidate: bool zipper::Unzipper::extractEntryToMemory(const string&, std::vector<unsigned char>&)
     bool extractEntryToMemory(const std::string& name, std::vector<unsigned char>& vec);
          ^~~~~~~~~~~~~~~~~~~~
/projects/linux/../../thirdparty/zipper/zipper/unzipper.h:31:10: note:   no known conversion for argument 2 from 'utils::Bytes {aka std::vector<std::byte>}' to 'std::vector<unsigned char>&'

I have tried naive casts, but they do not help either. So, if std::byte is designed to be useful, will it actually be useful in old contexts? The only method I see is to use std::transform to fill the new vector of bytes in these places:

utils::Bytes bytes;
std::vector<unsigned char> rawBytes;
unzipper_->extractEntryToMemory(fileName, rawBytes);
std::transform(rawBytes.cbegin(),
               rawBytes.cend(),
               std::back_inserter(bytes),
               [](const unsigned char c) {
                   return static_cast<std::byte>(c);
               });
return bytes;

Which:

  1. Is ugly.
  2. Takes a lot of useless lines (it can be wrapped in a helper, but that helper still has to be written first :)).
  3. Copies the memory instead of just reusing the already allocated chunk in rawBytes.

So, how can it be used in old places?

Jan Schultke
VP.
  • 2
    Possible duplicate: [How to use something like `std::basic_istream<std::byte>`](https://stackoverflow.com/questions/43735918/how-to-use-something-like-stdbasic-istreamstdbyte) – Paul R Sep 11 '17 at 08:23
  • Can you not change your function parameter from `std::vector<unsigned char>&` to `std::vector<std::byte>&`? – Galik Sep 11 '17 at 08:26
  • 5
    @Galik this is not my function.. – VP. Sep 11 '17 at 08:46
  • 4
    [This discussion](https://news.ycombinator.com/item?id=13955624) contains some solid explanation for (on one side), yet also _hate of_ (on the other side), `std::byte`. I tend to agree with the _anti_-`std::byte` side more, as creating or using `std::byte` is just another case of C++ going too far in the name of "safety" (ie: type safety), which I find to be the cause of a great deal of confusion and clutter and mess in C++, further convoluting things. My preference would be to literally _never_ use `std::byte`, and so long as my C++ colleagues don't disagree too strongly, that's what I'll do. – Gabriel Staples Aug 12 '20 at 17:59
  • Note: my specialty is embedded software. I'm not sure that matters though in regards to my opinion, but maybe it helps shape how I think. – Gabriel Staples Aug 12 '20 at 18:05

4 Answers

71

You're missing the point of why std::byte was invented in the first place. It was invented to hold a raw byte in memory without the assumption that it's a character. You can see that on cppreference:

Like char and unsigned char, it can be used to access raw memory occupied by other objects (object representation), but unlike those types, it is not a character type and is not an arithmetic type.

Remember that C++ is a strongly typed language in the interest of safety (so implicit conversions are restricted in many cases). Meaning: if an implicit conversion from std::byte to char were possible, it would defeat the purpose.

So, to answer your question: to use it, you have to cast explicitly whenever you convert a value to or from it:

std::byte x = static_cast<std::byte>(10);
std::byte y = static_cast<std::byte>('a');
std::cout << std::to_integer<int>(x) << std::endl;
std::cout << static_cast<char>(y) << std::endl;

Anything else will not work, by design! So that transform is ugly, agreed, but if you want to store chars, then use char. Don't use std::byte unless you want to store raw memory that should not be interpreted as char by default.

Also, the last part of your question is generally incorrect: you don't have to copy the whole vector. If you temporarily need to read a byte as a char, simply static_cast it at the place where you need it as a char. It costs nothing and is type-safe.
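
As a minimal sketch of casting only at the point of use, assume you want to check whether a buffer starts with the two-character ZIP signature "PK". The helper name looks_like_zip is hypothetical and not part of zipper or the standard library:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of any library): inspects the first two
// bytes as characters without building a second container.
bool looks_like_zip(const std::vector<std::byte>& bytes) {
    return bytes.size() >= 2
        && static_cast<char>(bytes[0]) == 'P'   // cast one element, where needed
        && static_cast<char>(bytes[1]) == 'K';
}
```

The buffer stays a std::vector<std::byte> throughout; only the two elements being compared are ever viewed as char.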


As to your question in the comment about casting std::vector<char> to std::vector<std::byte>: you can't do that. But you can use the raw array underneath:

std::vector<std::byte> bytes;
// fill it...
char* charBytes = reinterpret_cast<char*>(bytes.data());

This has type char*, which is a pointer to the first element of your array, and can be dereferenced without copying, as follows:

std::cout << charBytes[5] << std::endl; //6th element of the vector as char

You get the size from bytes.size(). This is valid because std::vector stores its elements contiguously in memory. You can't do this with non-contiguous containers (std::deque, std::list, etc.).

While this is valid, keep in mind that it removes part of the safety from the equation. If you need char, don't use std::byte.
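
The same pointer trick works in both directions when an old interface takes a pointer and a size rather than a vector. The sketch below assumes a made-up legacy function, legacy_sum, standing in for whatever old API you have; sum_bytes and as_bytes are likewise hypothetical names:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical legacy API (an assumption for illustration): it only
// understands unsigned char buffers.
unsigned legacy_sum(const unsigned char* data, std::size_t size) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += data[i];
    }
    return sum;
}

// Hands a std::byte buffer to the legacy API without copying it.
// unsigned char is allowed to alias any object, so this access is valid.
unsigned sum_bytes(const std::vector<std::byte>& bytes) {
    return legacy_sum(reinterpret_cast<const unsigned char*>(bytes.data()),
                      bytes.size());
}

// The opposite direction: view a legacy buffer as std::byte, again
// without copying. The pointer is only valid while `raw` is alive and
// not reallocated; the length is raw.size().
const std::byte* as_bytes(const std::vector<unsigned char>& raw) {
    return reinterpret_cast<const std::byte*>(raw.data());
}
```

This does not help with an interface that insists on a std::vector<unsigned char>& itself, as in the question; there the vector type is part of the signature, and only the element storage can be reinterpreted.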

The Quantum Physicist
1

If you want something that behaves like a byte the way you'd probably expect, but is named distinctly from unsigned char, use uint8_t from stdint.h. On almost all implementations this will probably be a

typedef unsigned char uint8_t;

and so again an unsigned char under the hood, but who cares? You just want to emphasize "This is not a character type". Just don't expect to be able to have two overloads of a function, one for unsigned char and one for uint8_t. If you try, the compiler will rub your nose in it anyway...
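
A small sketch of what that means in practice, assuming a C++17 compiler on a platform that provides uint8_t at all. The names uint8_is_unsigned_char and add_wrapping are made up for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// std::byte is always a distinct type, so it never collides with unsigned char.
static_assert(!std::is_same_v<std::byte, unsigned char>);

// Where std::uint8_t exists, it is exactly 8 bits wide, so its size is 1.
static_assert(sizeof(std::uint8_t) == 1);

// True on platforms where uint8_t is just another name for unsigned char.
// This is common but an assumption, not a guarantee.
constexpr bool uint8_is_unsigned_char =
    std::is_same_v<std::uint8_t, unsigned char>;

// Unlike std::byte, uint8_t supports arithmetic directly. The operands are
// promoted to int, so the result is cast back to wrap around at 256.
constexpr std::uint8_t add_wrapping(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// When uint8_is_unsigned_char is true, these two declarations name the same
// function, so giving each its own definition is a redefinition error:
//   void dump(unsigned char);
//   void dump(std::uint8_t);
```

So uint8_t buys you a clearer name and ordinary arithmetic, while std::byte buys you a genuinely separate type that can be overloaded on.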

Don Pedro
  • 3
    It's not correct since chat can be 16bits. There is stdint standard header with all int*_t types . – NN_ Oct 12 '18 at 18:07
  • 3
    I think the assumption of a char ("chat"?) with 16 bits is not happening. See this link that explains it: [link](https://gustedt.wordpress.com/2010/06/01/how-many-bits-has-a-byte/) So you can be pretty sure that CHAR_BIT is 8, type char has (by definition!) a sizeof 1 and it can store 8 bits, not 16, on today's machines. – Don Pedro Oct 16 '18 at 08:21
  • 2
    Incorrect. It does happen. For instance TI TMS320C55x: type size (bits) ------------------------------ char 16 short 16 int 16 long 32 long long 40 float 32 double 32 – NN_ Oct 16 '18 at 10:44
  • 1
    OK, admitted there are architectures where a char is more than 8 bits wide. However then: - Let's not discuss about if these architectures are "rare" - As a char is the smallest addressable unit there cannot exist an int8_t - This platform is not POSIX compliant as it requires int8_t - A byte has 8 bits (not 16, this is a word then), see wikipedia - So you do not have any 8 bit wide datatypes at hand - in that case it makes no sense at all to talk about a byte datatype, does it??? Discussion gets rather futile here, doesn't it? – Don Pedro Oct 18 '18 at 08:48
  • 1
    Also, if you read my original posting I recommended to use uint8_t from stdint.h and I cited what you will **likely** (not necessarily) find in there on common platforms. I did not recommend to do the `typedef unsigned char uint8_t;` yourself as it of course will vary with the respective platform you're targeting (plus you'd pollute the std-namespace which is illegal). And if you're on a platform that does not have 8 bit wide datatypes, be it char, signed char, unsigned char, int8_t and/or uint8_t then you are simply lost. So I cannot see what is wrong/incorrect with the original post. – Don Pedro Oct 18 '18 at 09:01
0

If your old-style code takes ranges or iterators as arguments, you can continue to use those. In the few cases where you cannot (such as explicit range-based constructors), you could in theory write a new iterator class that wraps an iterator to unsigned char and converts *it to std::byte&.
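
Such a wrapper could look roughly like the sketch below. It is a simplification: it yields std::byte by value rather than std::byte&, so it only qualifies as an input iterator, and the names ByteIterator and to_bytes are made up for illustration:

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical adaptor: wraps an iterator over unsigned char and yields
// each element converted to std::byte.
template <typename It>
class ByteIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::byte;
    using difference_type = typename std::iterator_traits<It>::difference_type;
    using pointer = const std::byte*;
    using reference = std::byte;  // returned by value, not by reference

    explicit ByteIterator(It it) : it_(it) {}

    reference operator*() const { return static_cast<std::byte>(*it_); }
    ByteIterator& operator++() { ++it_; return *this; }
    ByteIterator operator++(int) { ByteIterator old = *this; ++it_; return old; }

    friend bool operator==(const ByteIterator& a, const ByteIterator& b) {
        return a.it_ == b.it_;
    }
    friend bool operator!=(const ByteIterator& a, const ByteIterator& b) {
        return !(a == b);
    }

private:
    It it_;
};

// Uses the range constructor of std::vector<std::byte>, which would not
// accept plain unsigned char iterators.
std::vector<std::byte> to_bytes(const std::vector<unsigned char>& raw) {
    using RawIt = std::vector<unsigned char>::const_iterator;
    return std::vector<std::byte>(ByteIterator<RawIt>(raw.begin()),
                                  ByteIterator<RawIt>(raw.end()));
}
```

Note that this still copies the elements into the new vector; it only removes the explicit std::transform call from the calling code.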

Davislor
-1

If you really want to do it and you're sure it's safe, you can use a pointer cast:

std::vector<std::byte> v;
void f(std::vector<unsigned char>& v);
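// Caution: treating one vector type as another is undefined behavior, even if it appears to work.
f(*reinterpret_cast<std::vector<unsigned char>*>(&v));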
  • 1
    You shouldn't be casting a pointer to the vector itself, instead, for changing the type of a vectors data, use `static_cast(v.data())` and let the function accept a `std::span` or a raw pointer to the data. – pnda May 12 '22 at 19:26