48

For regular C strings, a null character '\0' signifies the end of data.

What about std::string? Can it hold a string with embedded null characters?

Keith Thompson
WilliamKF
  • 1
    See [std::string equivalent for data with NULL characters?](http://stackoverflow.com/questions/1534335/stdstring-equivalent-for-data-with-null-characters) – Matthew Flaschen May 16 '10 at 22:46

4 Answers

51

Yes, you can have embedded nulls in your std::string.

Example:

std::string s;
s.push_back('\0');
s.push_back('a');
assert(s.length() == 2);

Note: std::string's c_str() member always returns a buffer with a null character appended after the last character. Before C++11, std::string's data() member was not required to append a null character to the returned buffer; since C++11 it is, and data() and c_str() are equivalent.
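A minimal sketch of the note above, assuming a C++11 or later compiler. The helper names `make_embedded`, `c_str_terminator`, and `data_terminator` are hypothetical and exist only for illustration:

```cpp
#include <string>

// Hypothetical helper: builds the two-character string from the example above.
std::string make_embedded() {
    std::string s;
    s.push_back('\0');
    s.push_back('a');
    return s;
}

// Hypothetical helper: the character one past the last element, seen through c_str().
// c_str() guarantees this is '\0', even when the string contains earlier nulls.
char c_str_terminator(const std::string& s) {
    return s.c_str()[s.size()];
}

// Hypothetical helper: the same position seen through data().
// Reading it is only guaranteed to be valid since C++11.
char data_terminator(const std::string& s) {
    return s.data()[s.size()];
}
```

The embedded null counts toward size(), while the terminator sits after all stored characters and does not.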

Be careful of operator+=

One thing to look out for: do not use operator+= with a char* on the right-hand side. It appends only the characters up to the first null character.

For example:

std::string s = "hello";
s += "\0world";
assert(s.length() == 5);

The correct way:

std::string s = "hello";
s += std::string("\0world", 6);
assert(s.length() == 11);
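Two other ways to get the same result, shown as a sketch: append() with an explicit length, and the std::string literal suffix, which assumes C++14 or later. The function names are hypothetical:

```cpp
#include <string>

// Appends "\0world" (6 characters, including the leading null)
// by passing the length explicitly, so no scan for '\0' takes place.
std::string append_with_length() {
    std::string s = "hello";
    s.append("\0world", 6);
    return s;
}

// Same result with the std::string literal suffix (C++14 and later),
// which takes the length from the literal itself.
std::string append_with_literal() {
    using namespace std::string_literals;
    std::string s = "hello";
    s += "\0world"s;
    return s;
}
```

Both avoid constructing the temporary by hand, and both produce an 11-character string with the null at index 5.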

For binary data, std::vector is more common

Generally it's more common to use std::vector to store arbitrary binary data.

std::vector<char> buf;
buf.resize(1024);
char *p = &buf.front();

This is probably more common because std::string's data() and c_str() members return const pointers, so the memory cannot be modified through them (C++17 later added a non-const data() overload). With &buf.front() you are free to modify the contents of the buffer directly.
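As a rough sketch of that pattern, the hypothetical helpers below copy raw bytes into a vector through its mutable pointer and convert the buffer back to a std::string without losing embedded nulls:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies n raw bytes (which may include nulls) into a vector.
std::vector<char> to_buffer(const char* bytes, std::size_t n) {
    std::vector<char> buf(n);
    if (n != 0) {
        // &buf.front() is a plain char*, so the buffer can be written directly.
        std::memcpy(&buf.front(), bytes, n);
    }
    return buf;
}

// Hypothetical helper: the iterator-range constructor copies every element,
// so nulls inside the buffer are preserved.
std::string buffer_to_string(const std::vector<char>& buf) {
    return std::string(buf.begin(), buf.end());
}
```

The emptiness check matters because calling front() on an empty vector is undefined behavior.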

Brian R. Bondy
  • 2
    In C++9x `&s.front()` is also modifiable and guaranteed to point at a contiguous buffer. While there was no such guarantee in C++03, there are no known C++ implementations for which it didn't hold true in practice (which is partly why it was added to C++0x so quickly). – Pavel Minaev Jul 22 '10 at 23:43
  • 11
    Note that as of C++11, `.c_str()` and `.data` are synonyms. In particular, this means that the string returned by `.data` must have a null terminator appended. – nneonneo Feb 06 '13 at 23:49
  • @PavelMinaev: I presume "C++9x" was a typo for "C++0x" (which became C++11 some time after you posted your comment). – Keith Thompson Nov 03 '15 at 21:13
  • `s.append("\0world", 6);` is better than `s += std::string("\0world", 6);` – n.caillou Dec 11 '17 at 02:08
7

Yes. A std::string is just a vector<char> with benefits.

However, be careful about passing such a beast to something that calls .c_str() and stops at the 0.
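To illustrate the pitfall, here is a small sketch with hypothetical helper names. Anything that receives only the pointer from c_str() sees the data up to the first null and nothing more:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as a C API would see it: strlen counts up to the first null only.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Round-tripping through a bare const char* drops everything
// from the first null onward, because the length is not carried along.
std::string round_trip(const std::string& s) {
    return std::string(s.c_str());
}
```

Passing s.size() alongside the pointer, or passing the std::string itself, avoids the truncation.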

bmargulies
  • 1
    The first is not true, as I recently learned. Vector's swap preserves iterators and references to contents, string's not necessarily. http://stackoverflow.com/questions/25201758/stringswap-complexity-under-visual-studio – Notinlist Aug 12 '14 at 13:38
  • @Notinlist: It has a different name, too! Oh the horror – Lightness Races in Orbit May 21 '17 at 11:28
1

You can, but why would you want to? Embedding NUL in a std::string is just asking for trouble, because functions to which you pass a std::string may very well use its c_str() member, and most will assume that the first NUL indicates the end of the string. Hence this is not a good idea. Also note that in UTF-8, only the NUL character itself (U+0000) encodes to a 0 byte, so even for i18n purposes there is no justification for embedding NULs.

Michael Aaron Safyan
  • Thank you for explaining why *not* to do it. – Snoop Mar 13 '17 at 13:42
  • 3
    No, it's silly. "Don't use the full range of `std::string`'s functionality, because you _might_ pass the result of `c_str()` to C-string functions without also passing a length", really? Well, if you never do that, you'll be fine... – Lightness Races in Orbit May 21 '17 at 11:27
-1

Yep this is valid.

You can have a null character in the middle of the string.

However, if you use a std::string with a null character in the middle with a C string function, you're in undefined behaviour town - and nobody wants to be there!!!:

 int n = strlen( strWithNullInMiddle.c_str() ); // Boom!!!
Robben_Ford_Fan_boy
  • 20
    `strlen` will just return the number of characters before the first null. It might be unanticipated behavior, but it's not undefined. – Matthew Flaschen May 16 '10 at 22:47