12

I am having problems with std::string.

The problem is that '\0' is being recognized as the end of the string, as in C-style strings.

For example, the following code:

#include <iostream>
#include <string>

int main ()
{
    std::string s ("String!\0 This is a string too!");
    std::cout << s.length(); // same result as with s.size()
    std::cout << std::endl << s;

    return 0;
}

outputs this:

7
String!

What is the problem here? Shouldn't std::string treat '\0' just as any other character?

Rakete1111
galaxyworks
  • 3
    Why is it a problem? – klutt Mar 18 '17 at 12:59
  • 1
    Because c++ string shouldn't be null terminated (i think) and should be treated like any other character – galaxyworks Mar 18 '17 at 13:01
  • Yes, but that does not answer my question. :) Why do you have a \0 in the string in the first place? It's not a printable character anyway. – klutt Mar 18 '17 at 13:02
  • 1
    Because I didn't make the string. For example if someone passes an array of chars to a function that takes string as a parameter (for example std::string mystring(std::string s) {...} and if someone to this function passes something like this: mystring("String!\0 This is a string too!"). length of that string will stop at \0 – galaxyworks Mar 18 '17 at 13:13
  • @galaxyworks: Then the caller of that function shouldn't do that. – Christian Hackl Mar 18 '17 at 13:24
  • 1
    Possible duplicate of [How do you construct a std::string with an embedded null?](http://stackoverflow.com/questions/164168/how-do-you-construct-a-stdstring-with-an-embedded-null) – Rakete1111 Mar 18 '17 at 14:36
  • The Answer Is Already At http://stackoverflow.com/questions/164168/how-do-you-construct-a-stdstring-with-an-embedded-null/42876357#42876357 – Mohammad Tayyab Mar 18 '17 at 17:08
  • @ChristianHackl tell that to Win32 developers! :D – coolhandle01 Jun 10 '19 at 12:42

7 Answers

17

Think about it: if you are given a const char*, how will you determine which 0 is the true terminator and which is an embedded one?

You need to either explicitly pass the size of the string, or construct the string from two iterators (pointers, in this case).

#include <string>
#include <iostream>


int main()
{
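    auto& str = "String!\0 This is a string too!";
    // Note: std::end(str) points past the literal's terminating '\0',
    // so that '\0' is copied as well and s.size() is 31, not 30.
    std::string s(std::begin(str), std::end(str));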
    std::cout << s.size() << '\n' << s << '\n';
}

Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d

Edit: @Rakete1111 reminded me about string literals:

using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;
Revolver_Ocelot
  • 6
    Why not use string literals? `auto str = "String!\0 This is a string too!"s;` – Rakete1111 Mar 18 '17 at 13:21
  • This answer is closest to what I was looking for! thanks :) – galaxyworks Mar 18 '17 at 14:49
  • 2
    The interesting corollary question is why isn't the `std::string` constructor overloaded to work with string literals to handle embedded nul chars? From your answer it is clear that the compiler itself is obviously not in the least confused about where the string ends, otherwise `std::end` would give an incorrect answer too. It is only when going to the standard library that the information gets downgraded to a simple `const char *` and therefore lost. – user4815162342 Mar 18 '17 at 15:15
  • @user4815162342 I think, it is to prevent problems like "_why my string is full of garbage characters?_". When I hear _char array_, I think _buffer_. In most cases you do not fill whole buffer: usually it is created large enough to contain any string written in it and then passed to some function, which fills it. So, I believe, there was more demand for treating char arrays like c-strings. – Revolver_Ocelot Mar 18 '17 at 15:22
  • @user4815162342: That's an excellent question. All it would take is a simple `template basic_string(CharT const (&array)[Size]);`. I'm sure there's a good reason, though. Strangely enough, I cannot find any discussion about this anywhere on SO. – Christian Hackl Mar 19 '17 at 10:07
  • @ChristianHackl Note that, to be useful, the overload should explicitly not include the terminating nul. This is also an issue with the code in the answer - the created string will contain two nul chars (and the in-memory representation even three), where the OP expected only one. – user4815162342 Mar 19 '17 at 11:14
4

Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:

basic_string( const CharT* s,
              const Allocator& alloc = Allocator() );

The "String!\0 This is a string too!" char const[] array is converted to a pointer to its first char element. That pointer is passed to the constructor and is all the information it has. To determine the size of the string, the constructor has to increment the pointer until it finds the first '\0'. And that happens to be the one inside the array.


If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.

Christian Hackl
3

You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:

{'f', '\0', 'o', '\0'}

The string constructor taking a char const* will be called, and will be implemented something like this:

string(char const* s) {
    auto e = s;
    while (*e != '\0') ++e;

    m_length = e - s;
    m_data = new char[m_length + 1];
    memcpy(m_data, s, m_length + 1);
}

Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string.

If you want the embedded '\0' to be kept as part of the string, you can use a std::string literal (the s suffix, available since C++14):

#include <iostream>
#include <string>

int main ()
{
    using namespace std::string_literals;

    std::string s("String!\0 This is a string too!"s);
    std::cout << s.length(); // same result as with s.size()
    std::cout << std::endl << s;

    return 0;
}

Output:

30
String! This is a string too!
Joseph Thomson
1

\0 is known as a terminating character so you'll need to skip it somehow.

String representation

Take that as an example.

So whenever you want to escape a special character, use two backslashes, as in "\\0".

And '\\0' is a two-character literal

   std::string test = "Test\\0 Test";

Results :

   Test\0 Test

Most beginners also make this mistake when opening files, for example:

 std::ifstream some_file("\new_dir\test.txt"); // Wrong: \n and \t are parsed as escape sequences
 // You should write it like this:
 std::ifstream some_file("\\new_dir\\test.txt"); // Correct
lowarago
0

In short, you're constructing your C++ string from a standard C string.

Standard C strings are zero-terminated, so your C string argument ends at the first \0 character found. That character is the one you explicitly wrote in your string "String!\0 This is a string too!",

not the second one, which the compiler implicitly and automatically appends to the end of the C string.

Hilton Fernandes
-1

That's not a problem, that's the intended behavior.

Maybe you could elaborate why you have a \0 in your string.

Using a std::vector would allow you to use \0 in your string.

Simon
  • 1
    `std::string` is just fine with `'\0'` byte in the string, no need to use `std::vector` just because of it. – hyde Mar 18 '17 at 13:28
-2

Escape your \0

std::string s ("String!\\0 This is a string too!");

and you will get what you need:

31
String!\0 This is a string too!
Random Guy