std::string s1;
std::string s2;
assert(strlen(s1.c_str()) == 0);
assert(s1.c_str() == s2.c_str());

Are these two assertions always true?

I am using C++11 and have checked the standard. Table 63 in §21.4.2 says:

data() a non-null pointer that is copyable and can have 0 added to it

size() 0

capacity() an unspecified value

I think c_str() is the same as data(), but I have some questions about this definition.

  1. Does "CAN have 0 added to it" == "MUST and ALWAYS have 0 added to it"?

  2. Do all default-constructed std::string objects share the same underlying buffer?

I tested with GCC, and both assertions hold. Do they always hold for all compilers?

Daniel Langr
    What is your version of GCC and compilation options? With GCC/libstdc++, the second assert should be false since C++11, where SSO is implemented. The same holds for Clang/libc++, and MSVC. Live demo: https://godbolt.org/z/7zj5bdT9v. – Daniel Langr Jun 17 '22 at 06:10
    @DanielLangr I am not sure what the default is, but libstdc++ has a configuration switch which decides whether a shared static object is used for empty strings and I think I remember previously seeing issues with it: `--enable-fully-dynamic-string`, see https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html. – user17732522 Jun 17 '22 at 06:22

1 Answer


The first assertion is guaranteed to succeed. c_str() always returns a pointer to a null-terminated string with the same contents as the std::string object, which for the default-constructed s1 is an empty string.

The second assertion is not guaranteed to succeed. Nothing requires c_str() to return the same pointer for two std::string objects just because their contents are equal. Default-constructed strings do not need to share the same underlying buffer; whether they do is an implementation detail of a particular standard library. (If I remember correctly, libstdc++ does something like this depending on its configuration, possibly for backwards-compatibility reasons; see the --enable-fully-dynamic-string configure option.)

Note that prior to C++11, data() did not have the same effect as c_str(). data() was not guaranteed to give a pointer to a null-terminated string. If the string was empty, then the pointer returned by it was not allowed to be dereferenced. So replacing c_str() with data() in your examples would, prior to C++11, result in undefined behavior on the call to strlen.


The wording "and can have 0 added to it" is somewhat odd, and I am not completely sure what it is supposed to convey. In C++11 (draft N3337), however, data()'s return value is further specified in [string.accessors]/1: data() + i == &operator[](i) for every i in the range [0, size()]. In turn, [string.access]/2 specifies that operator[](size()) returns a reference to a CharT() (i.e. a null character), without any conditions.

The strange wording has also been replaced via editorial change in 2018, see https://github.com/cplusplus/draft/pull/1879.

user17732522