In C++11, the characters of a std::string have to be stored contiguously, as § 21.4.1/5 points out:
The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
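To make the quoted identity concrete, here is a minimal sketch that checks it for a given string. The helper name is_contiguous is my own and not part of the standard; on a conforming C++11 implementation it should return true for any string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: checks the contiguity identity from § 21.4.1/5,
// &*(s.begin() + n) == &*s.begin() + n, for every n in [0, s.size()).
bool is_contiguous(const std::string& s) {
    for (std::size_t n = 0; n < s.size(); ++n) {
        const std::string::difference_type off =
            static_cast<std::string::difference_type>(n);
        if (&*(s.begin() + off) != &*s.begin() + off) {
            return false;
        }
    }
    // An empty string has no elements to compare, so it passes trivially.
    return true;
}
```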
However, here is how § 21.4.7.1 lists the two functions to retrieve a pointer to the underlying storage (emphasis mine):
const charT* c_str() const noexcept;
const charT* data() const noexcept;
1 Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
2 Complexity: constant time.
3 Requires: The program shall not alter any of the values stored in the character array.
One possibility I can think of for point number 3 is that the pointer can become invalidated by the following uses of the object (§ 21.4.1/6):
- as an argument to any standard library function taking a reference to non-const basic_string as an argument.
- Calling non-const member functions, except operator[], at, front, back, begin, rbegin, end, and rend.
Even so, iterators can become invalidated in the same way, yet we are still allowed to modify the string through them until they do. Likewise, the pointer can still be used to read from the buffer until it becomes invalidated.
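As a small illustration of that asymmetry, the sketch below writes to every character through a mutable iterator and then reads the result back through the const pointer from data(). The function name fill_via_iterators is only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: writes through iterators (permitted), then reads
// the same storage through the const pointer returned by data().
std::string fill_via_iterators(std::size_t length, char value) {
    std::string s(length, ' ');
    for (std::string::iterator it = s.begin(); it != s.end(); ++it) {
        *it = value;  // modifying an element through an iterator is allowed
    }
    const char* p = s.data();  // read-only view of the same characters
    return std::string(p, s.size());
}
```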
Why can't we write directly to this buffer? Is it because doing so would put the class in an inconsistent state, since, for example, end() would not be updated with the new end? If so, why is it permitted to write directly to the buffer of something like std::vector?
Use cases for this include passing the buffer of a std::string to a C interface to retrieve a string, instead of passing in a vector<char> and then initializing the string with iterators from that:
std::string text;
text.resize(GetTextLength());
GetText(text.data());
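For comparison, the vector<char> route mentioned above might look like the following sketch. The bodies of GetTextLength and GetText are stand-ins I made up so the example is self-contained; I am assuming the length excludes any terminator and that GetText writes exactly that many characters, which a real C interface may not do.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-ins for the C interface; real signatures may differ.
static const char kSample[] = "hello";
std::size_t GetTextLength() { return sizeof(kSample) - 1; }
void GetText(char* buffer) { std::memcpy(buffer, kSample, sizeof(kSample) - 1); }

// Fills a vector<char> through its mutable data() pointer, then copies
// the characters into a std::string via the vector's iterators.
std::string ReadTextViaVector() {
    std::vector<char> buffer(GetTextLength());
    if (!buffer.empty()) {
        GetText(buffer.data());  // vector<char>::data() returns char*
    }
    return std::string(buffer.begin(), buffer.end());
}
```

This works, but it costs an extra allocation and copy compared with writing into the string's own storage.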