My copy of a draft C++ standard (labelled "ISO/IEC JTC1 SC22 WG21 N3690 Date: 2013-05-15") has the following definition for basic_string::c_str() and basic_string::data().

const charT* c_str() const noexcept; 
const charT* data() const noexcept;

Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].

Complexity: constant time.

Requires: The program shall not alter any of the values stored in the character array.

It appears, then, that the following C++ program has undefined behaviour, as it trips over the requirement from c_str():

#include <string>
int main() {
  std::string foo = "foo";
  foo.c_str();
  foo[2] = 'p';
}

This seems breathtakingly stupid. Have I misread the standard, or is this requirement on c_str a relic from a bygone era?

tmyklebu
  • possible duplicate of [Why is modifying a string through a retrieved pointer to its data not allowed?](http://stackoverflow.com/questions/14290795/why-is-modifying-a-string-through-a-retrieved-pointer-to-its-data-not-allowed) – Pradhan Feb 08 '15 at 06:45
  • @Pradhan: Nice find. But the answer to that question states that modifying the string via the `&foo[0]` pointer is perfectly valid, which, from the plain language of the standard, it isn't. I shot the answerer a comment to that effect. – tmyklebu Feb 08 '15 at 06:53
  • Not sure if there are some guidelines to reading the standard which would enforce the interpretation that modifying it via `&foo[0]` is valid. I agree that the wording supports your interpretation. At least according to my interpretation :) – Pradhan Feb 08 '15 at 06:55

2 Answers

The particular phrasing is a relic from the C++03-era specification that permitted copy-on-write strings. At some point in the past the spec for c_str() read:

Returns: A pointer to the initial element of an array of length size() + 1 whose first size() elements equal the corresponding elements of the string controlled by *this and whose last element is a null character specified by charT().

Requires: The program shall not alter any of the values stored in the array. Nor shall the program treat the returned value as a valid pointer value after any subsequent call to a non-const member function of the class basic_string that designates the same object as this.

in which context the requirement made a lot more sense. If c_str() returned a pointer to a buffer shared between different std::strings, modifying the values in the array would silently change every string sharing that buffer.
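To make the sharing hazard concrete, here is a minimal sketch of a copy-on-write string. `CowString`, `set`, and the two helper functions are hypothetical names invented for illustration; this is not how any particular standard library implemented it, only a toy under the assumption that copies share one buffer until a write goes through the class interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical toy class, not a real library implementation:
// copies share one null-terminated buffer until a write un-shares it.
class CowString {
public:
    explicit CowString(const char* s)
        : buf_(std::make_shared<std::vector<char>>(s, s + std::strlen(s) + 1)) {}

    // Read-only view of the (possibly shared) buffer.
    const char* c_str() const noexcept { return buf_->data(); }

    // Writing through the interface copies the buffer first if it is shared.
    void set(std::size_t i, char c) {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::vector<char>>(*buf_);
        (*buf_)[i] = c;
    }

private:
    std::shared_ptr<std::vector<char>> buf_;
};

// Returns true if a write through a's c_str() pointer also shows up in b.
bool write_through_c_str_leaks() {
    CowString a("foo");
    CowString b = a;                          // b shares a's buffer
    char* p = const_cast<char*>(a.c_str());
    p[2] = 'p';                               // bypasses the un-sharing step
    return std::strcmp(b.c_str(), "fop") == 0;
}

// Returns true if a write through the interface leaves b untouched.
bool write_through_interface_is_isolated() {
    CowString a("foo");
    CowString b = a;
    a.set(2, 'p');                            // a gets its own copy first
    return std::strcmp(a.c_str(), "fop") == 0 &&
           std::strcmp(b.c_str(), "foo") == 0;
}
```

In this toy, both helpers return true: the write through the c_str() pointer corrupts `b` as well, while the write through `set` does not. That is the situation the old "shall not alter" wording guarded against.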

In C++14, this prohibition makes very little sense. Read as forbidding any modification of the string after a c_str() call, it is absurd, as you pointed out; read as forbidding modification through the returned pointer, it is slightly more defensible, but not by much. There's no real reason why the semantics should differ between the pointer returned by c_str() and the pointer obtained using &operator[](0).
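As a rough illustration of why the two pointers are hard to tell apart, the sketch below checks the quoted "Returns:" clause directly and then writes through operator[] after calling c_str(). The function names are made up for this example, and it assumes a C++11-or-later library with contiguous, non-copy-on-write storage; it shows what implementations do in practice, not a ruling on what the wording permits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Checks p + i == &operator[](i) for every i in [0, size()],
// and that data() returns the same pointer as c_str().
bool c_str_matches_subscript(const std::string& s) {
    const char* p = s.c_str();
    for (std::size_t i = 0; i <= s.size(); ++i)
        if (p + i != &s[i]) return false;
    return s.data() == p;
}

// Writes through operator[] and reads the result back through a
// pointer obtained from c_str() before the write.
std::string write_after_c_str() {
    std::string foo = "foo";
    const char* p = foo.c_str();
    foo[2] = 'p';            // operator[] does not reallocate the buffer
    return std::string(p);   // reads the same storage the write touched
}
```

With such a library, `c_str_matches_subscript(std::string("foo"))` holds and `write_after_c_str()` yields "fop", since both access paths name the same array.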

T.C.
  • OK. I gather that this is nonsense and that I should learn how to file a defect report? (And that this has always been specified in a buggy way that assumed a particular uncommon implementation?) – tmyklebu Feb 08 '15 at 06:34
  • @tmyklebu COW strings used to be pretty common in C++03 standard libraries. But yes, I think this is a defect. – T.C. Feb 08 '15 at 06:40

Your interpretation is wrong. The array pointed to by the char * returned should not be modified.

This isn't allowed:

#include <string>
int main() 
{
    std::string foo = "foo";
    char * ptr = (char *)foo.c_str();
    ptr[2] = 'p'; // undefined
}

The original string can be modified, but doing so will invalidate the pointer returned by c_str().

user93353
  • Under "Returns:", it specifies that the pointer returned by `foo.c_str()` is equal (in the `==` sense) to the pointer returned by `&foo[0]`. Thus my code is doing exactly what is forbidden. Also, `c_str()` returns a `const char *`, not a `char *` as you state. – tmyklebu Feb 08 '15 at 06:18
  • @tmyklebu It doesn't say that foo.c_str() == &foo[0]. It says that foo.c_str()[0] == foo[0]. That allows c_str() and data() to create/manage a copy of the string contents, and return a pointer to the copy. – Rob Feb 08 '15 at 06:53
  • @Rob: No; `foo.c_str() == &foo[0]` is the special case `i = 0` of `c_str() + i == &foo[i]`. Also, the copying business is forbidden because `c_str()` must take constant time. – tmyklebu Feb 08 '15 at 06:57