
I came across some code that has several instances of the example below:

std::string str = "This  is my string";
char* ptr = const_cast<char*>(str.c_str());

ptr[5] = 'w';
ptr[6] = 'a';

In this simplified example there is an assignment of std::string::c_str(), which returns const char*, to a char* pointer using const_cast.

This seems to work as intended and str is modified accordingly.

But the description of std::string::c_str() in my local library reads as follows:

Return const pointer to null-terminated contents. This is a handle to internal data. Do not modify or dire things may happen.

And in cppreference the description includes:

Returns a pointer to a null-terminated character array with data equivalent to those stored in the string.

...

Writing to the character array accessed through c_str() is undefined behavior.

In the standard [string.accessors] the description is similar, and no information is provided about this "...null-terminated character array with data equivalent to those stored in the string...".

Instead of clarifying the issue, this confused me further. What character array? How is it equivalent? Is it a copy? And if so, why does modifying it also modify the original string? Is this implementation-defined?

I would like to avoid the task of altering all the instances of the code where this is used, so my question is:

Is there any way this can be correct? Can it be legal to use the ptr pointer to modify the string str?

anastaciu
  • 3
    nope, just because it happens to work doesn't mean it's not undefined behaviour, change `c_str` to `data` to fix it – Alan Birtles Sep 07 '20 at 17:46
  • @AlanBirtles, that is my understanding also, but I was hoping that there was some way this could be correctly used. – anastaciu Sep 07 '20 at 17:48
  • 1
    In older standards, `std::string` was much more free-form than it is after C++11. The implementation was allowed to do (but rarely did) extremely strange things to produce the character array returned by `c_str`. – user4581301 Sep 07 '20 at 17:48
  • 1
    @anastaciu Always be extremely careful what you're doing with `const_cast` (it's often enough as bad as `reinterpret_cast`). – πάντα ῥεῖ Sep 07 '20 at 17:48
  • @πάνταῥεῖ, yes, my original instincts are to steer clear of this type of construct, unfortunately I will have to deal with it nevertheless. – anastaciu Sep 07 '20 at 17:50
  • 4
    Use `str.data()` instead, it returns a non-const pointer. Or `&str.front()` or `&str[0]`, if you are sure the string is not empty. All these produce a modifiable pointer to the string's buffer. – Igor Tandetnik Sep 07 '20 at 17:50
  • @anastaciu [Does this answer your question](/questions/29347041/is-there-a-way-to-set-the-length-of-a-stdstring-without-modifying-the-buffer-c)? Or [this](https://stackoverflow.com/questions/25169915/is-writing-to-str0-buffer-of-a-stdstring-well-defined-behaviour-in-c11)? – PaulMcKenzie Sep 07 '20 at 17:51
  • Yes, the standard is quite clear that it's UB. But I wouldn't expect it to break in real world. – HolyBlackCat Sep 07 '20 at 17:51
  • @IgorTandetnik, yes, it does seem that that is the correct approach. – anastaciu Sep 07 '20 at 17:54
  • Meh. `vector` + `string_view` tbh – Asteroids With Wings Sep 07 '20 at 17:56
  • @PaulMcKenzie, more or less, though those are nice links they don't mention the use of `c_str`. – anastaciu Sep 07 '20 at 17:57
  • @HolyBlackCat, yes it does seem that way. – anastaciu Sep 07 '20 at 17:58
  • @AsteroidsWithWings, I'm content with the easiest fix, I regret having found this in the first place :) – anastaciu Sep 07 '20 at 18:01
  • 1
    Well I mean in general :P – Asteroids With Wings Sep 07 '20 at 18:10
  • @user4581301, I missed your comment, thanks for the insight, in any case, with the array of options provided to do this in a similar but correct way, it's just strange why this was coded like this. – anastaciu Sep 07 '20 at 20:06

1 Answer


The reference for c_str that you cited is quite clear on the matter:

The program shall not modify any of the values stored in the character array; otherwise, the behavior is undefined.

The fact that it happens to work doesn't mean anything. Undefined behavior means the standard places no requirements on what the program does, so it may well appear to do the "right" thing.


If you do want to modify the underlying data, you can use the non-const overload of data(), available since C++17. It lets you modify everything except the null terminator:

The program shall not modify the value stored at p + size() to any value other than charT(); otherwise, the behavior is undefined.
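As a rough sketch of what the replacement could look like, here are two small helpers that apply the same edit as the code in the question. The function names `patch_via_data` and `patch_via_index` are made up for illustration, and the size check is my own addition. The first assumes C++17 or later; the second uses `&str[0]`, which also works in C++11/14 as long as the string is not empty.

```cpp
#include <string>

// Hypothetical helper (not from the question): C++17 or later only.
// The non-const overload of data() returns char*, so no const_cast is needed.
std::string patch_via_data(std::string str) {
    if (str.size() < 7) return str;  // stay inside [0, size())
    char* ptr = str.data();
    ptr[5] = 'w';
    ptr[6] = 'a';
    return str;
}

// Hypothetical helper (not from the question): also valid in C++11/14.
// operator[] on a non-const string returns char&, so &str[0] is a char*.
std::string patch_via_index(std::string str) {
    if (str.size() < 7) return str;  // also guarantees the string is not empty
    char* ptr = &str[0];
    ptr[5] = 'w';
    ptr[6] = 'a';
    return str;
}
```

Calling either helper with "This  is my string" (two spaces, as in the question) should give "This was my string". In both cases only the characters in the range [0, size()) are written, so the null terminator is left alone.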

cigien
  • Well spotted, it does seem a good way to go, my goal is to mess as little as possible with the existing code. – anastaciu Sep 07 '20 at 17:52
  • 1
    Yes, simply replacing `c_str` with `data` should work. (though I believe this is only true from c++17). – cigien Sep 07 '20 at 17:54
  • Thanks, I'll have to check that. – anastaciu Sep 07 '20 at 17:55
  • Are you sure about that? The standard clearly says: "Modifying the character array accessed through the const overload of data has undefined behavior." Since in C++11/14 `data()` only returns `const char*`, the standard forbids using `data()` to modify the underlying string. – Dexter Feb 08 '21 at 19:45
  • 1
    @Dexter If the string is not const, then the non-const overload of `data` is called, and modifying the contents through that is perfectly valid I think. – cigien Feb 08 '21 at 19:51
  • @cigien but c++11/14 does not have non-const overload of `data`. – Dexter Feb 08 '21 at 19:53
  • 2
    @Dexter Oh, I see what you mean. Good point, I'll edit the answer to clarify. – cigien Feb 08 '21 at 19:54
  • @cigien then I think the only choice for C++11/14 should be `&str[0]`. Might also be worth mentioning :) – Dexter Feb 08 '21 at 20:00
  • @Dexter Hmm, true, but I prefer not adding solutions that need to work with older standards, unless the OP explicitly asks for it. I think C++17 is sufficiently widespread that future visitors will not need an older solution. I'm an optimist, YMMV :) – cigien Feb 08 '21 at 20:02