I ran a simple program to inspect the buffer pointer held by a string object and got this output:

0x1875028
Hello 
0x1875058 0x1875028
Hello world!!!
0x1875028

I am trying to understand why s.c_str() changes value after the erase() call, but st.c_str() does not.

Here is the simple code:

#include <cstdio>
#include <iostream>
#include <string>

using namespace std;

string st;

void dum() {
    string s("Hello world!!!");
    // %p expects a void*, so cast the const char* returned by c_str()
    printf("%p\n", (const void*)s.c_str());
    st = s;       // copy; a COW implementation may share the buffer here
    s.erase(6);   // keep only the first 6 characters
    cout << s << endl;
    printf("%p %p\n", (const void*)s.c_str(), (const void*)st.c_str());
}

int main() {
    dum();
    cout << st << endl;
    st.erase(6);
    printf("%p\n", (const void*)st.c_str());
    return 0;
}
packetie
  • Are you compiling with a version of gcc before gcc 5? If so, copy-on-write (COW) strings may be the reason for what you're seeing. – Alejandro Jun 08 '15 at 15:46
  • I am using g++ `ver 4.7.3-2ubuntu1~12.04`. The command line for compilation is `g++ -std=c++11 test.cc`. Thanks for pointing out COW. – packetie Jun 08 '15 at 15:49

2 Answers

This depends on the version of the standard library you're using. See, for example, "Is std::string refcounted in GCC 4.x / C++11?". When you write, for two strings a and b,

a = b;

the question is whether the two strings internally point to the same buffer until one of them is modified. Implementations differ on this, so either behavior in your program's output is not very surprising.

Ami Tavory
  • Don't feel bad, @Alejandro - I'm like the slowest person in this site, and people beat me all the time. (Or actually, maybe that should make you feel bad?) :-) – Ami Tavory Jun 08 '15 at 15:54
  • @Ami Tavory, in the example `string b("hello world"); a = b;`, is this statement right? `b` will not "own" the data ("hello world"), so changing `b` (for example with an `erase()` call) will trigger it to allocate memory for its own copy; `a` owns the data, so changing `a` will not trigger an allocation. – packetie Jun 08 '15 at 15:58
  • @codingFun I looked a bit for the versions of g++ where this happens, but was unsuccessful; the closest I found [was this](https://gcc.gnu.org/ml/gcc/2011-10/msg00115.html). – Ami Tavory Jun 08 '15 at 16:04
  • Thanks @AmiTavory for the link. It feels that reference count theory does explain the behavior. Will look deeper into it later. – packetie Jun 08 '15 at 16:47

First of all, I think this goes under the implementation details umbrella.

I tried that with VS2013.

After you call erase(), the pointer returned by c_str() does not change. I think this is because the string implementation just updates the end of the string (by changing an internal data member) instead of reallocating the internal buffer on the heap, which would likely produce a new pointer value.

I observed this behavior for both your local string s and the global string st.

Note that the STL implementation that comes with VS2013 doesn't use COW (COW does not seem to be compliant with C++11), so copying the strings with st = s performs a deep copy. The two strings are then completely independent and point to different memory buffers, each storing its own contents. Erasing something from one string therefore has no effect on the copy.

Sample Code

#include <iostream>
#include <string>

using namespace std;

// Helper function to print string's c_str() pointer using cout
inline const void * StringPtr(const string& str)
{
    // We need a const void* to make cout print a pointer value;
    // since const char* is interpreted as string.
    //
    // See for example: 
    //   How to simulate printf's %p format when using std::cout?
    //   http://stackoverflow.com/q/5657123/1629821 
    //
    return static_cast<const void *>(str.c_str());
}

string st;

void f() {
    string s{"Hello world!!!"};
    cout << "s.c_str() = " << StringPtr(s) << '\n';
    st = s;
    s.erase(6);
    cout << s << '\n';
    cout << "s.c_str() = " << StringPtr(s) 
         << "; st.c_str() = " << StringPtr(st) << '\n';
}

int main() {
    f();
    cout << st << endl;
    st.erase(6);
    cout << "st.c_str() = " << StringPtr(st) << '\n';
}

Output

C:\Temp\CppTests>cl /EHsc /W4 /nologo test.cpp
test.cpp

C:\Temp\CppTests>test.exe
s.c_str() = 0036FE18
Hello
s.c_str() = 0036FE18; st.c_str() = 01009A40
Hello world!!!
st.c_str() = 01009A40
Mr.C64
  • Agreed that this question is about "implementation details". It seems that VS2013 is different from G++ ver 4.7.3, where after `st = s;` both objects point to the same internal data. Thanks for giving a different perspective on the implementation. – packetie Jun 08 '15 at 16:47
  • @codingFun Yes, thanks also to you for your answer :) I think both our answers show that it's an implementation detail since we offered different perspectives :) However, I think in C++11 the COW behavior was banned from STL strings, so from a "C++11 correctness" perspective I think VS2013 is doing the right thing, and probably GCC 5 does the same. – Mr.C64 Jun 08 '15 at 16:51