33

Is it guaranteed by the standard that std::string will not give back allocated memory spontaneously if reassigned from a string of a smaller size?

In other words:

std::string str = "Some quite long string, which needs a lot of memory";
str = "";
str = "A new quite long but smaller string"; // Guaranteed to not result in a heap allocation?

I ask because I'm depending on this to avoid heap fragmentation.

underscore_d
Martin G
  • If the new string needs less memory than the current allocation, no allocation occurs. If it needs more than the current allocation, a reallocation occurs. Similar to a std::vector. – Samer Tufail Sep 25 '18 at 10:19
  • Strings reuse their buffers when assigned to shorter strings, so in your program there is only one allocation for the string. Unfortunately I can't find conveniently any citation from the standard on mobile – Fureeish Sep 25 '18 at 10:20
  • 1
    Even if a string did "give back memory spontaneously", that is insufficient to avoid heap fragmentation. A string uses an allocator (by default, an object of type `std::allocator`, but that can be changed) to allocate and deallocate memory, and the allocator may use a lower-level mechanism again (e.g. variants of operators `new` and `delete`) to actually allocate and deallocate. If *any* of those steps elect to not release memory to the lower-level layer, there is potential impact on heap fragmentation. – Peter Sep 25 '18 at 10:34
  • _I ask because i'm depending on this to avoid heap fragmentation._ This was exactly the reason why I wrote my own memory _controller_ for `std::string` – Peter VARGA Sep 25 '18 at 12:25
  • 2
    If you need to guarantee this behaviour, you can always use your own custom allocator – doron Sep 25 '18 at 12:26
  • @AlBundy: what's a memory controller? – geza Sep 25 '18 at 13:00
  • 11
    In my own experience, if I have to utter the phrase "I'm dependent on avoiding heap fragmentation," it's a very good time to start identifying precise low-level requirements, and potentially rolling your own allocation routines. – Cort Ammon Sep 25 '18 at 18:06

4 Answers

35

No guarantee whatsoever.

[string.cons]/36 defines assigning a const char* to a std::string in terms of a move-assignment, whose definition is:

[string.cons]/32

basic_string& operator=(basic_string&& str)  noexcept(/*...*/)

Effects: Move assigns as a sequence container, except that iterators, pointers and references may be invalidated.

This shows that the Committee let the implementation choose freely between an invalidating operation and a more conservative one. And to make things even clearer:

[basic.string]/4

References, pointers, and iterators referring to the elements of a basic_string sequence may be invalidated by the following uses of that basic_string object:

  • (4.1) as an argument to any standard library function taking a reference to non-const basic_string as an argument.
  • (4.2) Calling non-const member functions, except operator[], at, data, front, back, begin, rbegin, end, and rend.

I ask because I'm depending on this to avoid heap fragmentation.

std::string takes an allocator as a template parameter. If you're really concerned about possible heap fragmentation, you could write your own, which, with some heuristics, could implement an allocation strategy suited to your needs.
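As a rough sketch of how a custom allocator plugs into the string type, the block below defines a minimal allocator that only counts allocations and forwards to `operator new`. The names `CountingAllocator`, `CountedString`, `g_allocations` and `allocations_for_question_sequence` are hypothetical and exist only for this illustration. It shows the plumbing, not a fragmentation-avoiding strategy; a real allocator would replace the body of `allocate`/`deallocate` with whatever pooling policy suits the application.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical global counter, for illustration only (not thread-safe).
static std::size_t g_allocations = 0;

// Minimal allocator: std::allocator_traits supplies everything not defined here.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;  // record every request that reaches the allocator
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p);
    }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Runs the sequence from the question and returns how many times the
// string asked its allocator for memory.
std::size_t allocations_for_question_sequence() {
    g_allocations = 0;
    CountedString str = "Some quite long string, which needs a lot of memory";
    str = "";
    str = "A new quite long but smaller string";
    return g_allocations;
}
```

If `allocations_for_question_sequence()` returns 1, the implementation under test reused the original buffer for both reassignments; a larger value means it went back to the allocator. Either outcome is permitted by the standard.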

In practice, most implementations I know of would not reallocate memory in the case described in your question. This can be checked by testing and/or by checking your implementation's documentation and, if need be, its source code.
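One way to run such a test, sketched under the assumption that comparing the buffer address and capacity before and after is a good enough proxy for "no reallocation happened". `ReassignReport` and `run_reassignment` are hypothetical names made up for this example. The result describes only the implementation and build settings it is run on.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical report type: buffer address and capacity before and after.
struct ReassignReport {
    std::uintptr_t addr_before;
    std::uintptr_t addr_after;
    std::size_t cap_before;
    std::size_t cap_after;

    // True if the string still uses the buffer it started with.
    bool same_buffer() const {
        return addr_before == addr_after && cap_before == cap_after;
    }
};

// Addresses are captured as integers while the pointer is still valid,
// so no dangling pointer value is ever compared.
ReassignReport run_reassignment() {
    ReassignReport r{};
    std::string str = "Some quite long string, which needs a lot of memory";
    r.addr_before = reinterpret_cast<std::uintptr_t>(str.data());
    r.cap_before = str.capacity();

    str = "";
    str = "A new quite long but smaller string";

    r.addr_after = reinterpret_cast<std::uintptr_t>(str.data());
    r.cap_after = str.capacity();
    return r;
}
```

Printing `run_reassignment().same_buffer()` from `main` shows what a given standard library does; it is an observation, not a guarantee, and may change between library versions.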

YSC
  • 5
    You can imagine how undesirable such a guarantee would be (presuming it was given), if you had the Gettysburg Address stored in a string, and then replaced it with "Hello World." – Cort Ammon Sep 25 '18 at 18:28
13

cppreference states that assignment from a pointer-to-char:

Replaces the contents with those of null-terminated character string pointed to by s as if by *this = basic_string(s), which involves a call to Traits::length(s).

This "as if" actually boils down to an rvalue assignment, so the following scenario is quite possible:

  1. A fresh temporary string is created.
  2. The target string (*this) steals the temporary's contents, as if by assignment from an rvalue reference.
bipll
  • 5
    cppreference is usually reliable, but the quoted statement implies that there is a guaranteed buffer replacement, which is bollocks. Other than that it's a good conceptual model. But it's just *marginally* simpler than the standard's description, quoted in YSC's answer, which would therefore be preferable. – Cheers and hth. - Alf Sep 25 '18 at 10:34
  • 6
    @Cheersandhth.-Alf cppreference is paraphrasing the standard [`[string.cons]`](http://eel.is/c++draft/basic.string#string.cons-36) – Caleth Sep 25 '18 at 10:39
  • 5
    @Cheersandhth.-Alf Does it really? It says "as if", which, in the Standard, always means "with regard to observable side effects, aside from a few, such as those of constructions/destructions". – bipll Sep 25 '18 at 10:46
  • @rustyx it's allowed to do either. You can conceptualise it as the temporary stealing the existing allocation, then returning it. – Caleth Sep 25 '18 at 12:01
  • 1
    @rustyx The standard just talks about the effect. Given that the effect of `x = "hi";` is the same as the effect of `x.assign("hi");`, an implementation would do the efficient thing for both. libstdc++, for instance, has its `operator=(const CharT*)` just directly call `assign` – Barry Sep 25 '18 at 13:40
4

Is it guaranteed by the standard that std::string will not give back allocated memory spontaneously if reassigned from a string of a smaller size?

Regardless of the actual answer (which is "No, no guarantee") - you should employ the following principle: If it's not obvious that it has to be the case, then don't assume it is the case.

In your specific case - if you want tight control of heap behavior, you might not even want to use std::strings at all (maybe; it depends). And you might not want to use the default allocator (again, maybe); and you might want to memoize strings; etc. What you should absolutely do is make fewer assumptions, measure if possible, and have explicit design to ensure your needs are met.

einpoklum
2

If your strings are short (up to 15 or 22 bytes, depending on the compiler/std lib) and you are using a relatively recent compiler in C++11 or later mode, then you are likely to benefit from the Short String Optimization (SSO). In this case the string contents are not separately allocated on the heap.

This link also contains a lot of details on common implementations and allocation strategies.

However, both of the strings in your example are too long for SSO.
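To check whether a particular string is stored inline on a given implementation, a sketch like the following can help. It assumes that an SSO string keeps its characters inside the `std::string` object itself, which is how the common implementations work but is not required by the standard. `uses_inline_buffer` is a hypothetical helper name.

```cpp
#include <functional>
#include <string>

// Returns true if the character buffer lies inside the std::string object,
// i.e. the contents are stored inline rather than in a separate allocation.
bool uses_inline_buffer(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* buf = s.data();

    // std::less yields a total order even for pointers into unrelated objects,
    // where the built-in < would be unspecified.
    std::less<const char*> lt;
    return !lt(buf, obj_begin) && lt(buf, obj_end);
}
```

On the common implementations this returns false for both strings from the question, which is consistent with them being too long for SSO; the result for short strings depends on the library and its ABI settings.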

Paul Floyd
  • Is it guaranteed by the standard? Is SSO guaranteed by the standard? – pipe Sep 26 '18 at 00:40
  • @pipe No, it isn't, which is what Paul implied with: _"you are **likely** to benefit from the Short String Optimization (SSO)"_. – YSC Sep 26 '18 at 07:17
  • @YSC Ok, but the question is rather specific, and OP already knows that it is likely to happen since he depends on it, so I don't understand what this actually answers. – pipe Sep 26 '18 at 08:27
  • @pipe I agree this answer is slightly off, but it was posted right after OP asked their question. The question wasn't tagged [tag:language-lawyer] at that time, and was edited after that. This answers part of OP's concern (memory fragmentation) and provides an interesting insight. I've upvoted it for what it's worth. – YSC Sep 26 '18 at 09:44
  • It's not meant to be a full answer, just a complement for the case of SSO. I think that the link is quite interesting regarding practical measures of std::string performance. – Paul Floyd Sep 26 '18 at 10:52