22

Consider the following test program:

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::cout << sizeof(std::string("hi")) << " ";
    std::string a[10];
    std::cout << sizeof(a) << " ";
    std::vector<std::string> v(10);
    std::cout << sizeof(v) + sizeof(std::string) * v.capacity() << "\n";
}

The output for libstdc++ and libc++, respectively, is:

8 80 104
24 240 264

As you can see, libc++ takes 3 times as much memory for a simple program. How do the implementations differ to cause this memory disparity? Do I need to be concerned, and how do I work around it?

user4390444
  • 1
    This is most likely because the libstdc++ `std::string` implementation is not C++11 compliant and uses a copy on write implementation which gives you the size savings. Rerun your libstdc++ test using [`vstring`](https://gcc.gnu.org/onlinedocs/gcc-4.6.2/libstdc++/api/a01118.html) instead and compare the results. – Praetorian Dec 24 '14 at 03:35
  • 16
    To be clear, the `sizeof(std::string)` does not represent all the memory occupied by the string, but only the memory occupied by the string class (say, on the stack, for a stack-allocated string), and not any data structures it points to. – EyasSH Dec 24 '14 at 03:38
  • @EyasSH I have measured. It does allocate 80 bytes and 240 bytes respectively. – user4390444 Dec 24 '14 at 03:40
  • 2
    Is that referring to the array measurement? (because that is not accurate either for the same reason mentioned above). – EyasSH Dec 24 '14 at 03:45
  • 1
    [The results](http://coliru.stacked-crooked.com/a/f1914c8b8b530aec) are quite different if you use `__gnu_cxx::__vstring` with libstdc++. And what EyasSH means is that the size of the `std::string` object is not affected by the length of the string it manages. – Praetorian Dec 24 '14 at 03:48
  • note that the larger `sizeof(string)` is, the more efficient it is due to small string optimization opportunities! – M.M Dec 24 '14 at 04:34
  • @Praetorian It is [since 2015](https://stackoverflow.com/a/28002956/2757035); phew :) – underscore_d Jul 01 '17 at 19:34

4 Answers

66

Here is a short program to help you explore both kinds of memory usage of std::string: stack and heap.

#include <string>
#include <new>
#include <cstdio>
#include <cstdlib>

std::size_t allocated = 0;

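// Replacement global operator new: counts every byte requested from the heap.
void* operator new(std::size_t sz)
{
    void* p = std::malloc(sz ? sz : 1);  // never pass 0 to malloc
    if (p == nullptr)
        throw std::bad_alloc();          // operator new must not return null
    allocated += sz;
    return p;
}

void operator delete(void* p) noexcept
{
    std::free(p);
}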

int
main()
{
    allocated = 0;
    std::string s("hi");
    std::printf("stack space = %zu, heap space = %zu, capacity = %zu\n",
     sizeof(s), allocated, s.capacity());
}

Using http://melpon.org/wandbox/ it is easy to get output for different compiler/lib combinations, for example:

gcc 4.9.1:

stack space = 8, heap space = 27, capacity = 2

gcc 5.0.0:

stack space = 32, heap space = 0, capacity = 15

clang/libc++:

stack space = 24, heap space = 0, capacity = 22

VS-2015:

stack space = 32, heap space = 0, capacity = 15

(the last line is from http://webcompiler.cloudapp.net)

The above output also shows capacity, which is a measure of how many chars the string can hold before it has to allocate a new, larger buffer from the heap. For the gcc-5.0, libc++, and VS-2015 implementations, this is a measure of the short string buffer. That is, the size of the buffer stored inside the string object itself (here, on the stack) to hold short strings, thus avoiding the more expensive heap allocation.

It appears that the libc++ implementation has the smallest stack usage of the short-string implementations, and yet contains the largest of the short string buffers. And if you count total memory usage (stack + heap), libc++ has the smallest total memory usage for this 2-character string among all 4 of these implementations.

It should be noted that all of these measurements were taken on 64-bit platforms. On 32-bit platforms, the libc++ stack usage will go down to 12, and the small string buffer goes down to 10. I don't know the behavior of the other implementations on 32-bit platforms, but you can use the above code to find out.

Howard Hinnant
  • 4
    I am surprised that no other implementation played similar tricks as libc++: ensure that "long" and "short" capacities are odd/even, and use all the bytes not containing this crucial bit as a buffer for small strings (except for one byte that holds the length, which is obviously smaller than 256). It does have a cost, almost all operations need to check if the string is short or long before doing anything (though if you do 2 operations in a row it might be able to test just once). Maybe that compensates for the wasted space? Probably depends... – Marc Glisse Jan 17 '15 at 20:53
  • On Ubuntu 14.04 I get the same results from g++ 4.9.4, g++ 5.4.1, and clang++ 3.6.0, all coinciding with your result for gcc 4.9.1. – Ruslan Mar 07 '18 at 09:34
  • 1
    @Ruslan By default they use the same system-installed standard library. You have to provide some flags to use different standard library, for example `-stdlib=libc++` for Clang to use libc++. – Ilya Popov Apr 02 '18 at 09:54
10

You should not be concerned; standard library implementors know what they are doing.

Using the latest code from the GCC subversion trunk, libstdc++ gives these numbers:

32 320 344

This is because as of a few weeks ago I switched the default std::string implementation to use the small-string optimisation (with space for 15 chars) instead of the copy-on-write implementation that you tested with.

Jonathan Wakely
  • 1
    There is surely nothing wrong to make it conforming, but sadly it suddenly becomes significantly suboptimal in a type-erasure wrapper like `any` whose implementation has a slightly smaller internal capacity than `sizeof(string)` to do small object optimization... and even worse with `pmr::string`. – FrankHB Feb 02 '20 at 19:13
7

Summary: It only looks like libstdc++ uses one char*. In fact, it allocates more memory.

So, you should not be concerned that Clang's libc++ implementation is memory inefficient.

From the documentation of libstdc++ (under Detailed Description):

A string looks like this:

                                        [_Rep]
                                        _M_length
   [basic_string<char_type>]            _M_capacity
   _M_dataplus                          _M_refcount
   _M_p ---------------->               unnamed array of char_type

Where the _M_p points to the first character in the string, and you cast it to a pointer-to-_Rep and subtract 1 to get a pointer to the header.

This approach has the enormous advantage that a string object requires only one allocation. All the ugliness is confined within a single pair of inline functions, which each compile to a single add instruction: _Rep::_M_data(), and string::_M_rep(); and the allocation function which gets a block of raw bytes and with room enough and constructs a _Rep object at the front.

The reason you want _M_data pointing to the character array and not the _Rep is so that the debugger can see the string contents. (Probably we should add a non-inline member to get the _Rep for the debugger to use, so users can check the actual string length.)

So, it just looks like one char* but that is misleading in terms of memory usage.

Previously libstdc++ basically used this layout:

  struct _Rep_base
  {
    size_type               _M_length;
    size_type               _M_capacity;
    _Atomic_word            _M_refcount;
  };

That is closer to the results from libc++.

libc++ uses "short string optimization". The exact layout depends on whether _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT is defined. If it is defined, the data pointer will be word-aligned if the string is short. For details, see the source code.

Short string optimization avoids heap allocations, so it also looks more costly than the libstdc++ implementation if you only consider the parts that are allocated on the stack. sizeof(std::string) only shows the stack usage, not the overall memory usage (stack + heap).

Philipp Claßen
  • 1
    libc++ always uses SSO, the _LIBCPP_ALTERNATE_STRING_LAYOUT compile-time macro just changes its layout. – Petr Apr 14 '18 at 17:01
  • @Petr Thank you, I updated it. Also the name of the macro has changed since the answer was written. It is now _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT. – Philipp Claßen Apr 14 '18 at 17:16
2

I haven't checked the actual implementations in source code, but I remember checking this when I was working on my C++ string library. A 24 byte string implementation is typical. If the length of the string is smaller than or equal to 16 bytes, instead of malloc'ing from the heap, it copies the string into the internal buffer of size 16 bytes. Otherwise, it mallocs and stores the memory address etc. This minor buffering actually helps in terms of running time performance.

For some compilers, there's an option to turn the internal buffer off.

mostruash
  • 6
    In C++11 (and beyond), every use of a shared resource must be thread safe. The heap is a shared resource, and so calls to heap allocation/deallocation routines have to be synchronized. This is one of the reasons that people want the short string optimization. – Marshall Clow Dec 31 '14 at 05:33