
Is it possible to tune std::string Small/Short String Optimization somehow?

For example, I have to work with a lot of strings that are quite short but longer than 15 chars (like lastname + ", " + firstname + ", " + middlename, whose length is usually in the range [20; 40]).

Upd:

According to this, it looks like the answer is no. But when I opened the basic_string.h file I found this:

template<typename _CharT, typename _Traits, typename _Alloc>
    class basic_string
    {
    ...
    enum { _S_local_capacity = 15 / sizeof(_CharT) };

    union
    {
      _CharT           _M_local_buf[_S_local_capacity + 1];
      size_type        _M_allocated_capacity;
    ...
    };

So now it's not clear why _S_local_capacity is hardcoded this way...

stas.yaranov
  • Before you prematurely optimize your strings you might want to profile your program to see if it's truly a bottleneck. Yes, it is possible to make a string class more efficient if it's under a certain size but it's unlikely to be a source of concern except in very specific cases. –  Nov 18 '15 at 07:19
  • C++ has no SSO. It is a specification. Implementations may use SSO as a method of implementing the specification. – n. m. could be an AI Nov 18 '15 at 07:20
  • Further to what n.m.'s said - it's your particular compiler/implementation's docs that *might* mention their SSO design decisions and any tuning ability. The code you've listed shows pretty clearly that it's hard coded though. – Tony Delroy Nov 18 '15 at 08:08
  • It's taken from `/usr/include/c++/5/bits/basic_string.h`, gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2) – stas.yaranov Nov 18 '15 at 08:11
  • One consideration is that if the `string` is 16 bytes, there will be room for 4 of them in a typical 64 byte cache line. If you change the size to 40+, only a single string will fit. It is not at all given that this will be an improvement for your program. – Bo Persson Nov 18 '15 at 08:35

1 Answer


The whole idea of "short string optimisation" is that it "takes no extra space". The size is chosen so that the local buffer overlays other member variables of the class that are only used when the string is longer.
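To illustrate the overlay, here is a minimal sketch modelled on the excerpt in the question. `sso_layout` and `local_buffer_bytes` are hypothetical names made up for this example, and the `data` and `size` members are assumptions about what surrounds the union; this is not the real libstdc++ class.

```cpp
#include <cstddef>

// Hypothetical, simplified layout modelled on the excerpt in the question.
// It only demonstrates how the local buffer shares storage with the
// capacity field; it is not the real basic_string.
template <typename CharT>
struct sso_layout {
    enum { local_capacity = 15 / sizeof(CharT) };

    CharT*      data;  // assumed: points at local_buf or at heap memory
    std::size_t size;  // assumed: current length

    union {
        CharT       local_buf[local_capacity + 1];  // short strings live here
        std::size_t allocated_capacity;             // used only for heap strings
    };
};

// Size in bytes of the inline buffer for a given character type.
template <typename CharT>
constexpr std::size_t local_buffer_bytes() {
    return sizeof(CharT) * (sso_layout<CharT>::local_capacity + 1);
}

// With 1-, 2- and 4-byte character types the buffer is 16 bytes each time.
static_assert(local_buffer_bytes<char>() == 16, "16 one-byte chars");
static_assert(local_buffer_bytes<char16_t>() == 16, "8 two-byte chars");
static_assert(local_buffer_bytes<char32_t>() == 16, "4 four-byte chars");
```

Under these assumptions, `15 / sizeof(_CharT)` looks like a way to keep the buffer at 16 bytes whatever the character type, so the union stays the same size for every instantiation.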

It is a bad idea to modify system headers: they are often compiler version dependent, and changing the layout of the class makes your code "binary incompatible" with anything compiled against the unmodified headers.

As the comment says, make sure this really is a problem (performance or otherwise) before doing anything about it. Then consider carefully what you should do about it: what problem are you trying to fix, and are you sure it's worth it? Remember that if you do something like:

 std::string func(std::string arg)
 {
   ...
 }

you will copy more bytes when passing arg on the stack. And no, making it const std::string& arg doesn't really help if your calling code creates a temporary string, e.g. func("Name: " + name);. And if you use vector<std::string>, each element will be bigger, so the vector will take more space, even for the strings that STILL don't fit in the local buffer, and growing or shrinking the vector will take more time.
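As a rough way to see that cost, the sketch below compares two object sizes. `wide_sso_string` and `vector_payload_bytes` are hypothetical names invented for this illustration, and the layout is an assumed simplification, not what any real standard library uses.

```cpp
#include <cstddef>

// Hypothetical stand-in for a string object whose inline buffer holds up
// to N chars. Only the object size matters for this illustration.
template <std::size_t N>
struct wide_sso_string {
    char*       data;
    std::size_t size;
    union {
        char        local_buf[N + 1];
        std::size_t allocated_capacity;
    };
};

// Bytes of element storage a vector needs for `count` objects of type T
// (ignores the vector's own bookkeeping and any spare capacity).
template <typename T>
constexpr std::size_t vector_payload_bytes(std::size_t count) {
    return count * sizeof(T);
}

// A bigger inline buffer makes every object bigger, whether or not the
// stored text actually fits into it.
static_assert(sizeof(wide_sso_string<40>) > sizeof(wide_sso_string<15>),
              "larger buffer, larger object");
```

On a typical 64-bit platform the two sizes would be 32 and 64 bytes, so a vector of the larger type would need about twice the element storage, and every copy or move of an element touches twice as many bytes.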

I think the right solution, once you have made a decision, is to implement your own string class. std::string is a standard library class template; it is not designed to be extended, and you aren't supposed to modify the standard library header files because, as I said earlier, they are highly compiler dependent. It will be quite some work to make your class completely compatible with std::string, but you could of course "cheat" and add a conversion function, operator std::string(), to your string class, so that you only need to implement the more basic functions that std::string offers.
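A minimal sketch of that "cheat" might look like the following. `inline_string` and all of its members are hypothetical names for this example. To keep the sketch short it keeps a separate std::string member for strings that do not fit, rather than sharing storage with the inline buffer the way a real SSO implementation would.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical class: stores up to N chars inline and falls back to a
// std::string member for anything longer. A production class would
// overlay the two storage areas instead of carrying both.
template <std::size_t N>
class inline_string {
public:
    inline_string(const char* s) : inline_string(s, std::strlen(s)) {}
    inline_string(const std::string& s) : inline_string(s.data(), s.size()) {}

    std::size_t size() const { return size_; }
    bool is_inline() const { return size_ <= N; }
    const char* c_str() const { return is_inline() ? buf_ : overflow_.c_str(); }

    // The "cheat": convert to std::string so that code expecting a
    // std::string keeps working without reimplementing its interface.
    operator std::string() const { return std::string(c_str(), size_); }

private:
    inline_string(const char* s, std::size_t n) : size_(n) {
        if (n <= N) {
            std::memcpy(buf_, s, n);  // short: copy into the inline buffer
            buf_[n] = '\0';
        } else {
            buf_[0] = '\0';
            overflow_.assign(s, n);   // long: let std::string own the data
        }
    }

    std::size_t size_;
    char        buf_[N + 1];
    std::string overflow_;
};
```

With this, `inline_string<40> name("Smith, John, Edward"); std::string s = name;` compiles, at the price of a copy on every conversion.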

Mats Petersson