
Why are allocate() and deallocate() not called when using a custom allocator with std::string?

Here is a code snippet for the demo (https://coliru.stacked-crooked.com/a/bcec030e8693f7ae):

#include <cstdlib>   // malloc, free, atoi
#include <cstring>
#include <iostream>
#include <limits>
#include <memory>
#include <string>    // std::basic_string

#define NOMINMAX
#undef max

template <typename T>
class UntrackedAllocator {
public:
    typedef T value_type;
    typedef value_type* pointer;
    typedef const value_type* const_pointer;
    typedef value_type& reference;
    typedef const value_type& const_reference;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;

public:
    template<typename U>
    struct rebind {
        typedef UntrackedAllocator<U> other;
    };

public:
    inline explicit UntrackedAllocator() {}
    inline ~UntrackedAllocator() {}
    inline UntrackedAllocator(UntrackedAllocator const&) {}
    template<typename U>
    inline explicit UntrackedAllocator(UntrackedAllocator<U> const&) {}

    //    address
    inline pointer address(reference r) {
        return &r;
    }

    inline const_pointer address(const_reference r) {
        return &r;
    }

    //    memory allocation
    // The optional hint is taken as const void*; std::allocator<void>::const_pointer
    // was deprecated in C++17 and removed in C++20.
    inline pointer allocate(size_type cnt, const void* = 0) {
        std::cout << "allocate()" << std::endl;
        T *ptr = (T*)malloc(cnt * sizeof(T));
        return ptr;
    }

    inline void deallocate(pointer p, size_type cnt) {
        std::cout << "deallocate()" << std::endl;
        free(p);
    }

    //   size
    inline size_type max_size() const {
        return std::numeric_limits<size_type>::max() / sizeof(T);
    }

    // construction/destruction
    inline void construct(pointer p, const T& t) {
        new(p) T(t);
    }

    inline void destroy(pointer p) {
        p->~T();
    }

    // Stateless allocators are interchangeable, so every instance compares equal.
    inline bool operator==(UntrackedAllocator const&) const { return true; }
    inline bool operator!=(UntrackedAllocator const& a) const { return !operator==(a); }
};

typedef std::basic_string<char, std::char_traits<char>, UntrackedAllocator<char>> String;

int main() 
{
    String str { "13" };
    String copy = str;
    const char* cstr = str.c_str();
    int out = atoi(cstr);

    std::basic_string<char, std::char_traits<char>, UntrackedAllocator<char> > str1("hello world");

    std::cout << str1 << std::endl;

    std::basic_string<char, std::char_traits<char>, UntrackedAllocator<char> > str2;

    str2 = "hi";

    std::basic_string<char, std::char_traits<char>, UntrackedAllocator<char> > longStr =  str1 + str2;

    std::cout << longStr << std::endl;

}

Here is the output (nothing more is printed):

hello world
hello worldhi
John
  • 6
    SSO? https://coliru.stacked-crooked.com/a/f1b37868a5f115c2 – songyuanyao Dec 25 '20 at 08:53
  • 1
    Try much longer strings, not short ones like "hello" and "hi". – PaulMcKenzie Dec 25 '20 at 08:58
  • @songyuanyao Why? There is no difference except that your string is a little longer. If I make it longer than yours (https://coliru.stacked-crooked.com/a/a59b3e13d454695c), it outputs "allocate" more times. It really confuses me. – John Dec 25 '20 at 08:59
  • 1
    *It really confuses me.* -- What is confusing? If the string is short, then `std::string` is smart enough to **not** call the allocator. Instead, an internal character array is used. Do you think it's smart to call the allocator if the string was tiny? – PaulMcKenzie Dec 25 '20 at 09:01
  • @John [SSO](https://riptutorial.com/cplusplus/example/31654/small-object-optimization) is meant to use space reserved inside the object itself as a buffer, instead of allocated memory, when the content is small enough to fit within that reserved space. – songyuanyao Dec 25 '20 at 09:02
  • @PaulMcKenzie What is so magical about the length of the string? I really thought it needs to allocate memory no matter how long the string is. – John Dec 25 '20 at 09:02
  • 1
    @John -- Why call the allocator for small strings? Don't you think there is overhead in doing all of that work just for a string that's a few bytes in length? The creators of `std::string` are smart enough to know that calling the allocator is a waste of precious time for small strings. So internally, the `std::string` class has a character buffer instead. Only when the string cannot fit into the buffer does the allocator come into play. – PaulMcKenzie Dec 25 '20 at 09:03
  • 1
    @John BTW, the technique of using a regular array for a small number of items, and then switching over to a dynamically allocated array if the number of items goes over a certain size, is a well-known optimization technique. It is used not only for strings, but in other situations. – PaulMcKenzie Dec 25 '20 at 09:08
  • @PaulMcKenzie I see. Thank you so much. One more question: does the C++ standard require that short strings not call the allocator? As for the "other situations", could you please name some for me to study? – John Dec 25 '20 at 09:10
  • SSO is not required. A dumb `std::string` implementation might not use it, so it isn't guaranteed. The string example should be enough to understand why SSO is used -- if you have a small amount of data, and 98% of your customers use that small amount of data, then an array can be used. For those outlying 2% of the customers, the code switches to dynamically allocating the array. In other words, the code only runs slower for those 2%, while optimum speed is kept for the 98%. – PaulMcKenzie Dec 25 '20 at 09:20
  • One example is `std::function` that keeps (or could keep) a functor inside itself if that functor is small enough. – Evg Dec 25 '20 at 09:21
  • @PaulMcKenzie I think I finally know what "SSO" stands for: "short string optimization". Thank you so much. With your generous help, my understanding of this question is at a different level. – John Dec 25 '20 at 09:39
  • [Meaning of acronym SSO in the context of std::string](https://stackoverflow.com/questions/10315041/meaning-of-acronym-sso-in-the-context-of-stdstring) – Evg Dec 25 '20 at 09:47
  • @PaulMcKenzie SSO is practically required in the case of empty strings, since the default constructor is noexcept (that is, it does not allocate) but an empty string is required to have a null terminator, which the user can overwrite. The alternative would be for every empty string object to point to the same null character (a static data member, presumably) but that would be complicated and fragile. – ecatmur Dec 25 '20 at 13:01

0 Answers