I wrote a custom allocator for std::string and std::vector as follows:

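#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <cstdint>   // std::int32_t
#include <iostream>
#include <new>       // ::operator new, ::operator delete
#include <string>
#include <vector>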

template <typename T>
struct PSAllocator
{
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    typedef T value_type;

    template<typename U>
    struct rebind {typedef PSAllocator<U> other;};

    PSAllocator() throw() {};
    PSAllocator(const PSAllocator& other) throw() {};

    template<typename U>
    PSAllocator(const PSAllocator<U>& other) throw() {};

    template<typename U>
    PSAllocator& operator = (const PSAllocator<U>& other) { return *this; }
    PSAllocator<T>& operator = (const PSAllocator& other) { return *this; }
    ~PSAllocator() {}


    pointer allocate(size_type n, const void* hint = 0)
    {
        std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(::operator new(n * sizeof(value_type)));
        std::cout<<"Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
        return reinterpret_cast<pointer>(&data_ptr[0]);
    }

    void deallocate(T* ptr, size_type n)
    {
        std::int32_t* data_ptr = reinterpret_cast<std::int32_t*>(ptr);
        std::cout<<"De-Allocated: "<<&data_ptr[0]<<" of size: "<<n<<"\n";
        ::operator delete(reinterpret_cast<T*>(&data_ptr[0]));
    }
};

Then I ran the following test case:

int main()
{
    typedef std::basic_string<char, std::char_traits<char>, PSAllocator<char>> cstring;

    cstring* str = new cstring();
    str->resize(1);
    delete str;

    std::cout<<"\n\n\n\n";

    typedef std::vector<char, PSAllocator<char>> cvector;

    cvector* cv = new cvector();
    cv->resize(1);
    delete cv;
}

For whatever odd reason, it goes on to print:

Allocated: 0x3560a0 of size: 25
Allocated: 0x3560d0 of size: 26
De-Allocated: 0x3560a0 of size: 25
De-Allocated: 0x3560d0 of size: 26




Allocated: 0x351890 of size: 1
De-Allocated: 0x351890 of size: 1

So why does it allocate twice for std::string and a lot more bytes?

I'm using g++ 4.8.1 x64 sjlj on Windows 8 from: http://sourceforge.net/projects/mingwbuilds/.

Brandon
  • This is highly dependent on the implementation you are using. You should state which library implementation and version you have (usually the compiler and its version, unless you are using a non-default implementation). – David Rodríguez - dribeas Feb 24 '14 at 05:12
  • `new cstring()` invokes the allocator once, and `str->resize(1)` invokes it again. – yinqiwen Feb 24 '14 at 05:13
  • I agree with what @DavidRodríguez-dribeas said. Also, [GCC, or at least libstdc++, doesn't have that behavior](http://coliru.stacked-crooked.com/a/6510933db2509352). – Mark Garcia Feb 24 '14 at 05:13
  • @yinqiwen: `new cstring()` does not allocate from the `PSAllocator`. There are two allocations, but only one should show in the output, the other is global `new`. – David Rodríguez - dribeas Feb 24 '14 at 05:14
  • Doesn't seem to happen on ideone either. I included information about my compiler and where I got it from. Still happens :S – Brandon Feb 24 '14 at 05:17
  • My advice is that you run the program inside a debugger and verify where the allocations are coming from. – David Rodríguez - dribeas Feb 24 '14 at 05:17
  • Cannot find why it is doing this in gdb that comes with codeblocks. – Brandon Feb 24 '14 at 05:39
  • Is it your intention to build buffer overflows into your allocator by offsetting the pointer returned in `allocate` by 8 bytes? – Casey Feb 24 '14 at 05:56
  • Oh I was using the allocator for something else before posting it here. I was allocating `(n * sizeof(value_type)) + (sizeof(int) * 2)` so it wasn't a buffer overflow before. I fixed it on the OP. I deleted my compiler, reinstalled, no cigar. I switched to TDM 4.8.1, still the same.. – Brandon Feb 24 '14 at 06:12
  • @DavidRodríguez-dribeas I guess that `basic_string` allocates some heap space for something in its constructor, depending on the implementation of `std::basic_string`, as you said. – yinqiwen Feb 24 '14 at 06:26
  • I suggest you change the `std::cout` lines to also print `typeid(T).name()` - you may see something other than `char` if your library is using the allocator to create space for some other data, giving you a hint at what's going on. For example, on VC++2008 when compiling from the command line I see `std::_Aux_cont` objects allocated (strangely the same code inside the IDE doesn't do so), but the actual `char` string data allocation doesn't happen unless I up the `resize()` value due to Short String Optimisation. – Tony Delroy Feb 24 '14 at 06:41

1 Answer

I can't reproduce the double allocation, since apparently my libstdc++ does not allocate anything at all for the empty string. The resize, however, does allocate 26 bytes, and gdb helps me identify how they are composed:

size_type __size = (__capacity + 1) * sizeof(_CharT) + sizeof(_Rep);
                   (     1     + 1) *     1          +     24

So the memory is mostly for this _Rep representation, which in turn consists of the following data members:

size_type    _M_length;   // 8 bytes
size_type    _M_capacity; // 8 bytes
_Atomic_word _M_refcount; // 4 bytes

I guess the last four bytes are just padding for alignment, but I might have missed some data member.

I guess the main reason why this _Rep structure is allocated on the heap is that it can be shared among string instances, and perhaps also that it can be avoided for empty strings as the lack of a first allocation on my system suggests.

To find out why your implementation doesn't make use of this empty string optimization, have a look at the default constructor. Its implementation seems to depend on the value of _GLIBCXX_FULLY_DYNAMIC_STRING, which apparently is non-zero in your setup. I'd not advise changing that setting directly, since it starts with an underscore and is therefore considered private. But you might find some public setting to affect this value.

MvG
  • In my g++ config.h it says: `/* Define to 1 if a fully dynamic basic_string is wanted, 0 to disable, undefined for platform defaults */ #define _GLIBCXX_FULLY_DYNAMIC_STRING 1` Setting it to 0 does indeed solve it! I tried using `--disable-fully-dynamic-string` in the linker settings of codeblocks but no cigar.. I entered it into the compiler options it's apparently invalid. So I changed it to 0 in the config :l Works but I'm not sure what side-effects it's going to have.. It also required my allocator to have a `==` and `!=` operator defined explicitly. – Brandon Feb 24 '14 at 14:38