This gave me an excuse to play more with allocators. I opted for the polymorphic allocators, although that's only tangentially related¹.
¹ Aside: The relation is that with custom allocators you often want to propagate the allocator to nested types that are allocator-aware. See "Advanced" below.
Sample Element Type
struct X {
    std::string key, value;
};
It doesn't get much simpler, although it allows us to experiment with sharing
the allocator with the nested strings later.
Tracing Memory Resource
Let's create a tracing memory resource. That's pretty straightforward: we just count and forward everything to the standard new_delete_resource:
namespace pmr = boost::container::pmr;

struct tracing_resource : pmr::memory_resource {
    uint64_t n = 0, total_bytes = 0;

    virtual void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        n += 1;
        total_bytes += bytes;
        return pmr::new_delete_resource()->allocate(bytes, alignment);
    }

    virtual void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override {
        if (p) {
            n -= 1;
            total_bytes -= bytes;
        }
        return pmr::new_delete_resource()->deallocate(p, bytes, alignment);
    }

    virtual bool do_is_equal(const memory_resource& other) const noexcept override {
        return pmr::new_delete_resource()->is_equal(other);
    }
};
We can inspect n (the number of allocations) as well as the total bytes allocated at various points throughout our test code.
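For instance, here's a minimal sketch of exercising the tracer on its own (not part of the test routine below; it assumes nothing beyond the class above):
tracing_resource tracer;

void* p = tracer.allocate(64, alignof(std::max_align_t));   // n == 1, total_bytes == 64
fmt::print("alloc: #{}, {} bytes\n", tracer.n, tracer.total_bytes);

tracer.deallocate(p, 64, alignof(std::max_align_t));        // counters drop back to zero
fmt::print("alloc: #{}, {} bytes\n", tracer.n, tracer.total_bytes);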
A Test Routine
Let's put it together in main, starting with our tracer:
tracing_resource tracer;
Let's mount a pooled resource on that:
pmr::unsynchronized_pool_resource res(&tracer);
auto allocations = [&] {
    fmt::print("alloc: #{}, {} bytes, cached = {}\n", tracer.n, tracer.total_bytes, cache_buckets(res));
};
allocations();
This will print
alloc: #0, 0 bytes, cached = {0, 0, 0, 0, 0, 0, 0, 0, 0}
right out of the gate.
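The cache_buckets helper used inside that lambda appears in the full listing below; for reference, it just enumerates the pool's per-bucket cached block counts:
auto cache_buckets(pmr::unsynchronized_pool_resource& res) {
    using namespace ::ranges;
    return views::iota(0ull)
        | views::take_exactly(res.pool_count())
        | views::transform([&](auto idx) {
              return res.pool_cached_blocks(idx);
          });
}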
Now, let's start (re)allocating some deques in various patterns:
pmr::deque<X> collection(&res);

auto report = [&] {
    fmt::print("collection = {}\nalloc: #{}, {} bytes, cached = {}\n", collection, tracer.n, tracer.total_bytes, cache_buckets(res));
};

std::vector data1 { X{"1", "eins"}, {"2", "zwei"}, {"3", "drei"}, };
std::vector data2 { X{"4", "vier"}, {"5", "fuenf"}, {"6", "sechs"}, };
std::vector data3 { X{"7", "sieben"}, {"8", "acht"}, {"9", "neun"}, };

auto i = 0;
for (auto const& data : {data1, data2, data3}) {
    for (auto el : data) {
        (i%2)
            ? collection.push_back(el)
            : collection.push_front(el);
    }
    report();

    collection.clear();
    report();
}
This will append varying sequences at either end of the container. We won't do much mutation yet; that becomes interesting once we make the strings use the pooled resource as well.
Live Demo
Live On Compiler Explorer
#include <boost/container/pmr/deque.hpp>
#include <boost/container/pmr/unsynchronized_pool_resource.hpp>

// debug output
#include <range/v3/all.hpp>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <iomanip>

namespace pmr = boost::container::pmr;

struct tracing_resource : pmr::memory_resource {
    uint64_t n = 0, total_bytes = 0;

    virtual void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        n += 1;
        total_bytes += bytes;
        return pmr::new_delete_resource()->allocate(bytes, alignment);
    }

    virtual void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override {
        if (p) {
            n -= 1;
            total_bytes -= bytes;
        }
        return pmr::new_delete_resource()->deallocate(p, bytes, alignment);
    }

    virtual bool do_is_equal(const memory_resource& other) const noexcept override {
        return pmr::new_delete_resource()->is_equal(other);
    }
};

struct X {
    std::string key, value;

    friend std::ostream& operator<<(std::ostream& os, X const& x) {
        return os << "(" << std::quoted(x.key) << ", " << std::quoted(x.value) << ")";
    }
};

auto cache_buckets(pmr::unsynchronized_pool_resource& res) {
    using namespace ::ranges;
    return views::iota(0ull)
        | views::take_exactly(res.pool_count())
        | views::transform([&](auto idx) {
              return res.pool_cached_blocks(idx);
          });
}

int main() {
    tracing_resource tracer;
    {
        pmr::unsynchronized_pool_resource res(&tracer);

        auto allocations = [&] {
            fmt::print("alloc: #{}, {} bytes, cached = {}\n", tracer.n, tracer.total_bytes, cache_buckets(res));
        };
        allocations();

        {
            pmr::deque<X> collection(&res);

            auto report = [&] {
                fmt::print("collection = {}\nalloc: #{}, {} bytes, cached = {}\n", collection, tracer.n, tracer.total_bytes, cache_buckets(res));
            };

            std::vector data1 { X{"1", "eins"}, {"2", "zwei"}, {"3", "drei"}, };
            std::vector data2 { X{"4", "vier"}, {"5", "fuenf"}, {"6", "sechs"}, };
            std::vector data3 { X{"7", "sieben"}, {"8", "acht"}, {"9", "neun"}, };

            auto i = 0;
            for (auto const& data : {data1, data2, data3}) {
                for (auto el : data) {
                    (i%2)
                        ? collection.push_back(el)
                        : collection.push_front(el);
                }
                report();

                collection.clear();
                report();
            }
        }
        allocations();
    }

    fmt::print("alloc: #{}, {} bytes\n", tracer.n, tracer.total_bytes);
}
Prints:
alloc: #0, 0 bytes, cached = {0, 0, 0, 0, 0, 0, 0, 0, 0}
collection = {("3", "drei"), ("2", "zwei"), ("1", "eins")}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 1, 0, 0, 0}
collection = {}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
collection = {("6", "sechs"), ("5", "fuenf"), ("4", "vier")}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
collection = {}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
collection = {("9", "neun"), ("8", "acht"), ("7", "sieben")}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
collection = {}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
alloc: #4, 1864 bytes, cached = {0, 0, 1, 0, 0, 3, 0, 0, 0}
alloc: #0, 0 bytes
Advanced
As promised, we can make the element type allocator aware and show propagation:
#include <boost/container/pmr/string.hpp>
// ...

struct X {
    using allocator_type = pmr::polymorphic_allocator<X>;

    template<typename K, typename V>
    explicit X(K&& key, V&& value, allocator_type a = {})
        : key(std::forward<K>(key), a), value(std::forward<V>(value), a) {}

    pmr::string key, value;
};
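As a quick sanity check, here's a sketch (reusing the tracing_resource from above; the long literals are only there to defeat SSO) showing that constructing an X with an explicit allocator routes the nested string allocations to that resource, which is what happens automatically when the deque constructs elements through its polymorphic_allocator:
tracing_resource tracer;

X probe("a key that is long enough to defeat the small string optimization",
        "a value that is long enough to defeat the small string optimization",
        X::allocator_type(&tracer));

// both nested pmr::strings allocated through the tracer
fmt::print("alloc: #{}, {} bytes\n", tracer.n, tracer.total_bytes);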
Let's amend the test driver to emplace the elements from string literals:
std::vector data1 { std::pair{"1", "eins"}, {"2", "zwei"}, {"3", "drei"}, };
std::vector data2 { std::pair{"4", "vier"}, {"5", "fuenf"}, {"6", "sechs"}, };
std::vector data3 { std::pair{"7", "sieben"}, {"8", "acht"}, {"9", "neun"}, };

auto i = 0;
for (auto const& data : {data1, data2, data3}) {
    for (auto [k,v] : data) {
        (i%2)
            ? collection.emplace_back(k, v)
            : collection.emplace_front(k, v);
    }
For good measure, let's also mutate one of the nested string values:
collection.at(1).value.append(50, '*'); // thwart SSO
report();

collection.at(1).value = "sept";
report();
Again, demoing Live On Compiler Explorer
#include <boost/container/pmr/deque.hpp>
#include <boost/container/pmr/string.hpp>
#include <boost/container/pmr/unsynchronized_pool_resource.hpp>

// debug output
#include <range/v3/all.hpp>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <iomanip>

namespace pmr = boost::container::pmr;

struct tracing_resource : pmr::memory_resource {
    uint64_t n = 0, total_bytes = 0;

    virtual void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        n += 1;
        total_bytes += bytes;
        return pmr::new_delete_resource()->allocate(bytes, alignment);
    }

    virtual void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override {
        if (p) {
            n -= 1;
            total_bytes -= bytes;
        }
        return pmr::new_delete_resource()->deallocate(p, bytes, alignment);
    }

    virtual bool do_is_equal(const memory_resource& other) const noexcept override {
        return pmr::new_delete_resource()->is_equal(other);
    }
};

struct X {
    using allocator_type = pmr::polymorphic_allocator<X>;

    template<typename K, typename V>
    explicit X(K&& key, V&& value, allocator_type a = {})
        : key(std::forward<K>(key), a), value(std::forward<V>(value), a) {}

    pmr::string key, value;

    friend std::ostream& operator<<(std::ostream& os, X const& x) {
        return os << "(" << std::quoted(x.key.c_str()) << ", " << std::quoted(x.value.c_str()) << ")";
    }
};

auto cache_buckets(pmr::unsynchronized_pool_resource& res) {
    using namespace ::ranges;
    return views::iota(0ull)
        | views::take_exactly(res.pool_count())
        | views::transform([&](auto idx) {
              return res.pool_cached_blocks(idx);
          });
}

int main() {
    tracing_resource tracer;
    {
        pmr::unsynchronized_pool_resource res(&tracer);

        auto allocations = [&] {
            fmt::print("alloc: #{}, {} bytes, cached = {}\n", tracer.n, tracer.total_bytes, cache_buckets(res));
        };
        allocations();

        {
            pmr::deque<X> collection(&res);

            auto report = [&] {
                fmt::print("collection = {}\nalloc: #{}, {} bytes, cached = {}\n", collection, tracer.n, tracer.total_bytes, cache_buckets(res));
            };

            std::vector data1 { std::pair{"1", "eins"}, {"2", "zwei"}, {"3", "drei"}, };
            std::vector data2 { std::pair{"4", "vier"}, {"5", "fuenf"}, {"6", "sechs"}, };
            std::vector data3 { std::pair{"7", "sieben"}, {"8", "acht"}, {"9", "neun"}, };

            auto i = 0;
            for (auto const& data : {data1, data2, data3}) {
                for (auto [k,v] : data) {
                    (i%2)
                        ? collection.emplace_back(k, v)
                        : collection.emplace_front(k, v);
                }
                report();

                collection.at(1).value.append(50, '*'); // thwart SSO
                report();

                collection.at(1).value = "sept";
                report();

                collection.clear();
                report();
            }
        }
        allocations();
    }

    fmt::print("alloc: #{}, {} bytes\n", tracer.n, tracer.total_bytes);
}
Prints:
alloc: #0, 0 bytes, cached = {0, 0, 0, 0, 0, 0, 0, 0, 0}
collection = {("3", "drei"), ("2", "zwei"), ("1", "eins")}
alloc: #4, 1864 bytes, cached = {0, 0, 0, 0, 0, 1, 0, 0, 0}
collection = {("3", "drei"), ("2", "zwei**************************************************"), ("1", "eins")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 0, 0, 1, 0, 0, 0}
collection = {("3", "drei"), ("2", "sept"), ("1", "eins")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 0, 0, 1, 0, 0, 0}
collection = {}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 1, 0, 2, 0, 0, 0}
collection = {("6", "sechs"), ("5", "fuenf"), ("4", "vier")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 1, 0, 2, 0, 0, 0}
collection = {("6", "sechs"), ("5", "fuenf**************************************************"), ("4", "vier")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
collection = {("6", "sechs"), ("5", "sept"), ("4", "vier")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 0, 0, 2, 0, 0, 0}
collection = {}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 1, 0, 2, 0, 0, 0}
collection = {("9", "neun"), ("8", "acht"), ("7", "sieben")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 1, 0, 1, 0, 0, 0}
collection = {("9", "neun"), ("8", "acht**************************************************"), ("7", "sieben")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 0, 0, 1, 0, 0, 0}
collection = {("9", "neun"), ("8", "sept"), ("7", "sieben")}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 0, 0, 1, 0, 0, 0}
collection = {}
alloc: #5, 2008 bytes, cached = {0, 0, 0, 1, 0, 2, 0, 0, 0}
alloc: #5, 2008 bytes, cached = {0, 0, 1, 1, 0, 3, 0, 0, 0}
alloc: #0, 0 bytes
Conclusions
While I chose polymorphic allocators because they support scoped_allocator_adaptor<>-style propagation by default, all of the above can be done with statically typed allocators as well.
It does demonstrate that if you use a pooling resource, the deque's block allocations become pooled.
Aside: Pool allocators exist that are able to forgo cleanup, which can be valid in some scenarios, e.g. where the entire memory pool lives on the stack anyway. This is a common allocation optimization technique, allowing large numbers of deallocations/destructions to be skipped.
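For example, here's a sketch of that idea using boost::container::pmr::monotonic_buffer_resource (which the demo above doesn't use): allocations are carved out of a stack buffer, deallocate is a no-op, and everything is released wholesale when the resource goes out of scope:
#include <boost/container/pmr/deque.hpp>
#include <boost/container/pmr/monotonic_buffer_resource.hpp>
#include <cstddef>

namespace pmr = boost::container::pmr;

int main() {
    std::byte buffer[64 * 1024]; // the entire "pool" lives on the stack
    pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer));

    {
        pmr::deque<int> d(&arena);
        for (int i = 0; i < 1000; ++i)
            d.push_back(i);
    } // d's destructor calls deallocate, but the monotonic resource ignores it

    // no cleanup needed: the buffer is simply popped along with the stack frame
    // (anything that doesn't fit the buffer falls back to the upstream resource,
    //  get_default_resource() here)
}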
Addressing some more points from your list:
Q. How can I specify an allocator of blocks for boost::container::deque (blocks, not items!)?
A. I think the allocator is intrinsically always called for blocks,
because of how deques work.
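One way to see this in the test-bed (a sketch that reuses tracing_resource and the headers from the listings, this time with the deque talking to the tracer directly, no pool in between):
tracing_resource tracer;
pmr::deque<int> d(&tracer);

d.push_back(42); // a single element...
fmt::print("alloc: #{}, {} bytes\n", tracer.n, tracer.total_bytes); // ...already costs block-sized allocations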
Q. If there is a way, then will such specification support allocators
with state?
A. Standard library as well as Boost Container should all support
stateful allocators these days. When in doubt, Boost Container has your
back.
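A small sanity check of that statefulness (a sketch; needs <cassert> on top of the headers from the listings): each container simply remembers the resource its allocator was constructed with:
#include <cassert>
// ...

pmr::unsynchronized_pool_resource pool_a, pool_b; // two independent resources
pmr::deque<int> a(&pool_a), b(&pool_b);

assert(a.get_allocator().resource() == &pool_a);  // the state travels with the allocator
assert(b.get_allocator().resource() == &pool_b);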
Q. If yes, then will the above-mentioned allocator go for it?
A. I didn't look closely at it, but you may put it in the same test-bed to find out.
Q. After all, if not in this way, how can I make a deque that will not
deallocate at least one of its freed blocks and will reuse it later when a
new block will be needed?
A. See above. I'm not sure I understand the exact intent of the "at
least one" clause, but I did note that Boost's deque implementation does do
a "private map" allocation — presumably for some block accounting overhead
— that stays around until the deque object is destroyed. This allocation
doesn't happen at (default) construction, but later.
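If you want to observe that in the same test-bed, here's a sketch (again reusing tracing_resource; the exact counts and sizes are implementation details):
tracing_resource tracer;
{
    pmr::deque<int> d(&tracer);
    fmt::print("constructed: #{}, {} bytes\n", tracer.n, tracer.total_bytes); // nothing allocated yet

    d.push_back(1);
    fmt::print("one element: #{}, {} bytes\n", tracer.n, tracer.total_bytes); // block(s) plus bookkeeping

    d.clear();
    fmt::print("cleared    : #{}, {} bytes\n", tracer.n, tracer.total_bytes); // at least the "private map" stays behind
}
fmt::print("destroyed  : #{}, {} bytes\n", tracer.n, tracer.total_bytes);     // everything returned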