Suppose I am using std::unordered_map<std::string, Foo>
in my code. It's nice and convenient, but unfortunately every time I want to do a lookup (find()
) in this map I have to come up with an instance of std::string
.
For instance, let's say I'm tokenizing some other string and want to call find()
on every token. This forces me to construct an std::string
around every token before looking it up, which requires an allocator (std::allocator
, which amounts to a CRT malloc()
). This can easily be slower than the actual lookup itself. It also contends with other threads since heap management requires some form of synchronization.
A few years ago I found the Boost.intrusive library; it was just a beta version back then. The interesting thing was it had a container called boost::intrusive::iunordered_set
which allowed code to perform lookups with any user-supplied type.
I'll explain it how I'd like it to work:
struct immutable_string
{
const char *pf, *pl;
struct equals
{
bool operator()(const string& left, immutable_string& right) const
{
if (left.length() != right.pl - right.pf)
return false;
return std::equals(right.pf, right.pl, left.begin());
}
};
struct hasher
{
size_t operator()(const immutable_string& s) const
{
return boost::hash_range(s.pf, s.pl);
}
};
};
struct string_hasher
{
size_t operator()(const std::string& s) const
{
return boost::hash_range(s.begin(), s.end());
}
};
std::unordered_map<std::string, Foo, string_hasher> m;
m["abc"] = Foo(123);
immutable_string token; // token refers to a substring inside some other string
auto it = m.find(token, immutable_string::equals(), immutable_string::hasher());
Another thing would be to speed up the "find and insert if not found" use case—the trick with lower_bound()
only works for ordered containers. The intrusive container has methods called insert_check()
and insert_commit()
, but that's for a separate topic I guess.