Background: I'm new to C++. I have a std::map
and am trying to search for elements by key.
Problem: Performance. The map::find()
function slows down when the map gets big.
Preferred approach: I often know roughly where in the map the element should be; I can provide a [first,last)
range to search in. This range is always small w.r.t. the number of elements in the map. I'm interested in writing a short binary search utility function with boundary hinting.
Attempt: I stole the below function from https://en.cppreference.com/w/cpp/algorithm/lower_bound and did some rough benchmarks. This function seems to be much slower than map::find()
for maps large and small, regardless of the size or position of the range hint provided. I replaced the comparison statements (it->first < value
) with a comparison of random int
s and the slowdown appeared to resolve, so I think the slowdown may be caused by the dereferencing of it->first
.
Question: Is the dereferencing the issue? Or is there some kind of unnecessary copy/move action going on? I think I remember reading that maps don't store their element nodes sequentially in memory, so am I just getting a bunch of cache misses? What is the likely cause of the slowdown, and how would I go about fixing it?
/* @param first Iterator pointing to the first element of the map to search.
* @param distance Number of map elements in the range to search.
* @param key Map key to search for. NOTE: Type validation is not a concern just yet.
*/
template<class ForwardIt, class T>
ForwardIt binary_search_map (ForwardIt& first, const int distance, const T& key) {
ForwardIt it = first;
typename std::iterator_traits<ForwardIt>::difference_type count, step;
count = distance;
while (count > 0) {
it = first;
step = count/2;
std::advance(it, step);
if (it->first < value) {
first = ++it;
count -= step + 1;
}
else if (it->first > value)
count = step;
else {
first = it;
break;
}
}
return first;
}