Vector of unordered_maps, searching in maps too slow

Question

I wrote a small program that creates a vector of two million maps with some sample data and then queries for some values.

I know I could use a database at this point, but I'm just playing around to get a little bit into performance optimization.

The Code:

#include <iostream>
#include <vector>
#include <unordered_map>
#include <map>
#include <string>
#include <chrono>

using namespace std;

static int NUM_OF_MAPS = 2 * 1000 * 1000;
void buildVector(vector<unordered_map <string, int>> &maps);
void find(string key, int value, vector<unordered_map <string, int>> &maps);

int main() {
    auto startPrg = chrono::steady_clock::now();

    vector<unordered_map <string, int>> maps;
    buildVector(maps);

    for (int i = 0; i < 10; i++) {
        string s(1, 'a'+ i);
        find(s, i, maps);
    }

    auto endPrg = chrono::steady_clock::now();
    cout << "program duration: " << chrono::duration_cast<chrono::microseconds>(endPrg - startPrg).count() / 1000.0 << " ms" << endl;
    return 0;
}

void find(string key, int value, vector<unordered_map <string, int>> &maps) {
    auto start = chrono::steady_clock::now();

    int matches = 0;
    for (unordered_map <string, int> &map : maps) {
        unordered_map<string,int>::const_iterator got = map.find(key);

        if (got != map.end() && got->second == value) {
            matches++;
        }
    }

    auto end = chrono::steady_clock::now();
    cout << matches << " matches for " << key << " = " << value << " in " << chrono::duration_cast<chrono::microseconds>(end - start).count() / 1000.0 << " ms" << endl;
}

void buildVector(vector<unordered_map <string, int>> &maps) {
    auto start = chrono::steady_clock::now();
    maps.reserve(NUM_OF_MAPS);

    int entryCounter = 0;
    unordered_map <string, int> map;
    for (int i = 0; i < NUM_OF_MAPS; i++) {
        map["a"] = entryCounter++;
        map["b"] = entryCounter++;
        map["c"] = entryCounter++;
        map["d"] = entryCounter++;
        map["e"] = entryCounter++;
        map["f"] = entryCounter++;
        maps.push_back(map);
        entryCounter %= 100;
    }

    auto end = chrono::steady_clock::now();
    cout << "build vector: " << chrono::duration_cast<chrono::microseconds>(end - start).count() / 1000.0 << " ms (" << maps.size() << ")" << endl;
}

Output:

build vector: 697.381 ms (2000000)
40000 matches for a = 0 in 67.873 ms
40000 matches for b = 1 in 64.176 ms
40000 matches for c = 2 in 60.484 ms
40000 matches for d = 3 in 68.102 ms
40000 matches for e = 4 in 62.71 ms
40000 matches for f = 5 in 65.723 ms
0 matches for g = 6 in 64.407 ms
0 matches for h = 7 in 45.401 ms
0 matches for i = 8 in 65.307 ms
0 matches for j = 9 in 64.371 ms
program duration: 1326.42 ms

I did the same in Java just for comparison of the speed and got the following result:

build vector: 2536.971578 ms (2000000)
40000 matches for a = 0 in 59.293339 ms
40000 matches for b = 1 in 56.306123 ms
40000 matches for c = 2 in 53.503208 ms
40000 matches for d = 3 in 51.174979 ms
40000 matches for e = 4 in 50.967731 ms
40000 matches for f = 5 in 53.68969 ms
0 matches for g = 6 in 41.927401 ms
0 matches for h = 7 in 36.160645 ms
0 matches for i = 8 in 33.535616 ms
0 matches for j = 9 in 36.56883 ms
program duration: 3016.979919 ms

While C++ is much faster for creating the data, it's very slow in the query part (compared to Java). Is there any way for C++ to also beat Java in that part?

Java Code:

static int NUM_OF_MAPS = 2 * 1000 * 1000;

public static void run() {
    long startPrg = System.nanoTime();

    List<Map<String,Integer>> maps = new ArrayList<>(NUM_OF_MAPS);
    buildVector(maps);

    for (int i = 0; i < 10; i++) {
        String s = String.valueOf((char)('a' + i));
        find(s, i, maps);
    }

    long endPrg = System.nanoTime();
    System.out.println("program duration: " + (endPrg - startPrg) / 1000000.0 + " ms");
}


static void find(String key, Integer value, List<Map<String,Integer>> maps) {
    long start = System.nanoTime();

    int matches = 0;
    for (Map<String,Integer> map : maps) {
        Integer got = map.get(key);

        if (got != null && got.equals(value)) {
            matches++;
        }
    }

    long end = System.nanoTime();
    System.out.println(matches + " matches for " + key + " = " + value + " in " + (end - start) / 1000000.0 + " ms");
}

static void buildVector(List<Map<String,Integer>> maps) {
    long start = System.nanoTime();

    int entryCounter = 0;
    Map<String,Integer> map = new HashMap<>();
    for (int i = 0; i < NUM_OF_MAPS; i++) {
        map.put("a", entryCounter++);
        map.put("b", entryCounter++);
        map.put("c", entryCounter++);
        map.put("d", entryCounter++);
        map.put("e", entryCounter++);
        map.put("f", entryCounter++);
        maps.add(new HashMap<>(map));
        entryCounter %= 100;
    }

    long end = System.nanoTime();
    System.out.println("build vector: " + (end - start) / 1000000.0 + " ms (" + maps.size() + ")");
}

Edit: Sry copied the Java code twice instead of the C++ code.

There are many reasons why performance varies between languages, but C++ in particular allows you to optimise in many ways - e.g. for safety, speed or space. It would be useful to know the compiler/flags you're using for your benchmark — Josh Greifer, May 07 '19 at 15:22
g++ -O3 -Wall -c -fmessage-length=0 -MMD -MP -MF"src/Benches.d" -MT"src/Benches.o" -o "src/Benches.o" "../src/Benches.cpp" — ue7m, May 07 '19 at 15:26
`find` unnecessarily copies a `std::string`. Not sure if that has an impact on performance. — nwp, May 07 '19 at 15:32
[This answer](https://stackoverflow.com/a/42588384/1548468) seems to point out that `std::unordered_map` is not always as good as it can be. — Botje, May 07 '19 at 15:34
I wonder if optimizing for size (`-Os`) would worsen or improve the results. It could happen that optimizing for size would lower number of cache-misses. — Yksisarvinen, May 07 '19 at 15:36
@nwp I doubt it, as there are only 10 calls to `find`, with small strings — Caleth, May 07 '19 at 15:38
@Caleth actually find is called 20 mil times (10 loops over 2 mil maps). @ UmNyobe gcc 8.1.0 but g++ 7.4.0 — ue7m, May 07 '19 at 15:42
@ue7m `::find`, which takes `std::string` is called 10 times. `std::map::find`m which takes `const std::string &` is called 20 million times — Caleth, May 07 '19 at 15:46

UmNyobe · Accepted Answer · 2019-05-07T16:03:12.323

5

The c++ code is not too slow. The java code is better optimized hashwise.

In c++, it is unordered_map which is responsible for computing the hash. So each container in your collection will hash the string during unordered_map<string,int>::const_iterator got = map.find(key).
In java, the HashMap relies on the hashCode method of object. Thing is, String class can compute the hash only at initialization and when the string is modified.

In terms of hash(string) -> int computations, your find method in c++ is O(NUM_OF_MAPS), while in java it is O(1).

edited May 07 '19 at 16:03

answered May 07 '19 at 15:57

UmNyobe

22,539
9
61
90

ps : add a prefix to each string and note the difference. c++ timings for find will worsen, while the one for java will not change much. – UmNyobe May 07 '19 at 16:15
1

It might be interesting to add that C++20 will add overloads to `find()` that allow passing a pre-computed hash. Thus the `for` loop in OP's `find` will be able to compute the hash once per string. – spectras May 07 '19 at 16:17
@UmNyobe That's might be it. Did some tests with other map implementations (like robin_map) and got better results (also better than the java ones). – ue7m May 08 '19 at 04:08

score 0 · Answer 2 · answered May 07 '19 at 16:23

To add to UmNyobe's answer you could improve the performance by creating your own string type which caches the calculated hash values:

class hashed_string : public std::string
{
public:
  hashed_string( const std::string& str )
  : std::string( str ), hash( std::hash( str ) )
  {
  }

  size_t getHash() { return hash; }

private:
  size_t hash;
};

namespace std
{
    template<> struct hash< hashed_string >
    {
        typedef hashed_string argument_type;
        typedef std::size_t result_type;
        result_type operator()(argument_type const& s) const noexcept
        {
          return s.getHash();
        }
    };
}

You would need to expand the implementation of hashed_string to prevent modification of the underlying string or recalculate the hash when the string is modified. This might be easier to implement by making the string a member rather than a base class.

That's completely broken: you allow manipulating `hashed_string` through a `std::string` reference, but `std::string` has no virtual methods. **Don't inherit std types**, wrap them instead, like you suggest in the very last sentence of your answer. — spectras, May 07 '19 at 16:27

Vector of unordered_maps, searching in maps too slow

2 Answers2