
I can't understand the logic here. I have two containers with the same element count: a vector and an unordered_map.

I'm using a function that checks the time complexity of some function and returns the value in milliseconds:

auto funcComplexity = [](auto&& func, auto&&... params)
{
    const auto& start = high_resolution_clock::now();
    for (auto i = 0; i < 100; ++i) 
    {
        // function invocation using perfect forwarding
        std::forward<decltype(func)>(func)(std::forward<decltype(params)>(params)...);
    }
    const auto& stop = high_resolution_clock::now();
    
    return duration_cast<duration<double, std::milli>>(stop - start).count();
};

When I erase an element from the middle of the vector, it actually takes less time than erasing an element from the unordered_map:

#include <algorithm>      // std::remove_if
#include <chrono>
#include <iostream>
#include <unordered_map>
#include <vector>

using namespace std;
using namespace std::chrono;

// Test is a user-defined type (not shown in the post) that is constructible from an int
// and has an int mId member and a DoSomething() method.

void iterateUMap(std::unordered_map<int, Test> map)
{
    map.erase(500);
}

void iterateVector(std::vector<Test> vector)
{
    std::remove_if(vector.begin(), vector.end(), [](Test& val)
        {
            return val.mId == 500;
        });
}
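
Side note, also raised in the comments below: std::remove_if on its own only moves the surviving elements to the front and returns the new logical end; the vector's size does not change. A minimal sketch of actually erasing the element, using the same Test type as above (the function name here is just illustrative, and the C++20 std::erase_if line is shown as a commented alternative):

// Sketch only: the erase-remove idiom, so the vector really shrinks.
void eraseFromVector(std::vector<Test> vector)
{
    vector.erase(std::remove_if(vector.begin(), vector.end(),
                                [](const Test& val) { return val.mId == 500; }),
                 vector.end());

    // C++20 alternative, same effect in one call:
    // std::erase_if(vector, [](const Test& val) { return val.mId == 500; });
}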

int main()
{
    auto funcComplexity = [](auto&& func, auto&&... params)
    {
        const auto& start = high_resolution_clock::now();
        for (auto i = 0; i < 10000; ++i) 
        {
            // function invocation using perfect forwarding
            std::forward<decltype(func)>(func)(std::forward<decltype(params)>(params)...);
        }
        const auto& stop = high_resolution_clock::now();
    
        return duration_cast<duration<double, std::milli>>(stop - start).count();
    };

    std::unordered_map<int, Test> uMap;

    for(int i = 0; i < 1000; i++)
    {
        uMap.try_emplace(i, Test(i));
    }

    std::vector<Test> vector;
    vector.reserve(1000);
    for (int i = 0; i < 1000; i++)
    {
        vector.emplace_back(i);
    }

    cout << funcComplexity(iterateUMap, uMap) << endl;

    cout << endl;

    cout << funcComplexity(iterateVector, vector) << endl;
      
    return 0;
}

The output of those two functions:

For the vector: 52.6565 milliseconds

For the unordered_map: 6740.64 milliseconds

Why is erasing from the unordered_map actually slower than erasing from the vector? The same thing happens when I just want to get an element. For example:

void iterateUMap(std::unordered_map<int, Test> map)
{
    map[500].DoSomething();
}

void iterateVector(std::vector<Test> vector)
{
    for (auto& value : vector)
    {
        if(value.mId == 500)
        {
            value.DoSomething();
            break;
        }
    }
}
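
For reference only: the manual loop is essentially what std::find_if does, and on the map side, find avoids operator[]'s behaviour of inserting a default-constructed element when the key is missing. A sketch with the same Test type (the function names are just illustrative):

// Sketch only: equivalent lookups.
void lookupUMap(std::unordered_map<int, Test>& map)
{
    // find() does not insert a default-constructed Test when the key is absent, unlike operator[].
    auto it = map.find(500);
    if (it != map.end())
        it->second.DoSomething();
}

void lookupVector(std::vector<Test>& vector)
{
    auto it = std::find_if(vector.begin(), vector.end(),
                           [](const Test& val) { return val.mId == 500; });
    if (it != vector.end())
        it->DoSomething();
}
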
  • "*I'm using function that checks time complexity of some function and returns value in milliseconds*" That's not how time complexity works. Time complexity is not measured in milliseconds; it's an asymptotic limit. It is a representation of the number of operations performed as N goes to infinity. It represents orders of magnitude, not absolute performance. – Nicol Bolas Jan 14 '23 at 23:39
  • Try it with a larger vector / map. 1000 elements / entries is pretty small. – Paul Sanders Jan 15 '23 at 00:10
  • Sidenote, [`remove_if` doesn't remove anything from list it just moves them to end](https://stackoverflow.com/questions/4478636/stdremove-if-lambda-not-removing-anything-from-the-collection) – Ranoiaetep Jan 15 '23 at 01:00
  • @Ranoiaetep: Technically, it doesn't move them to the end; it moves them *from* the end, overwriting the original. What is left at the end is in a moved-from state (assuming they're moveable; otherwise they're copied). – Nicol Bolas Jan 15 '23 at 01:13
  • Related (but does not go into deletion): [Why is vector faster than unordered_map?](https://stackoverflow.com/questions/55451825/) – JaMiT Jan 15 '23 at 03:16
  • Your `funcComplexity()` function measures time, not time complexity. To measure time complexity, you would need time measurements for various container sizes (think big, like covering sizes from `100` to at least `1,000,000`). Then you would need a collection of candidate complexity functions and determine which candidate is the best fit to the empirical data. *This process is inexact. You would probably be better off determining the growth function theoretically, then using the time measurements to verify the theoretical result to a reasonable degree of certainty.* – JaMiT Jan 15 '23 at 03:23
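
The approach described in the last comment can be sketched in code roughly like this; the sizes, the int payload and the use of steady_clock are arbitrary choices for illustration, not taken from the original post:

// Rough sketch: time erasing every key from an unordered_map at several sizes
// and report the average cost per erase, to see how it grows with n.
#include <chrono>
#include <iostream>
#include <unordered_map>

int main()
{
    using namespace std::chrono;

    for (int n : {100, 1'000, 10'000, 100'000, 1'000'000})
    {
        std::unordered_map<int, int> map;
        map.reserve(n);
        for (int i = 0; i < n; ++i)
            map.emplace(i, i);

        const auto start = steady_clock::now();
        for (int i = 0; i < n; ++i)
            map.erase(i);                     // n erases by key
        const auto stop = steady_clock::now();

        std::cout << "n = " << n << ": "
                  << duration<double, std::micro>(stop - start).count() / n
                  << " us per erase\n";
    }
}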

1 Answer


You've just discovered one of the secrets about modern computer hardware.

Modern CPUs are very, very good at accessing memory linearly, you know, like how data in vectors happens to get stored. Accessing X linear memory addresses is several orders of magnitude faster than accessing X random memory addresses; and X does not have to become very big before this is observed.

This is because, in general, memory gets retrieved from RAM in comparatively large blocks and then stored in fast, on-CPU cache. Once a large block of memory is cached, accessing the next byte or word will hit the fast on-CPU cache rather than result in a comparatively slow RAM access. And even before the next byte or word gets accessed, while the CPU is busy digesting the first couple of bytes, a different part of the CPU will proactively fetch the next cache line, anticipating future memory accesses.
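
To make the cache-locality point concrete, here is a rough, self-contained sketch (not the original benchmark): one linear pass over a contiguous vector versus one pass over the nodes of an unordered_map of the same size. The element count and clock choice are arbitrary:

// Rough sketch: contiguous traversal vs. node-based traversal.
#include <chrono>
#include <iostream>
#include <numeric>
#include <unordered_map>
#include <vector>

int main()
{
    constexpr int n = 1'000'000;

    std::vector<int> vec(n);
    std::iota(vec.begin(), vec.end(), 0);

    std::unordered_map<int, int> map;
    map.reserve(n);
    for (int i = 0; i < n; ++i)
        map.emplace(i, i);

    auto timeMs = [](auto&& body)
    {
        const auto start = std::chrono::steady_clock::now();
        body();
        const auto stop = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(stop - start).count();
    };

    long long sum1 = 0;
    long long sum2 = 0;

    // Contiguous storage: each cache line fetched from RAM serves many elements,
    // and the hardware prefetcher can stream the following lines ahead of time.
    const double vecMs = timeMs([&] { for (int v : vec) sum1 += v; });

    // Node-based storage: each element lives in its own heap allocation, so
    // consecutive accesses tend to land on unrelated cache lines.
    const double mapMs = timeMs([&] { for (const auto& kv : map) sum2 += kv.second; });

    std::cout << "vector pass:        " << vecMs << " ms (sum " << sum1 << ")\n"
              << "unordered_map pass: " << mapMs << " ms (sum " << sum2 << ")\n";
}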

Sam Varshavchik
  • So in any case I should rely on Big O notation? Is there any tools that could help to check time complexity of my C++ codes? – Windings-Lab Jan 14 '23 at 23:50
  • @Windings-Lab The way you test the performance of your code is with a profiler. – Stephen Newell Jan 15 '23 at 00:02
  • @Windings-Lab: "*Is there any tools that could help to check time complexity of my C++ codes?*" Again that's *not how that works*. Time complexity is a measure of an algorithm, not of "my C++ codes". *Performance* is a measure of what "my C++ codes" do. Time complexity is *useful* as a tool for deciding on what algorithms to use. But it should never be treated as the *only* determining factor when making performance decisions. – Nicol Bolas Jan 15 '23 at 00:35