
This is the third consecutive time I am writing about UVa 100 - The 3n + 1 problem, which I have solved. I first implemented a simple hash by creating an array of size 1000000 and storing the results in it; then I tried a C++ unordered_map for the same memoization, and the time difference between the two is huge for the same input.

Here is my source code:

I have tried pasting my code here, but every time it gets shattered, so I have pasted it externally instead. Sorry :(

Here is the time taken by 1. the simple array, 2. std::map, and 3. unordered_map:

Screenshot of my terminal

So why is there such a huge time difference between the unordered_map and the simple array? I know the simple map is slow, but the difference is about 3 minutes :(

Deepanshu
  • Regarding code shattering: replace tabs with spaces and try again. – Angew is no longer proud of SO Feb 09 '15 at 12:47
  • Unordered map would be much slower than arrays because it needs memory allocations. But it is significantly slower in your case possibly because the default hash function is colliding a lot. – Mohit Jain Feb 09 '15 at 12:47
  • What does the profiler say? (My guess would be the cost of allocations and the lack of locality, but that's just a guess.) – James Kanze Feb 09 '15 at 12:52
  • I have never used a profiler in my life, and I will have to read up on it a lot before I can try this and report back, so give me some time :( – Deepanshu Feb 09 '15 at 13:08

1 Answer


Given a problem where you can use an array interchangeably with a hash table, the array will definitely be a lot faster. `m[n]` on an array is just pointer arithmetic and a dereference, whereas on an unordered_map it involves computing a hash, finding the right bucket, and doing some equality comparisons. Also, the way you're looking up your indices gives you massive cache-locality benefits with the array, and those do not apply to the unordered_map. I would fully expect the array to win every time; it simply has less work to do.
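
To make that concrete, here is a minimal sketch of what each lookup has to do. This is my own illustration, not code from the question; memo_array, memo_map, and the lookup functions are placeholder names:

#include <unordered_map>
#include <vector>

// Illustrative memo tables; the array mirrors the question's table of size 1000000.
std::vector<int> memo_array(1000000, 0);    // one contiguous, cache-friendly block
std::unordered_map<int, int> memo_map;      // heap-allocated buckets, keys must be hashed

int lookup_array(int n) {
    return memo_array[n];                   // address computation + one dereference
}

int lookup_map(int n) {
    auto it = memo_map.find(n);             // hash(n), pick a bucket, compare keys
    return it != memo_map.end() ? it->second : 0;
}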

That said, your unordered_map implementation could be improved a lot:

while(n != 1) {
    if(m1[n] != 0) {     // first hash lookup for key n
        res += m1[n];    // second hash lookup for the same key
    }
}

That's doing two hash lookups for the same index. You should look it up just once (which, by the way, you already do for std::map...):

while (n != 1) {
    auto it = m1.find(n);          // single hash lookup
    if (it != m1.end()) {
        res += it->second;         // reuse the iterator instead of looking up again
    }
}

And same for your printing loop:

if (res < m1[i]) res = m1[i];

Should be:

int m1i = m1[i];             // one lookup, cached in a local
if (res < m1i) res = m1i;
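
For completeness, here is how the single-lookup idea can drive the whole cycle-length computation. This is only a sketch of a standard recursive memoization under my own names (cycle_length, with m1 mapping a starting value to its cycle length); it is not a reconstruction of the code in the question:

#include <unordered_map>

long long cycle_length(long long n, std::unordered_map<long long, long long>& m1) {
    if (n == 1) return 1;
    auto it = m1.find(n);                    // single lookup
    if (it != m1.end()) return it->second;   // reuse the cached cycle length
    long long next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    long long len = 1 + cycle_length(next, m1);
    m1[n] = len;                             // memoize before returning
    return len;
}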

And lastly, this line is undefined behavior:

i = ( i + j ) - ( j = i );
                ^^^^^^^^^

The modification of j in the second sub-expression is unsequenced with respect to the read of j in the first sub-expression.
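
If you want the swap without the undefined behavior, either use std::swap or spell it out with a temporary; both are well defined:

#include <utility>   // std::swap

std::swap(i, j);     // the idiomatic one-liner

// or, equivalently, with an explicit temporary:
int tmp = i;
i = j;
j = tmp;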

Barry
  • i = ( i + j ) - ( j = i ); This line is a one-line swap, but I will not include it in my practice any more. – Deepanshu Feb 09 '15 at 13:04
  • `auto it = m1.find(n);` Also, if I am finding the value using the find function, then why am I creating a hash?? – Deepanshu Feb 09 '15 at 13:06
  • @geeksoul it tries to do a one-line swap, but since it's undefined behavior, who knows what it actually does. Could work. Could not work. So you absolutely need to change that practice. Just do std::swap(i, j); – nos Feb 09 '15 at 13:07
  • @geeksoul `m1[n]` also does a hash lookup. It's basically a `find()` that, if it doesn't find anything, inserts an element with a default value. – Barry Feb 09 '15 at 13:09
  • If m1[n] is also a find, then definitely it should be slower than arr1[n], because that is just arr + n?? If all of the above is true, then why use unordered_map at all? – Deepanshu Feb 09 '15 at 13:13
  • @geeksoul You use it when you can't use an array. What if your keys are non-contiguous integers? You're not just going to make a HUGE array. What if your keys can be negative? What if your keys aren't even numbers? – Barry Feb 09 '15 at 13:22
  • @Barry thanks, I got it. In my case it is better to use an array, and if my hash table had a size of 10^9, I should use a map. Am I right? – Deepanshu Feb 09 '15 at 14:26
  • @geeksoul The size doesn't matter, only the distribution of keys. If you need to store every integer from 0 to 10^9, definitely use an array. If you only need to store, say, 1, 50, 2000, and 10^9, definitely use a hash table, since an array would be a ridiculous waste of memory otherwise. – Barry Feb 09 '15 at 15:45