How expensive is the "key" function of a Ruby hash?

Question

I'm using a couple of Hashes where the values of some hashes are the key of others.

I need to use key a couple of times to get the key for a value so that I can use it to access something in another hash.

I was wondering what kind of a performance impact this could possibly have. In my situation, these hashes are few in number and the contents are small, but I want to know theoretically.

Should I avoid using this too much? How does it perform compared to getting the value for a key?

the Tin Man · Accepted Answer · 2013-05-02T14:19:31.553

Think of it this way: you're occasionally doing an extra step to get the value. That's what happens any time you use a conditional test and add a couple steps to a computation.

It's obvious there's a little overhead associated with it, but worrying about it at this point is premature optimization. You CAN get a feel for the difference, by using the Benchmark class to test your alternate way of getting the hash key, vs. the normal way.

I suspect you'll have to do several million loops to see an appreciable difference.

Here is how I create the reverse mapping mentioned by @fontanus:

hash = {a:1, b:2}
hash.merge!(Hash[hash.values.zip(hash.keys)])

Which results in:

{
    :a => 1,
    :b => 2,
     1 => :a,
     2 => :b
}

It can also be done by coercing the hash into an array, flattening it and reversing it and then turning it back into a hash, but I find this less intuitive than the above. YMMV.

hash.merge!(Hash[*hash.to_a.flatten.reverse])

@steenslag reminded me of Hash.invert. I knew there was something but couldn't remember the method name:

>> hash.merge!(hash.invert)
{
    :a => 1,
    :b => 2,
     1 => :a,
     2 => :b
}

Give him an upvote for that!

Previous optimization, along with other things, is the root of all evil. — fotanus, May 02 '13 at 13:28
Hash has a method for this: [hash.invert](http://ruby-doc.org/core-2.0/Hash.html#method-i-invert) — steenslag, May 02 '13 at 14:10
+1 HA! That's what it is. I knew there was one, but couldn't remember it. You could see my Perl leaking into my answer. :-) — the Tin Man, May 02 '13 at 14:16

fotanus · Answer 2 · 2013-05-02T13:42:41.570

1

Searching in ruby 1.9.3 and 2.0.0 are O(n) operations.

static VALUE
rb_hash_key(VALUE hash, VALUE value)
{
    VALUE args[2];

    args[0] = value;
    args[1] = Qnil;

    rb_hash_foreach(hash, key_i, (VALUE)args);

    return args[1];
}

Implementation of rb_hash_foreach:

void
rb_hash_foreach(VALUE hash, int (*func)(ANYARGS), VALUE farg)
{
    struct hash_foreach_arg arg; 

    if (!RHASH(hash)->ntbl)
        return;
    RHASH_ITER_LEV(hash)++;
    arg.hash = hash;
    arg.func = (rb_foreach_func *)func;
    arg.arg  = farg;
    rb_ensure(hash_foreach_call, (VALUE)&arg, hash_foreach_ensure, hash);
}

Yet, your hashes are small. @theTinMan is correct about being premature optimization, and you should not worry about it.

edited May 02 '13 at 13:42

answered May 02 '13 at 12:55

fotanus

19,618
13
77
111

1

(1) This is not true. In Ruby 1.9, which is previous to Ruby 2.0, hash is ordered. (2) How does the fact that the hash is ordered or not have impact on `key`? I don't see any reason. – sawa May 02 '13 at 13:05
@sawa (1) thanks you are correct, it is not Ruby 2, it is 1.9.2. - wrong number 2. (2) You can't find an element in an unordered list faster than a linear search, but you can use other search algorithms if you know that your list is ordered – fotanus May 02 '13 at 13:08
2

In what sense are taking "ordered"? In the context where Ruby hash is said to be ordered, it does not mean that the keys are "sorted". It means that the order of the keys inserted is preserved. – sawa May 02 '13 at 13:11
@sawa Hum... you mean that the structure is not hold in order, but every time you want to iterate it is sorterd? Seems a bit odd, but can be... Anyway, striking the first sentence since I'm not sure. – fotanus May 02 '13 at 13:13
You don't need to use strike-out to change your text, just delete it. Stack Overflow maintains the history of all edits so if anyone was concerned about a deletion they can check it. Seeing edits is a privilege gained as your points accumulate. – the Tin Man May 02 '13 at 13:22
"This will increase the mess of your code, but will decrease the runtime.". That's debatable. I often use forward and reverse lookups inside one hash, as long as I'm sure there will be no collisions. The advantage is knowing I only have one place to look, simplifying the logic, however I'd never consider it a speed-up, which would be minimal at best. That the hash has two different types of values, forward and back, is inconsequential because it doesn't make Ruby work any harder, and I'm treating the hash as a lookup table, which it is anyway. – the Tin Man May 02 '13 at 13:28
makes sense, and I agree that hardly you will see a performance increase, but the OP didn't revealed to us his hash sizes, and so I'm trusting that he is having performance problems. Anyway, the question is "How expensive is the “key” function of a Ruby hash?", the rest of my answer is just to give some guidance in case of need. – fotanus May 02 '13 at 13:31
@theTinMan can you read the last sentence again and see if you have comments on it? – fotanus May 02 '13 at 13:33
1

@fotanus “every time you want to iterate it is sorted?” – no. It is never sorted automatically. Being ordered means that each traversal yields the elements in the same order (which wasn't the case before 1.9), namely the order in which the elements were inserted. – Patrick Oscity May 02 '13 at 13:37
@fotanus, `"the OP didn't revealed to us his hash sizes, and so I'm trusting that he is having performance problems"`. No, the OP said `"...these hashes are few in number and the contents are small, but I want to know theoretically"`. That's why it's premature optimization. – the Tin Man May 02 '13 at 13:41
@fotanus I hope you won't take offense in this, but I accepted the Tin Man's answer because it was more meaningful to me than simply a time complexity. I'm ashamed to say that I don't understand time complexity **yet** (I really want to learn), but many other user's don't either. The Tin Man gave a little more explanation etc... – MarioDS May 02 '13 at 13:57
@MarioDeSchaepmeester no offences taken, you should accept the one that was most useful for you. It doesn't mean that the others are useless. I'm sure that other people that end up in this page would like to know the complexity. (as I wanted to know when read your question) – fotanus May 02 '13 at 14:03
@fotanus that's what I was hoping you would know ;) – MarioDS May 02 '13 at 14:04

How expensive is the "key" function of a Ruby hash?

2 Answers2