
Possible Duplicate:
Ruby method Array#<< not updating the array in hash
Strange ruby behavior when using Hash.new([])

I've been working through the Koans, which are great, and so far I've had no major trouble. But I stumbled upon this and can't make any sense of it:

    def test_default_value_is_the_same_object
        hash = Hash.new([])

        hash[:one] << "uno"
        hash[:two] << "dos"

        assert_equal ["uno", "dos"], hash[:one]   # But I only put "uno" for this key!
        assert_equal ["uno", "dos"], hash[:two]   # But I only put "dos" for this key!
        assert_equal ["uno", "dos"], hash[:three] # I didn't shove anything for :three!

        assert_equal true, hash[:one].object_id == hash[:two].object_id 
    end

All the tests are passing (I just looked at the error which helped me guess the right assertions to write).

The last assert makes sense to me: neither key was initialized, so both return the default and their values must have the same object ID.

I don't understand why the default value was altered, and I'm not even entirely sure that's what happened.

I tried it out in IRB, thinking maybe some tampering on Hash/Array was done to make me crazy, but I get the same result.

I first thought hash[:one] << "uno" would make the hash become { one: ["uno"] }, but it remains { }.
Although I'm guessing << only calls push, and new keys are only added when you use the = sign.

Please tell me what I missed.

EDIT: I'm using Ruby 1.9.3

Louis Kottmann
  • There is only *one* array: the value (`[]`) is evaluated *before* the `new` method is called. Try the "default form" that takes a block (to create a *new* array for real each time the block is yielded/called). I am sure this is a duplicate .. – Nov 11 '12 at 23:03
  • Yeah, got bitten by this a couple of times too :) – Sergio Tulentsev Nov 11 '12 at 23:04
  • @pst could you elaborate? I'm not sure I'm following. And I didn't find any similar questions in SO/Google/Duckduckgo but I can always miss stuff ;) – Louis Kottmann Nov 11 '12 at 23:06
  • @pst it is weird because my hash remains empty, but in the question you linked, his hash gets populated. Maybe it was modified in Ruby since? – Louis Kottmann Nov 11 '12 at 23:08
  • Mmmmh maybe I'm getting it: is it because we didn't add the key to the hash, so `hash[:whatever]` returns the default value of `[]` and that value is modified by `<<` (which is really a `push`, so does not call `=`)? – Louis Kottmann Nov 11 '12 at 23:13
  • @Baboon: Yes, that's why. See my answer for details. – Gregory Brown Nov 11 '12 at 23:26

2 Answers


When you use the default argument for a Hash, the same object is used for all keys that have not been explicitly set. This means that only one array is being used here, the one you passed into Hash.new. See below for evidence of that.

    >> h = Hash.new([])
    => {}
    >> h[:foo] << :bar
    => [:bar]
    >> h[:bar] << :baz
    => [:bar, :baz]
    >> h[:foo].object_id
    => 2177177000
    >> h[:bar].object_id
    => 2177177000

The weird thing, as you found, is that if you inspect the hash, it is empty! This is because only the default object has been modified; no keys have been assigned yet.
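To see where the pushed values actually went, you can ask the hash for its default object. This is only a minimal sketch using the same `Hash.new([])` setup as above; the key `:anything` is an arbitrary name chosen for illustration.

```ruby
# Sketch: the shared default array is mutated, but no key is ever stored.
h = Hash.new([])
h[:foo] << :bar
h[:bar] << :baz

p h.empty?    # true: still no keys in the hash
p h.keys      # []
p h.default   # [:bar, :baz], the one array that was passed to Hash.new

# Every missing key hands back that very same object.
p h[:anything].equal?(h.default)  # true
```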

Fortunately, there is another way to do default values for hashes. You can provide a default block instead:

    >> h = Hash.new { |h,k| h[k] = [] }
    => {}
    >> h[:foo] << :bar
    => [:bar]
    >> h[:bar] << :baz
    => [:baz]
    >> h[:foo].object_id
    => 2176949560
    >> h[:bar].object_id
    => 2176921940

When you use this approach, the block runs whenever a missing key is looked up, and it receives the hash itself and the key as arguments. By assigning the default value within the block, you can be sure that a new object will get created for each distinct key, and that the assignment will happen automatically. This is the idiomatic way of creating a "Hash of Arrays" in Ruby, and is generally safer to use than the default argument approach.
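One consequence worth knowing about is that, because the block assigns, even a plain read of a missing key stores an entry. The sketch below illustrates this; the names `lists` and `:unused` are made up for the example.

```ruby
# Sketch: with an assigning default block, each missing key gets its own stored array.
lists = Hash.new { |hash, key| hash[key] = [] }

lists[:foo] << :bar
lists[:bar] << :baz

p lists.keys           # [:foo, :bar]
p lists[:foo]          # [:bar]

# key? checks membership without triggering the default block.
p lists.key?(:unused)  # false

lists[:unused]         # a plain read runs the block and stores []
p lists.key?(:unused)  # true
```

If you want to check for a key without creating it, `key?` or `fetch` with a fallback avoids running the block.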

That said, if you're working with immutable values (like numbers), doing something like Hash.new(0) is safe, as you'll only change those values by re-assignment. But because I prefer to keep fewer concepts in my head, I pretty much use the block form exclusively.
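As a rough illustration of why an immutable default is harmless, here is a small counting sketch. The word list and the name `counts` are assumptions made up for the example.

```ruby
# Sketch: Hash.new(0) as a counter. += reassigns, so the default 0 is never mutated.
counts = Hash.new(0)

%w[uno dos uno].each { |word| counts[word] += 1 }

p counts.keys          # ["uno", "dos"]
p counts["uno"]        # 2
p counts["tres"]       # 0, the default, returned without storing a key
p counts.key?("tres")  # false
p counts.default       # 0
```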

Gregory Brown

When you do

    h = Hash.new(0)
    h[:foo] += 1

you are directly modifying h. h[:foo] += 1 is the same as h[:foo] = h[:foo]+1. h[:foo] is being assigned 0+1.

When you do

    h = Hash.new([])
    h[:foo] << :bar

you are modifying the object returned by h[:foo], which is [], the default value of h, and not a value stored under any key of h. After that, [] has become [:bar], so the default value of h is now [:bar], but no entry for :foo has been stored in h.
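The two cases can be put side by side with an array default. This is just a sketch: `h[:foo] += [:baz]` is used here as the array analogue of `h[:foo] += 1`, relying on `Array#+` returning a new array.

```ruby
# Sketch: mutation via << versus reassignment via += on the same hash.
h = Hash.new([])

h[:foo] << :bar     # mutates the default object; nothing is assigned
p h.key?(:foo)      # false
p h.default         # [:bar]

h[:foo] += [:baz]   # same as h[:foo] = h[:foo] + [:baz]; stores a new array
p h.key?(:foo)      # true
p h[:foo]           # [:bar, :baz]
p h.default         # [:bar], untouched by the reassignment
```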

sawa