42

I thought I understood what the default method does to a hash...

It gives a default value for a key that doesn't exist:

irb(main):001:0> a = {}
=> {}
irb(main):002:0> a.default = 4
=> 4
irb(main):003:0> a[8]
=> 4
irb(main):004:0> a[9] += 1
=> 5
irb(main):005:0> a
=> {9=>5}

All good.

But if I set the default to be an empty list, or an empty hash, I don't understand its behaviour at all...

irb(main):001:0> a = {}
=> {}
irb(main):002:0> a.default = []
=> []
irb(main):003:0> a[8] << 9
=> [9]                          # great!
irb(main):004:0> a
=> {}                           # ?! would have expected {8=>[9]}
irb(main):005:0> a[8]
=> [9]                          # awesome!
irb(main):006:0> a[9]
=> [9]                          # unawesome! shouldn't this be [] ??

I was hoping/expecting the same behaviour as if I had used the ||= operator...

irb(main):001:0> a = {}
=> {}
irb(main):002:0> a[8] ||= []
=> []
irb(main):003:0> a[8] << 9
=> [9]
irb(main):004:0> a
=> {8=>[9]}
irb(main):005:0> a[9]
=> nil

Can anyone explain what is going on?

Martin Konecny
mat kelcey

7 Answers

56

This is a very useful idiom:

(myhash[key] ||= []) << value

It can even be nested:

((myhash[key1] ||= {})[key2] ||= []) << value

The other way is to do:

myhash = Hash.new {|hash,key| hash[key] = []}

But this has the significant side-effect that asking about a key will create it, which renders has_key? fairly useless, so I avoid this method.
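
A minimal sketch of the `||=` idiom in use, assuming made-up keys and values (`:fruit`, `:veg`, `:root` are illustrative only). It also shows that reading a missing key in a plain hash stores nothing:

```ruby
myhash = {}

# Append, creating the array only on the first write to the key.
(myhash[:fruit] ||= []) << "apple"
(myhash[:fruit] ||= []) << "pear"

# Nested form: an inner hash per outer key, an array per inner key.
((myhash[:veg] ||= {})[:root] ||= []) << "carrot"

# Reading a missing key returns nil and does not create an entry.
missing = myhash[:missing]

p myhash
p missing
p myhash.has_key?(:missing)
```

Because the hash has no default block, only the explicit `||=` writes ever add keys.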

glenn mcdonald
  • 5
    I don't think the side-effect of the last technique is present in ruby 1.9.2. myhash = Hash.new {|hash,key| hash[key] = []}; myhash.has_key?(:test) #=> false – Chris Lowis Jan 04 '12 at 15:24
  • 3
    Oh, no, I meant that `puts myhash[:test]` or the like, which seems like it should be harmless, will now result in `myhash.has_key?(:test)` being true afterwards. – glenn mcdonald Jan 16 '12 at 03:47
51

Hash#default= sets the value returned when you query a key that doesn't exist. An entry in the collection is not created for you just because you queried it.

Also, the value you set default to is a single object instance (an Array in your case). That same instance is returned for every missing key, so when you manipulate what is returned, you are manipulating the default itself.

a = {}
a.default = []     # the default is now this one Array instance
a[8] << 9          # a[8] doesn't exist, so the default Array is returned, and 9 is appended to it
a.default          # => [9]
a[9]               # => [9]; a[9] doesn't exist either, so the same default is returned
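
To make the "same instance" point concrete, here is a small sketch using `equal?` (object identity). It also suggests why the Integer case in the question appeared to work: `a[9] += 1` is an assignment, whereas `<<` only mutates the object that was returned. The variable names are illustrative.

```ruby
a = {}
a.default = []

a[8] << 9                               # mutates the shared default; stores nothing
same_object = a[8].equal?(a.default)    # true: every missing key returns this one Array
stored_after_append = a.key?(8)         # false: << never assigned into the hash

# += expands to a[8] = a[8] + [10], which builds a new Array and assigns it.
a[8] += [10]

p same_object
p stored_after_append
p a
p a.default
```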
Aaron Hinni
  • 1
    I'd like to point out that this behavior is different from python's defaultdict, where the analogous code works just fine. – Stumpy Joe Pete Oct 10 '12 at 07:57
  • 2
    Wow, this is horrible, and terribly prone to bugs by dynamically changing this default value. I just spent 3 hours trying to figure out wtf was going on when attempting to set default to []. IMNSHO, default should not be exposed like this. The alternative (h=Hash.new{|h,k| h[k]=[]}) doesn't really do what I want either because that always creates a new array by just referencing the hash key. Now I'm back to doing what I was trying to avoid in the first place: h[k]=[] unless h.has_key?(k) :) – Steeve McCauley Nov 15 '13 at 17:33
  • 2
    I hope you guys didn't down-vote this answer just for that peculiarity of Ruby itself :) What's wrong with this excellently explanatory answer? I myself hadn't completely gotten my head around the concept until I saw the emphasis on "returned" and the explanation on the importance of the value of `default` being an object (so that it's kept around in subsequent non-existent key calls). – Halil Özgür Mar 18 '14 at 13:49
34

I think this is the behavior you are looking for. This will automatically initialize any new keys in the Hash to an array:

irb(main):001:0> h = Hash.new{|h, k| h[k] = []}
=> {}
irb(main):002:0> h[1] << "ABC"
=> ["ABC"]
irb(main):003:0> h[3]
=> []
irb(main):004:0> h
=> {1=>["ABC"], 3=>[]}
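
As a rough usage sketch of the block form, here is a typical grouping task. The word list and the name `by_length` are made up for illustration:

```ruby
words = %w[apple bob cat delta]

# Each missing key gets its own fresh Array, stored in the hash on first access.
by_length = Hash.new { |hash, key| hash[key] = [] }

words.each { |word| by_length[word.length] << word }

p by_length
```

Since the block runs once per missing key, each key ends up with its own array rather than a shared one.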
Turp
9

glenn mcdonald says:

"The other way is to do:

myhash = Hash.new {|hash,key| hash[key] = []}

But this has the significant side-effect that asking about a key will create it, which renders has_key? fairly useless, so I avoid this method."

That does not in fact seem to be true.

irb(main):004:0> a = Hash.new {|hash,key| hash[key] = []}
=> {}
irb(main):005:0> a.has_key?(:key)
=> false
irb(main):006:0> a[:key]
=> []
irb(main):007:0> a.has_key?(:key)
=> true

Accessing the key will create it, as I would expect. Merely asking has_key? does not.
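
If you want to read a key without creating it, one option (a sketch, not the only approach) is `Hash#fetch` with an explicit fallback, which does not invoke the default block:

```ruby
a = Hash.new { |hash, key| hash[key] = [] }

peeked = a.fetch(:key, [])        # explicit fallback; the default block is not called
created_by_fetch = a.key?(:key)   # still false

a[:key] << 1                      # [] runs the default block and stores the array
created_by_index = a.key?(:key)   # now true

p peeked
p created_by_fetch
p created_by_index
p a
```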

jrochkind
9

If you really wanna have an endlessly deep hash:

endless = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
endless["deep"]["in"]["here"] = "hello"

Of course, as Glenn points out above, if you do this, has_key? loses much of its meaning, since simply reading a key with [] creates it. Thx to jbarnette for this one.
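
A short sketch of how that plays out in practice; the key `"other"` is just an illustrative name:

```ruby
endless = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
endless["deep"]["in"]["here"] = "hello"

before = endless.has_key?("other")   # false: nothing has touched this key yet
peek   = endless["other"]            # reading with [] creates an empty nested hash
after  = endless.has_key?("other")   # true: the read above stored the key

p before
p peek
p after
p endless["deep"]["in"]["here"]
```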

migbar
6
irb(main):002:0> a.default = []
=> []
irb(main):003:0> a[8] << 9
=> [9]                          # great!

With this statement, you have modified the default itself; you have not created a new array and added 9 to it. At this point, it's the same as if you had done this instead:

irb(main):002:0> a.default = [9]
=> [9]

Hence it's no surprise that you now get this:

irb(main):006:0> a[9]
=> [9]                          # unawesome! shouldn't this be [] ??

Furthermore, the '<<' added the '9' to the array; it did not add it to the hash, which explains this:

irb(main):004:0> a
=> {}                           # ?! would have expected {8=>[9]}

Instead of using .default, what you probably want to do in your program is something like this:

# Time to add a new entry to the hash table; this might be
# the first entry for this key.
myhash[key] ||= []
myhash[key] << value
Simon Howard
-4

I'm not sure if this is what you want, but you can do this to always return an empty array when a missing hash key is queried.

h = Hash.new { [] }
h[:missing]
   => []

# But you should never modify the returned array, because it isn't stored anywhere.
# A new, empty array is returned every time.
h[:missing] << 'entry'
h[:missing]
   => []
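
A small sketch of what "isn't stored anywhere" means here, and one way to keep a value with this kind of hash by assigning explicitly. The keys `:missing` and `:kept` are illustrative:

```ruby
h = Hash.new { [] }

first  = h[:missing]
second = h[:missing]
distinct = !first.equal?(second)   # true: the block builds a fresh Array on each read

h[:missing] << "lost"              # mutates a throwaway Array; the hash is unchanged

# += expands to h[:kept] = h[:kept] + ["entry"], so the result is assigned and kept.
h[:kept] += ["entry"]

p distinct
p h
p h[:missing]
```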
Daniel Beardsley