Here is some code:

irb(main):085:0> h = Hash.new([])
=> {}
irb(main):086:0> h['a'] = 'sdfds'
=> "sdfds"
irb(main):087:0> h
=> {"a"=>"sdfds"}
irb(main):088:0> h['b'].push(h['a'] )
=> ["sdfds"]
irb(main):089:0> h
=> {"a"=>"sdfds"}
irb(main):090:0> h['b']
=> ["sdfds"]
irb(main):091:0> h['c']
=> ["sdfds"]
irb(main):092:0> h
=> {"a"=>"sdfds"}

What I was trying to do was make h['b'] act like a regular Array. What happens instead is that h['b'] and h['c'] now both return a new default value. I expected h['b'] to hold this new value, but it seems push doesn't actually store anything under a non-existent key.

And printing h shows only h['a'].

Why is this? Ruby is widely used, but peculiar behaviors like this make me wonder whether it is worth trying; I suppose that varies from person to person.

UPDATE: The right approach, however, doesn't show anything fishy:

irb(main):104:0> h = Hash.new([])
=> {}
irb(main):105:0> h['a'] = [1,2,3,'dsfds']
=> [1, 2, 3, "dsfds"]
irb(main):106:0> h['b'] += h['a']
=> [1, 2, 3, "dsfds"]
irb(main):107:0> h
=> {"a"=>[1, 2, 3, "dsfds"], "b"=>[1, 2, 3, "dsfds"]}

Another unexpected and confusing behavior:

irb(main):093:0> h = [1,2,3]
=> [1, 2, 3]
irb(main):094:0> h.shift(0)
=> []
irb(main):095:0> h
=> [1, 2, 3]
irb(main):096:0> h.unshift(0)
=> [0, 1, 2, 3]
irb(main):097:0> h.shift(10)
=> [0, 1, 2, 3]
irb(main):098:0> h.shift(90)
=> []
irb(main):099:0> h
=> []
irb(main):100:0> h = [1,2,3]
=> [1, 2, 3]
irb(main):101:0> h.shift(100)
=> [1, 2, 3]
irb(main):102:0> h
=> []
irb(main):103:0> h.shift(90)
=> []

I'm not even going to ask a question here if this doesn't surprise you enough, but I would like to read some explanations of such weird behavior. It makes me wonder whether I should use it in a production environment at all.

user2290820

3 Answers

1

new(obj) → new_hash

If this hash is subsequently accessed by a key that doesn’t correspond to a hash entry, the value returned depends on the style of new used to create the hash. In the first form, the access returns nil. If obj is specified, this single object will be used for all default values.

Here you created a hash h whose default value, for every non-existent key, is the same single array.

h = Hash.new([])

Here you actually called Hash#[]=, so the key :a was added to the hash with the value 10.

h[:a] = 10

A key is added to the hash only when Hash#[]= is called, but h[:b] is a call to Hash#[]. So :b is not added as a key; instead the call returns the default array you set with h = Hash.new([]), and it is on that default array that you are calling Array#push.

h[:b].push(11)

So, as explained above, the key :b does not appear in the hash h.

h # => {:a=>10}

h[:b].push(11) leaves the default array with a single element, so it is now [11]. h[:c] therefore returns that same default array, [11].

h[:c] # => [11]
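
The walkthrough above can be checked directly. This is a minimal sketch using only core Hash and Array methods; the method name shared_default_demo is made up for illustration.

```ruby
# Build a hash whose default value is one shared array, then read a missing key.
def shared_default_demo
  h = Hash.new([])
  h[:a] = 10        # Hash#[]= stores the key :a
  h[:b].push(11)    # Hash#[] only reads; push mutates the shared default array
  h
end

h = shared_default_demo
p h.keys                  # [:a]   (:b was never stored)
p h.key?(:b)              # false
p h[:c]                   # [11]   (the mutated default)
p h[:b].equal?(h[:c])     # true   (one and the same object)
p h.default               # [11]
```

Hash#default returns the object that was passed to Hash.new, which makes it easy to see that push changed the default itself rather than any entry in the hash.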

Array#shift(n) actually behaves like array.slice!(0, n). The documentation for Array#slice! says: Deletes the element(s) given by an index (optionally up to length elements) or by a range. Returns the deleted object (or objects), or nil if the index is out of range.

Why h.shift(0) # => [] ?

According to the documentation, shift(n) → new_ary: if you pass an argument, the result is always an array. Here you passed 0, meaning you don't want to remove any elements from h, so you got an empty array.


Look at the code below:

(arup~>~)$ pry --simple-prompt
>> h = [1,2,3]
=> [1, 2, 3]
>> h.shift(10)
=> [1, 2, 3]
>> h
=> []

Now read the documentation again: If a number n is given, returns an array of the first n elements (or less) just like array.slice!(0, n) does. h.shift(10) returns [1, 2, 3], an array of three elements, and also removes those elements from h. That is why the final h is [].
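
The cases from the question can be lined up side by side. This is a small sketch with a hypothetical helper, shift_demo, that returns both what shift(n) took out and what is left behind.

```ruby
# Array#shift(n) removes and returns up to n leading elements.
def shift_demo(n)
  ary = [1, 2, 3]
  removed = ary.shift(n)
  [removed, ary]          # [what came out, what is left]
end

p shift_demo(0)     # [[], [1, 2, 3]]
p shift_demo(2)     # [[1, 2], [3]]
p shift_demo(10)    # [[1, 2, 3], []]

# slice!(0, n) gives the same result on the same input.
other = [1, 2, 3]
p other.slice!(0, 10)   # [1, 2, 3]
p other                 # []

# Without an argument, shift returns a single element (nil when empty).
p [1, 2, 3].shift   # 1
p [].shift          # nil
p [].shift(90)      # []
```

So shift(n) never raises for a large n; it simply takes as many elements as are available, which is why a second call on the emptied array returns [].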


h = Hash.new([])
h['a'] = [1,2,3,'dsfds']
h['b'] += h['a']
h # => {"a"=>[1, 2, 3, "dsfds"], "b"=>[1, 2, 3, "dsfds"]}

Here, h['b'] += h['a'] actually means h['b'] = h['b'] + h['a']. Your hash doesn't have the key 'b', so h['b'] returns the default array []. The line therefore becomes h['b'] = [] + [1,2,3,'dsfds'], and [] + [1,2,3,'dsfds'] is an Array#+ call that returns [1,2,3,'dsfds']. Finally, the assignment is a Hash#[]= call, so the key "b" is added to the hash h with the value [1,2,3,'dsfds']. h['a'] and h['b'] now look the same, but they are not the same array object; they merely contain the same elements. Remember that Array#+ creates a new array object.
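
The difference between equal contents and identical objects can be confirmed with == and equal?. A minimal sketch; plus_equals_demo is an illustrative name, not part of any library.

```ruby
def plus_equals_demo
  h = Hash.new([])
  h['a'] = [1, 2, 3, 'dsfds']
  h['b'] += h['a']    # h['b'] = h['b'] + h['a']; Array#+ builds a new array
  h
end

h = plus_equals_demo
p h.key?('b')              # true   (Hash#[]= stored the key)
p h['a'] == h['b']         # true   (same contents)
p h['a'].equal?(h['b'])    # false  (different objects)
p h.default                # []     (the default array was never mutated)
```

Because Array#+ does not modify its receiver, the shared default stays empty here, unlike in the push example from the question.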

Arup Rakshit
1

According to the documentation for Hash.new:

Returns a new, empty hash. If this hash is subsequently accessed by a key that doesn’t correspond to a hash entry, the value returned depends on the style of new used to create the hash. In the first form, the access returns nil. If obj is specified, this single object will be used for all default values.

This means that an object passed to the constructor is returned as the default value, which is different from being set as the value for a key.

default_value = []
=> []
h = Hash.new(default_value)
=> {}
h['b'].push 'asdf'
=> ["asdf"]
default_value
=> ["asdf"]

When you use the += operator you are actually assigning a new value: h['b'] += h['a'] is the same as h['b'] = h['b'] + h['a'], which in turn is the same as h['b'] = default_value + h['a'].

Regarding the behavior of Array#shift:

If a number n is given, returns an array of the first n elements (or less) just like array.slice!(0, n) does. With ary containing only the remainder elements, not including what was shifted to new_ary. See also #unshift for the opposite effect.

h.shift(100), for example, returns all the elements in the array ([1,2,3]) and leaves the array empty ([]).

Uri Agassi
1

When creating a new hash, there are two ways to define a default value to be returned when a matching element cannot be retrieved -- as a value passed to the new method or through a block (also passed to new) which will be evaluated each time a "hash miss" occurs.

Passing a default value to new works well for simple values such as integers, so this works fine:

h = Hash.new(0)

You get what you would expect. Calling h[:no_such_key_here] returns a 0 instead of a nil.
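
A typical use of an integer default is counting. The sketch below is only an illustration; count_words and its input are made up for the example.

```ruby
# Count occurrences using a hash whose default value is 0.
def count_words(words)
  counts = Hash.new(0)
  words.each { |w| counts[w] += 1 }   # counts[w] = counts[w] + 1
  counts
end

counts = count_words(%w[a b a c a])
p counts['a']           # 3
p counts['zzz']         # 0      (default returned)
p counts.key?('zzz')    # false  (reading did not store the key)
```

This is safe because += assigns a fresh Integer under the key each time; nothing ever mutates the default 0 itself.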

The problem comes when you try to use that syntax on complex objects (like hashes or arrays). Since the default value is stored by reference, it's possible to inadvertently change the default value in unexpected ways. Try this:

h = Hash.new([])
h[:a].__id__ == h[:b].__id__

This, you might be surprised to learn, returns true. Not only is an array returned for missing keys, but THE SAME array is returned, so changing it through h[:a] also changes what you get from h[:b].

Probably not what you want, which brings us to the second way to define a default value -- using a block:

h = Hash.new {|hash, key| hash[key] = [] }
h[:a].__id__ == h[:b].__id__

By using a block, which is evaluated whenever a missing key is referenced, a new, unique array is created and stored for that key.
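
To see the block form behaving like independent arrays, here is a short sketch; block_default_demo is a hypothetical name used only for this example.

```ruby
def block_default_demo
  h = Hash.new { |hash, key| hash[key] = [] }
  h[:a] << 1    # the block runs for :a, stores a fresh array, then 1 is appended
  h[:b] << 2    # a different fresh array is stored under :b
  h
end

h = block_default_demo
p h[:a]                   # [1]
p h[:b]                   # [2]
p h[:a].equal?(h[:b])     # false  (separate objects)
p h.keys                  # [:a, :b]
```

Note that with this block even a plain read of a missing key stores that key, because the block performs the assignment itself.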

A little tricky, but it makes some sense. It's the same sort of problem you run into with shallow copies of these nested objects using dup. But that's another topic!

Nathan Fritz
  • Thanks for the insight. Shallow copying just references the same object, so declaring Hash.new just once and comparing object ids gives the same value, correct? In the block you are just declaring a new dictionary each time, so that's self-explanatory. – user2290820 Feb 01 '14 at 19:20
  • Yup. If you use Hash.new([]) or Hash.new({}) the default value will be the exact same instance of the array or hash object every time (same object id), so changing it anywhere changes them all. Even Hash.new([].dup) doesn't get around this, since it'll dupe it once, then use the copy over and over. Using the block, it creates a new instance for each one as required. – Nathan Fritz Feb 02 '14 at 00:20
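
The three variants mentioned in these comments can be compared with one small check. A sketch only; same_object_for_missing_keys? is a made-up helper name.

```ruby
# True when two different missing keys yield the very same object.
def same_object_for_missing_keys?(hash)
  hash[:x].equal?(hash[:y])
end

p same_object_for_missing_keys?(Hash.new([]))        # true
p same_object_for_missing_keys?(Hash.new([].dup))    # true  (dup ran once, before Hash.new)
p same_object_for_missing_keys?(Hash.new { |hash, key| hash[key] = [] })   # false
```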