
I expected:

h = Hash.new([])
h['a'] << 'b'
h['a'] << 'c'
h # => {}

to give {'a' => ['b','c']}, not an empty hash.

I also found out that the insert operation targets the default value, because after the code above it is equal to ['b','c']:

h.default # => ['b','c']

I am looking for an explanation of why this does not work, and of the proper way to get the result I expected.

gorn
  • Tested, appeared strange, update if you get it in details. – ray Dec 04 '18 at 13:56
  • What details do you need? It is already strange ... – gorn Dec 04 '18 at 14:03
  • clarification for above strange thing :) – ray Dec 04 '18 at 14:05
  • Well, I guess that is what the answers are for ;) – gorn Dec 04 '18 at 14:12
  • The default value is just that: a default return value when the key does not exist. It has nothing to do with key value assignment. @SergioTulentsev has shown you how to "do it so it works" – engineersmnky Dec 04 '18 at 14:15
  • Another way that is commonly done: `h = {}; (h['a'] ||= []) << 'b'; h #=> {'a'=>['b']}; (h['a'] ||= []) << 'c'; h #=> {'a'=>['b', 'c']}`. Initially `h.key?('a') #=> false`, so `(h['a'] ||= []) << 'b' #=> (h['a'] = h['a'] || []) << 'b' #=> (h['a'] = nil || []) << 'b' #=> (h['a'] = []) << 'b' #=> h['a'] << 'b'; h #=> {'a'=>['b']}`. – Cary Swoveland Dec 04 '18 at 20:58
  • What does not work? How to do what? – sawa Dec 05 '18 at 07:27
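
The `||=` idiom mentioned in the comments above can be sketched roughly as follows. This is only an illustration on a plain hash with no default value; the key names are arbitrary.

```ruby
# A plain hash: missing keys return nil, there is no default object.
h = {}

# `h['a'] ||= []` assigns a new array only when h['a'] is nil or false,
# then returns the stored array, so << appends to the array kept in the hash.
(h['a'] ||= []) << 'b'
(h['a'] ||= []) << 'c'

h['a']        # => ["b", "c"]
h.key?('a')   # => true
```

Each missing key gets its own array, at the cost of repeating the `||= []` at every call site.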

3 Answers


The reason why your line didn't work is that Hash, upon accessing a missing key, simply returns the default value (whatever you specified), without assigning it to the key. And since your default value is a complex mutable object (and it's the very same object that is returned every time), you get what you observed: all values are shoveled straight into the default value, bypassing the hash. This is probably the most common mistake with hashes and mutable default values.

To do what you want, use the third form of Hash.new

new {|hash, key| block } → new_hash

like this, for example

h = Hash.new {|h, k| h[k] = [] }
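
To make the difference concrete, here is a minimal sketch comparing the three forms of Hash.new. The variable names (`shared`, `fresh`, `stored`) are chosen here for illustration only.

```ruby
# Form 1: one default object, returned for every missing key.
shared = Hash.new([])
shared['a'] << 'b'     # mutates the default object; nothing is stored
shared.keys            # => []
shared.default         # => ["b"]

# Form 2: a block that builds a value but does not store it.
fresh = Hash.new { [] }
fresh['a'] << 'b'      # appends to a temporary array that is then discarded
fresh.keys             # => []
fresh['a']             # => []

# Form 3: a block that stores the new value under the missing key.
stored = Hash.new { |hash, key| hash[key] = [] }
stored['a'] << 'b'
stored['a'] << 'c'
stored.keys            # => ["a"]
stored['a']            # => ["b", "c"]
```

Only the third form leaves a key behind, because the block itself performs the assignment.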
sawa
Sergio Tulentsev
  • but we passed `[]` as an argument for new not as a block code. Can you elaborate more? – ray Dec 04 '18 at 14:07
  • @ray: yeah, that form of setting the default value doesn't work as "expected" with mutable objects (an array here) – Sergio Tulentsev Dec 04 '18 at 14:10
  • @SergioTulentsev I have elaborated the answer so it is more useful for other people; please make corrections if you feel so inclined. – gorn Dec 04 '18 at 14:28
  • The [docs](http://ruby-doc.org/core-2.5.3/Hash.html#method-c-new) explicitly state: _"It is the block's responsibility to store the value in the hash if required."_ – Stefan Dec 04 '18 at 14:36
  • @Stefan you are right, although it seems to be more connected to the "block" style. – gorn Dec 04 '18 at 15:38
  • @gorn the documentation for `Hash.new` gets more specific: with a positional argument, i.e. `Hash.new([])` you get a single default object. With a block argument, i.e. `Hash.new { [] }` you do get multiple copies, but they are not persisted. Only with `Hash.new { |h, k| h[k] = [] }` you get one copy per key. So yes, it is connected to the block style because a block is required to get this mechanism working. Without a block, the default value is simply returned and no keys are being set. – Stefan Dec 05 '18 at 09:27
  • @SergioTulentsev It was tedious but I got it, thanks :) – ray Dec 05 '18 at 11:12

It's because you modify this specific object you passed as a default value. So:

h = Hash.new([])
h['a'] << 'b'
h['a'] << 'c'
h['b'] # or h['a'] or h[:virtually_anything]
# => ["b", "c"]
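
The sharing can be checked directly with object identity. The following is a minimal sketch; the name `default_list` is introduced here only so the default object can be referred to afterwards.

```ruby
default_list = []
h = Hash.new(default_list)

h['a'] << 'b'                   # appends to default_list itself

h['a'].equal?(default_list)     # => true, the very same object
h['a'].equal?(h['b'])           # => true, every missing key returns it
h.key?('a')                     # => false, nothing was ever assigned
h.fetch('a', :missing)          # => :missing, fetch ignores the hash default
default_list                    # => ["b"]
```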
Marek Lipka
  • I thought that `array << x` is something like `array = array + [x]` – gorn Dec 04 '18 at 13:59
  • @gorn: no, it's not. – Sergio Tulentsev Dec 04 '18 at 14:00
  • Well, obviously it is not, but according to the documentation << "Pushes the given object on to the end of this array." So how does it differ from `array = array + [x]` (which also places x as the last element of the array)? – gorn Dec 04 '18 at 14:09
  • @gorn: "which also places x as last element of array)" - not quite. It creates a __new__ array, rather than modifying the existing one in-place. – Sergio Tulentsev Dec 04 '18 at 14:14

It's because h has no key 'a'. You need to assign the key first; otherwise you are only modifying the default value:

h = Hash.new([])
h['a'] = ['b']
h['a'] << 'c'

h['a'] #=> ["b", "c"]
h #=> {"a"=>["b", "c"]}

This behaves the same:

k = Hash.new
k.default = []

Whereas, as explained by Sergio Tulentsev (https://stackoverflow.com/a/53614695/5239030), the block form creates the key "on the fly". Try this:

k = Hash.new {|hash, key| puts "Just created a new key: #{key}"; hash[key] = [] }
p k['a'] << 'a'
p k['a'] << 'a'
p k['b'] << 'b'
p k
iGian
  • Should not `array << 'b'` be identical to `array= ['b']` on empty arrays? – gorn Dec 04 '18 at 13:57
  • I.e. should not `h['a'] << 'b'` behave like `h['a'] = ['b']` if `h['a']` is empty array? – gorn Dec 04 '18 at 13:58
  • No, there's no reason for it to behave this way. It's all about which object you are currently modifying. When you're doing `h['a'] = ['b']`, you're creating a new `Array` object and assigning it to the `'a'` key in the `h` `Hash`. – Marek Lipka Dec 04 '18 at 13:59
  • If you look at http://tpcg.io/UBS8FQ then you see the reason. For an empty array it behaves identically. – gorn Dec 04 '18 at 14:02
  • You say "It's because h has no key 'a', you need to initialize it" but is not the default value supposed to do that automagically? – gorn Dec 04 '18 at 14:05
  • @gorn: "but is not the default value supposed to do that automagically?" - one could argue that this would be a sensible approach. Alas, the wording in the Hash documentation is pretty unambiguous: "Retrieves the value object corresponding to the key object. If not found, returns the default value" – Sergio Tulentsev Dec 04 '18 at 14:09
  • If `h['a'] += ['b']` did not work either, then I would not complain, but to me `array << 'b'` and `array += ['b']` look identical – gorn Dec 04 '18 at 14:11
  • @gorn: well, that's your misunderstanding of how these two work. The latter _reassigns_ array. That's the key difference. – Sergio Tulentsev Dec 04 '18 at 14:12
  • @gorn JFYI, you should `@mention` people to whom you're responding. Or they may never see your comments. – Sergio Tulentsev Dec 04 '18 at 14:16
  • @gorn, Sergio Tulentsev already gave a great explanation, I just added an edit to my post based on it. – iGian Dec 04 '18 at 14:33
  • @gorn `h['a']` and `h['a']=` are two different methods. The [former](http://ruby-doc.org/core-2.5.3/Hash.html#method-i-5B-5D) returns the value for key `'a'` (which might be the default value) whereas the [latter](http://ruby-doc.org/core-2.5.3/Hash.html#method-i-5B-5D-3D) assigns a new value to key `'a'`. If you call `h['a'] += ['b']` it is equivalent to `h['a'] = h['a'] + ['b']`, i.e. you retrieve the default value for `'a'` (empty array), concatenate `['b']` and assign the result to the key `'a'` (which then exists). – Stefan Dec 04 '18 at 15:03
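
The difference between `<<` and `+=` discussed in the comments above can be sketched as follows. This is only an illustration; the names `original`, `appended` and `pushed` are introduced here for the example.

```ruby
h = Hash.new([])

# h['a'] += ['b'] expands to h['a'] = h['a'] + ['b']:
# read the default, build a NEW array, then assign it to the key.
h['a'] += ['b']
h.key?('a')      # => true
h['a']           # => ["b"]
h.default        # => [], the default object was not touched

# The key exists now, so << mutates the stored array, not the default.
h['a'] << 'c'
h['a']           # => ["b", "c"]
h.default        # => []

# The same distinction on bare arrays:
original = [1]
appended = original + [2]    # Array#+ returns a new array
appended.equal?(original)    # => false
pushed = original << 2       # Array#<< modifies the receiver and returns it
pushed.equal?(original)      # => true
```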