
Note: There were a few similar questions on SO about this, like here and here, but none seem quite like what I'm looking for.

Say I have an array of hashes like this:

arr_with_dup_hsh_keys = [
  { foo: "dup", bar: 1 },
  { foo: "dup", bar: 2 },
  { foo: "dup", bar: 3 },
  { foo: "dup", bar: 4 },
  { foo: "dup", bar: 5 }
]

How do I reduce that down to this?

{ foo: "dup", bars: [1, 2, 3, 4, 5] }

And what if there are different values for foo?

arr_with_dup_hsh_keys = [
  { foo: "dup",  bar: 1 },
  { foo: "dup",  bar: 2 },
  { foo: "soup", bar: 3 },
  { foo: "dup",  bar: 4 },
  { foo: "soup", bar: 5 }
]
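
Presumably the result there should keep a separate list of `bar`s for each distinct `foo`, something like:

[
  { foo: "dup",  bars: [1, 2, 4] },
  { foo: "soup", bars: [3, 5] }
]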
binarymason

3 Answers

def combine(arr)
  arr.group_by { |g| g[:foo] }.
      map { |_, a| { foo: a.first[:foo], bar: a.map { |g| g[:bar] } } }
end

combine arr_with_dup_hsh_keys
  #=> [{:foo=>"dup", :bar=>[1, 2, 3, 4, 5]}]

arr_with_dup_hsh_keys1 = [
  { foo: "dup",  bar: 1 },
  { foo: "dup",  bar: 2 },
  { foo: "soup", bar: 3 },
  { foo: "dup",  bar: 4 },
  { foo: "soup", bar: 5 }
]

combine arr_with_dup_hsh_keys1
  #=> [{:foo=>"dup", :bar=>[1, 2, 4]}, {:foo=>"soup", :bar=>[3, 5]}] 

See Enumerable#group_by and note that

arr_with_dup_hsh_keys1.group_by { |g| g[:foo] }
 #=> {"dup"=> [{:foo=>"dup", :bar=>1}, {:foo=>"dup", :bar=>2},
 #             {:foo=>"dup", :bar=>4}],
 #    "soup"=>[{:foo=>"soup", :bar=>3}, {:foo=>"soup", :bar=>5}]}

You could alternatively write the following.

def combine(arr)
  arr.each_with_object({}) do |g, h|
    f = g.merge(bar: [g[:bar]])
    h.update(f[:foo] => f) { |_, o, n| { foo: o[:foo], bar: o[:bar] + n[:bar] } }
  end.values
end

combine arr_with_dup_hsh_keys1
  #=> [{:foo=>"dup", :bar=>[1, 2, 4]}, {:foo=>"soup", :bar=>[3, 5]}] 

This uses the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for an explanation of the three block variables (the first being the common key, which I've represented with an underscore to signify that it's not used in the block calculation).
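
For example, here is that block form in isolation (a minimal sketch with made-up hashes):

h1 = { a: [1] }
h2 = { a: [2], b: [3] }
h1.update(h2) { |_k, old, new| old + new }
  #=> {:a=>[1, 2], :b=>[3]}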

Cary Swoveland
  • this is exactly what I needed. You must have sensed that I also needed to differentiate when `foo` could be `dup` or `soup`. Thank you! – binarymason Sep 08 '16 at 19:39
  • I initially started out using `group_by` but was having trouble because that created one more array level. How would you continue the method using `group_by`? (if you don't mind) – binarymason Sep 08 '16 at 19:55
  • Actually, I just got it using that method by applying Jordan's answer. That is also another option. Thank you again! – binarymason Sep 08 '16 at 19:59
  • As you see, I've shown how both methods can be used, with the one using `group_by` (which you also came up with) being preferred. – Cary Swoveland Sep 08 '16 at 20:14

If your data is really as simple as in your question, this will do what you want:

{ foo: "dup",
  bars: arr_with_dup_hsh_keys.map {|hsh| hsh[:bar] }
}
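
With the first array from the question, that should return something like:

  #=> {:foo=>"dup", :bars=>[1, 2, 3, 4, 5]}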
Jordan Running
  • Data is really not that simple. I may have hundreds of `foo`s that have a range of values and many of them are duplicates like the example above. – binarymason Sep 08 '16 at 19:35

This is what I came up with:

a = [
  { foo: "dup", bar: 1 },
  { foo: "dup", bar: 2 },
  { foo: "dup", bar: 3 },
  { foo: "dup", bar: 4 },
  { foo: "dup", bar: 5 }
]

h = {}
a.map(&:keys).uniq.flatten.each_with_index do |key, idx|
  h[key] = a.map(&:values).collect { |vals| vals[idx] }.uniq
end
end
h
#=> {:foo=>["dup"], :bar=>[1, 2, 3, 4, 5]}
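
Note that this gathers each key's values across all of the hashes, so with the mixed-`foo` array from the question it would presumably give:

  #=> {:foo=>["dup", "soup"], :bar=>[1, 2, 3, 4, 5]}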
Ismail Moghul