Create a hash out of an array where the values are the indices of the elements

Question

I have an array and I want to create a hash whose keys are the elements of the array and whose values are (an array of) the indices of the array. I want to get something like:

array = [1,3,4,5]
... # => {1=>0, 3=>1, 4=>2, 5=>3}

array = [1,3,4,5,6,6,6]
... # => {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4,5,6]}

This code:

hash = Hash.new 0
array.each_with_index do |x, y|
  hash[x] = y
end

works fine only if I don't have duplicate elements. When I have duplicate elements, it does not.

Any idea on how I can get something like this?

The return value `{1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4,5,6]}` would be easier to compute and I think you'd also find it more convenient for further processing. Also, please don't be vague about what you want. For example, rather than, "something like", write, "the following". — Cary Swoveland, May 28 '18 at 20:17
Try `array.each_with_index.inject(Hash.new([])) { |acc, (x, i)| acc.merge({x => acc[x] + [i]}) }`, it may be a better fit, as suggested by Cary. — Gabriel, May 28 '18 at 20:32
Note that there are some [caveats to passing `[]` to `Hash.new`](https://stackoverflow.com/q/2698460/211563) (though they’re avoided in @Gabriel’s particular snippet). — Andrew Marshall, May 28 '18 at 21:45
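To illustrate the caveat mentioned in the last comment, here is a minimal sketch (the variable names `shared` and `positions` are only illustrative). It shows that `Hash.new([])` hands out one shared array for every missing key, while the block form of `Hash.new` stores a fresh array per key.

```ruby
# Pitfall: Hash.new([]) returns the SAME array object for every missing key.
shared = Hash.new([])
shared[:a] << 0      # mutates the shared default; no key is stored
shared[:b] << 1
shared.keys          # => []
shared[:anything]    # => [0, 1]

# The block form creates and stores a new array the first time a key is read.
array = [1, 3, 4, 5, 6, 6, 6]
positions = Hash.new { |hash, key| hash[key] = [] }
array.each_with_index { |x, i| positions[x] << i }
positions            # => {1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4, 5, 6]}
```

Gabriel's snippet avoids the problem because `acc[x] + [i]` builds a new array instead of mutating the default.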

max pleaner · Answer 1 · 2018-05-29T07:08:55.227


You can change the logic to special-case the situation when the key already exists, turning it into an array and pushing the new index:

arr = %i{a a b a c}

result = arr.each.with_object({}).with_index do |(elem, memo), idx|
  memo[elem] = memo.key?(elem) ? [*memo[elem], idx] : idx
end

puts result
# => {:a=>[0, 1, 3], :b=>2, :c=>4}

It's worth mentioning, though, that whatever you're trying to do here could possibly be accomplished in a different way ... we have no context. In general, it's a good idea to keep key-val data types uniform, e.g. the fact that values here can be numbers or arrays is a bit of a code smell.

Also note that it doesn't make sense to use Hash.new(0) here unless you're intentionally setting a default value (which there's no reason to do). Use {} instead.

edited May 29 '18 at 07:08

answered May 28 '18 at 19:28


Thanks for your answer! For context, I'm just doing some problems; this one is to find the indices of elements in an array that sum to a given number. – Rubee May 28 '18 at 20:14
@Strobes hmm well it's like Cary said in a comment .. it's better to make all values arrays – max pleaner May 28 '18 at 20:26
There’s nothing unidiomatic about `Hash.new(0)`, generally. It’s just that the O.P. made no use of the default value in their code, which makes it wholly unnecessary in that context. – Andrew Marshall May 28 '18 at 21:37
@AndrewMarshall ah ok, I had a brain fart and forgot what the argument did. Edited answer. – max pleaner May 28 '18 at 22:03
Small quibble to the comments - I don't think the use of Hash.new(0) is "unnecessary" here - it's explicitly incorrect. You could do this with Hash.new(0) in place of {} and you'd get the same "answer" when you inspect the hash. Where it suddenly becomes wrong is when you ask for result[:d], which should be giving you the location of the symbol :d in the array. The hash created with a default of 0 will dutifully report that result[:d] is 0, which is almost certainly not what the OP is looking for. – Marc Talbot May 29 '18 at 01:00
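The point made in the last comment can be checked with a small sketch. The lambda `index_into` is a hypothetical helper that only wraps the logic from the answer so it can be run against both kinds of hash.

```ruby
arr = %i[a a b a c]

# Hypothetical helper: fills the given hash using the answer's logic.
index_into = lambda do |memo|
  arr.each_with_index do |elem, idx|
    memo[elem] = memo.key?(elem) ? [*memo[elem], idx] : idx
  end
  memo
end

plain     = index_into.call({})
defaulted = index_into.call(Hash.new(0))

plain == defaulted   # => true; Hash#== compares entries, not default values
plain[:d]            # => nil
defaulted[:d]        # => 0, which is indistinguishable from a real index
```

With the plain hash, a missing element is reported as `nil`, so a caller can tell "not in the array" apart from "at index 0".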

iGian · Accepted Answer · 2018-05-29T05:39:53.143

I'm adding my two cents:

array = [1,3,4,5,6,6,6,8,8,8,9,7,7,7]

hash = {}
array.each_with_index.group_by(&:first).each do |k, v|
  # a single occurrence stores the bare index; duplicates store all indices
  hash[k] = v.size == 1 ? v[0][1] : v.map(&:last)
end

p hash #=> {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4, 5, 6], 8=>[7, 8, 9], 9=>10, 7=>[11, 12, 13]}

Because group_by collects every occurrence of a value, the duplicated elements do not need to be adjacent.
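As a quick check of how `group_by` treats duplicates that are not adjacent, the following sketch runs the same approach on an array where the repeated value is scattered. The sample array is made up for illustration.

```ruby
array = [6, 1, 6, 3, 6]

# group_by gathers all [value, index] pairs that share the same value,
# regardless of where they sit in the array.
grouped = array.each_with_index.group_by(&:first)
# => {6=>[[6, 0], [6, 2], [6, 4]], 1=>[[1, 1]], 3=>[[3, 3]]}

hash = grouped.transform_values do |pairs|
  pairs.size == 1 ? pairs[0][1] : pairs.map(&:last)
end
# => {6=>[0, 2, 4], 1=>1, 3=>3}
```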

This is the expanded version, step by step, to show how it works.

The basic idea is to build a temporary array with pairs of value and index, then work on it.

array = [1,3,4,5,6,6,6]

tmp_array = []
array.each_with_index do |val, idx|
  tmp_array << [val, idx]
end
p tmp_array #=> [[1, 0], [3, 1], [4, 2], [5, 3], [6, 4], [6, 5], [6, 6]]

tmp_hash = tmp_array.group_by { |e| e[0] }
p tmp_hash #=> {1=>[[1, 0]], 3=>[[3, 1]], 4=>[[4, 2]], 5=>[[5, 3]], 6=>[[6, 4], [6, 5], [6, 6]]}

hash = {}
tmp_hash.each do |k, v|
  # v is a list of [value, index] pairs, so the index is the second entry
  hash[k] = v[0][1] if v.size == 1
  hash[k] = v.map {|e| e[1]} if v.size > 1
end

p hash #=> {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4, 5, 6]}

It can be written as one line as:

hash = {}
array.each_with_index.group_by(&:first).each { |k, v| hash[k] = v.size == 1 ? v[0][1] : v.map(&:last) }
p hash

@AndrewMarshall, yes. Why? Actually it is not needed. I updated. — iGian, May 29 '18 at 01:31

Cary Swoveland · Answer 3 · 2018-05-29T02:15:09.147

If you are prepared to accept

{ 1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4,5,6] }

as the return value you may write the following.

array.each_with_index.group_by(&:first).transform_values { |v| v.map(&:last) }
  #=> {1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4, 5, 6]}

The first step in this calculation is the following.

array.each_with_index.group_by(&:first)
  #=> {1=>[[1, 0]], 3=>[[3, 1]], 4=>[[4, 2]], 5=>[[5, 3]], 6=>[[6, 4], [6, 5], [6, 6]]}

This may help readers to follow the subsequent calculations.
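The second step can be sketched in the same way. The intermediate names (`grouped`, `indices`, `indices_alt`) are only illustrative. `transform_values` needs Ruby 2.4 or later, so an equivalent with `map` and `to_h` is included for older versions.

```ruby
array = [1, 3, 4, 5, 6, 6, 6]

grouped = array.each_with_index.group_by(&:first)

# Each value is a list of [element, index] pairs; keep only the indices.
grouped[6]                # => [[6, 4], [6, 5], [6, 6]]
grouped[6].map(&:last)    # => [4, 5, 6]

indices = grouped.transform_values { |pairs| pairs.map(&:last) }
# => {1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4, 5, 6]}

# Equivalent for Ruby versions without Hash#transform_values.
indices_alt = grouped.map { |k, pairs| [k, pairs.map(&:last)] }.to_h
```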

I think you will find this return value generally more convenient to use than the one given in the question.
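If the mixed format from the question is still required, one possible sketch is to build the all-array hash first and collapse single-element arrays afterwards. The names `indices` and `mixed` are illustrative.

```ruby
array = [1, 3, 4, 5, 6, 6, 6]

indices = array.each_with_index
               .group_by(&:first)
               .transform_values { |pairs| pairs.map(&:last) }

# Unwrap arrays that hold exactly one index; leave the others untouched.
mixed = indices.transform_values { |v| v.size == 1 ? v.first : v }
# => {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4, 5, 6]}
```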

Here are a couple of examples where it's clearly preferable for all values to be arrays. Let:

h_orig = { 1=>0,   3=>1,   4=>2,   5=>3,   6=>[4,5,6] }
h_mod  = { 1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4,5,6] }

Create a hash h whose keys are unique elements of array and whose values are the numbers of times the key appears in the array.

h_mod.transform_values(&:count)
  #=> {1=>1, 3=>1, 4=>1, 5=>1, 6=>3}
h_orig.transform_values { |v| v.is_a?(Array) ? v.count : 1 }

Create a hash h whose keys are unique elements of array and whose values equal the index of the first instance of the element in the array.

h_mod.transform_values(&:min)
  #=> {1=>0, 3=>1, 4=>2, 5=>3, 6=>4}
h_orig.transform_values { |v| v.is_a?(Array) ? v.min : v }

In these examples, given h_orig, we could alternatively convert values that are indices to arrays containing a single index.

h_orig.transform_values { |v| [*v].count }
h_orig.transform_values { |v| [*v].min }
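For readers unfamiliar with the splat used above, this short sketch shows what `[*v]` does to an integer and to an array. `Array(v)` is a standard alternative with the same result for these inputs.

```ruby
[*3]             # => [3]          a bare index is wrapped in an array
[*[4, 5, 6]]     # => [4, 5, 6]    an array is left as it is
Array(3)         # => [3]
Array([4, 5, 6]) # => [4, 5, 6]
[*nil]           # => []           note: nil becomes an empty array

h_orig = { 1=>0, 3=>1, 4=>2, 5=>3, 6=>[4,5,6] }
normalized = h_orig.transform_values { |v| [*v] }
# => {1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4, 5, 6]}
```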

This is hardly proof that it is generally more convenient for all values to be arrays, but that has been my experience and the experience of many others.
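As one more illustration, the asker mentioned in a comment that the goal is to find indices of elements that sum to a given number. The sketch below is one possible way to do that with the all-array hash; the method name `index_pairs_summing_to` and the target value are assumptions made for this example, not something from the question.

```ruby
# Hypothetical helper: returns every pair of indices [i, j] with i < j
# whose elements add up to target.
def index_pairs_summing_to(array, target)
  positions = array.each_with_index
                   .group_by(&:first)
                   .transform_values { |pairs| pairs.map(&:last) }

  array.each_with_index.flat_map do |x, i|
    # fetch with a default of [] handles values that never occur
    positions.fetch(target - x, []).select { |j| i < j }.map { |j| [i, j] }
  end
end

pairs = index_pairs_summing_to([1, 3, 4, 5, 6, 6, 6], 9)
# => [[1, 4], [1, 5], [1, 6], [2, 3]]
```

Because every value is an array, the lookup needs no `is_a?(Array)` check.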

Thanks for the detailed answer – Rubee May 29 '18 at 19:39
