4

How do we get intersections and unions in Ruby for sets that repeat elements.

# given the sets
a = ["A", "B", "B", "C", "D", "D"]
b = ["B", "C", "D", "D", "D", "E"]

# A union function that adds repetitions
union(a, b)
=> ["A", "B", "B", "C", "D", "D", "D", "E"]

# An intersection function that adds repetitions
intersection(a, b)
=> ["B", "C", "D", "D"]

The &, and | operators seem to ignore repetitions and duplicates, as written in the documentation.

# union without duplicates
a | b
=> ["A", "B", "C", "D", "E"]

# intersections without duplicates
a & b
=> ["B", "C", "D"]
Stefan
  • 109,145
  • 14
  • 143
  • 218
Amin Shah Gilani
  • 8,675
  • 5
  • 37
  • 79
  • 3
    "...seem to ignore repetitions." Per the Array documentation, [`&`](http://ruby-doc.org/core-2.3.1/Array.html#method-i-26) and [`|`](http://ruby-doc.org/core-2.3.1/Array.html#method-i-7C) do ignore duplicates. – the Tin Man Jun 13 '16 at 23:28
  • In mathematics, a set is a collection of distinct objects. – steenslag Jun 13 '16 at 23:56
  • I presume you want to know how to implement your methods `union` and `intersection`, in which you need to say so. They have nothing to do with `Array#|` or `Array#&`, so it doesn't add anything to refer to them. Examples are fine, but they do not explain the precise behaviour you want your methods to exhibit. We could guess, but might be wrong. You need to describe those behaviours in words; examples are secondary. – Cary Swoveland Jun 14 '16 at 01:20
  • @CarySwoveland better? – Amin Shah Gilani Jun 14 '16 at 01:52
  • You still need the words. My reading, for `union`, is "given arrays `a` and `b`, return an array `c` such that for each distinct element `e` that appears in either `a` or `b` (i.e., in `a|b`), `e` appears `n` times in `c`, where `n` is the maximum of the numbers of times `e` appears in `a` and the number of times `e` appears in `b`". "Changing `maximum` to `minimum` describes the method `intersection`". – Cary Swoveland Jun 14 '16 at 02:46
  • 1
    @CarySwoveland I never thought to look at it that way. I thought the definition of union was well known, but describing it in that way, actually makes it easier to implement, thanks! – Amin Shah Gilani Jun 14 '16 at 10:38
  • That's a good point: once you've expressed the problem precisely, obtaining a solution is often no more than translating your words to code. – Cary Swoveland Jun 14 '16 at 12:31
  • BTW, a set with repeating elements is called a [multiset](https://en.wikipedia.org/wiki/Multiset). – Stefan Jun 15 '16 at 10:13

2 Answers2

3
def union(a,b)
  (a|b).flat_map { |s| [s]*[a.count(s), b.count(s)].max }
end

union(a,b)
  # => ["A", "B", "B", "C", "D", "D", "D", "E"] 

def intersection(a,b)
  (a|b).flat_map { |s| [s]*[a.count(s), b.count(s)].min }
end

intersection(a,b)
  #=> ["B", "C", "D", "D"]
Cary Swoveland
  • 106,649
  • 6
  • 63
  • 100
1

Building upon Cary Swoveland's answer, you could create a temporary hash to count the number of occurrences of each member in each array: (I've generalized the number of arguments)

def multiplicities(*arrays)
  m = Hash.new { |h, k| h[k] = Array.new(arrays.size, 0) }
  arrays.each_with_index { |ary, idx| ary.each { |x| m[x][idx] += 1 } }
  m
end

multiplicities(a, b)
#=> {"A"=>[1, 0], "B"=>[2, 1], "C"=>[1, 1], "D"=>[2, 3], "E"=>[0, 1]}

Implementing union and intersection is straight forward:

def union(*arrays)
  multiplicities(*arrays).flat_map { |x, m| Array.new(m.max, x) }
end

def intersection(*arrays)
  multiplicities(*arrays).flat_map { |x, m| Array.new(m.min, x) }
end

union(a, b)        #=> ["A", "B", "B", "C", "D", "D", "D", "E"]
intersection(a, b) #=> ["B", "C", "D", "D"]

With this approach each array has to be traversed only once.

Community
  • 1
  • 1
Stefan
  • 109,145
  • 14
  • 143
  • 218