If I understand the question correctly, you could do the following.
Code
def count_shared(arr1, arr2)
  arr1.group_by(&:itself).
    merge(arr2.group_by(&:itself)) { |_,ov,nv| [ov.size, nv.size].min }.
    values.
    reduce(0) { |t,o| (o.is_a? Array) ? t : t + o }
end
Examples
arr1 = ["B","A","A","A","B"]
arr2 = ["A","B","A","B","B"]
count_shared(arr1, arr2)
#=> 4 (2 A's + 2 B's)
arr1 = ["B", "A", "C", "C", "A", "A", "B", "D", "E", "A"]
arr2 = ["C", "D", "F", "F", "A", "B", "A", "B", "B", "G"]
count_shared(arr1, arr2)
#=> 6 (2 A's + 2 B's + 1 C + 1 D + 0 E's + 0 F's + 0 G's)
Explanation
The steps are as follows for a slightly modified version of the first example.
arr1 = ["B","A","A","A","B","C","C"]
arr2 = ["A","B","A","B","B","D"]
First apply Enumerable#group_by to both arr1 and arr2:
h0 = arr1.group_by(&:itself)
#=> {"B"=>["B", "B"], "A"=>["A", "A", "A"], "C"=>["C", "C"]}
h1 = arr2.group_by(&:itself)
#=> {"A"=>["A", "A"], "B"=>["B", "B", "B"], "D"=>["D"]}
Prior to Ruby v.2.2, when Object#itself was introduced, you would have to write:
arr.group_by { |e| e }
Continuing,
h2 = h0.merge(h1) { |_,ov,nv| [ov.size, nv.size].min }
#=> {"B"=>2, "A"=>2, "C"=>["C", "C"], "D"=>["D"]}
I will return shortly to explain the above calculation.
a = h2.values
#=> [2, 2, ["C", "C"], ["D"]]
a.reduce(0) { |t,o| (o.is_a? Array) ? t : t + o }
#=> 4
Here Enumerable#reduce (aka inject) merely sums the values of a that are not arrays. The arrays correspond to elements of arr1 that do not appear in arr2, or vice versa.
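The same total can also be reached by counting occurrences directly instead of grouping and then skipping the leftover arrays. The following is only a sketch of that different technique, not the method used above: the name count_shared_tally is made up for illustration, and it assumes Ruby 2.7 or later, where Enumerable#tally is available.

```ruby
# Hypothetical variant of count_shared (the name is an assumption).
# Assumes Ruby 2.7+ for Enumerable#tally.
def count_shared_tally(arr1, arr2)
  counts2 = arr2.tally  # e.g. {"A"=>2, "B"=>3, "D"=>1}
  # For each distinct element of arr1, add the smaller of its two counts;
  # elements missing from arr2 contribute 0.
  arr1.tally.sum { |elem, n| [n, counts2.fetch(elem, 0)].min }
end

arr1 = ["B","A","A","A","B","C","C"]
arr2 = ["A","B","A","B","B","D"]
count_shared_tally(arr1, arr2)
  #=> 4
```

Because the values here are plain counts, there is no need to test whether a value is an array when summing.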
As promised, I will now explain how h2 is computed. I've used the form of Hash#merge that employs a block (here { |k,ov,nv| [ov.size, nv.size].min }) to compute the values of keys that are present in both hashes being merged. For example, when the first key-value pair of h1 ("A"=>["A", "A"]) is being merged into h0, since h0 also has a key "A", the array
["A", ["A", "A", "A"], ["A", "A"]]
is passed to the block and the three block variables are assigned values (using "parallel assignment", which is sometimes called "multiple assignment"):
k, ov, nv = ["A", ["A", "A", "A"], ["A", "A"]]
so we have
k #=> "A"
ov #=> ["A", "A", "A"]
nv #=> ["A", "A"]
k is the key, ov ("old value") is the value of "A" in h0 and nv ("new value") is the value of "A" in h1. The block calculation is
[ov.size, nv.size].min
#=> [3,2].min = 2
so the value of "A" is now 2.
Notice that the key, k, is not used in the block calculation (which is very common when using this form of merge). For that reason I've changed the block variable from k to _ (a legitimate local variable), both to reduce the chance of introducing a bug and to signal to the reader that the key is not used in the block. The other elements of h2 that use this block are computed similarly.
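To see that the block is consulted only for keys present in both hashes, here is a small sketch. The hashes old_counts and new_counts and the seen_keys array are made up for illustration; they hold counts rather than the arrays used above.

```ruby
# Hypothetical hashes, chosen only to show when the merge block runs.
old_counts = { "A" => 3, "B" => 2, "C" => 2 }
new_counts = { "A" => 2, "B" => 3, "D" => 1 }

seen_keys = []
merged = old_counts.merge(new_counts) do |key, old_value, new_value|
  seen_keys << key             # record each key the block is called for
  [old_value, new_value].min
end

seen_keys
  #=> ["A", "B"]
merged
  #=> {"A"=>2, "B"=>2, "C"=>2, "D"=>1}
```

The values for "C" and "D" are copied over untouched. This is why count_shared keeps arrays as the hash values: an untouched value is still an array, so it can be told apart from a computed count and left out of the sum.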
Another way
It would be quite simple if we had available an Array method I've proposed be added to the Ruby core:
array_a = ["B","A","A","A","B"]
array_b = ["A","B","A","B","B"]
array_a.size - (array_a.difference(array_b)).size
#=> 4
or
array_a.size - (array_b.difference(array_a)).size
#=> 4
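The results above depend on difference removing only one occurrence from the receiver for each occurrence in the argument. The built-in Array#difference added in Ruby 2.6 behaves like Array#- and removes every matching element, so with it both expressions would return 5. Below is a minimal sketch of the assumed behaviour; the helper name multiset_difference is made up for illustration and is not necessarily how the proposed method is implemented.

```ruby
# Hypothetical helper (name and implementation are assumptions):
# removes from `a` one occurrence for each occurrence in `b`.
def multiset_difference(a, b)
  # Number of occurrences of each element still available to cancel.
  remaining = b.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
  a.reject do |e|
    if remaining[e] > 0
      remaining[e] -= 1
      true   # cancelled by an element of b, so drop it
    else
      false  # nothing left in b to cancel it, so keep it
    end
  end
end

array_a = ["B","A","A","A","B"]
array_b = ["A","B","A","B","B"]
array_a.size - multiset_difference(array_a, array_b).size
  #=> 4
array_b.size - multiset_difference(array_b, array_a).size
  #=> 4
```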
I've cited other applications in my answer here.