Let S
be a set of arrays, we want to find the largest subset of values that appear in each of the arrays in S
. values may appear more than once, if a value appears atleast x
times in each of the arrays, it should appear x
times in the output set.
This problem can be seen as finding the kernel of S
.
For the sake of complexity computations, assume S
contains n
arrays, all of them are of size m
.
Example:
A = {4, 3, 8, 12, 5, 4, 4, 4}
B = {5, 4, 12, 1, 1, 4, 1, 1}
C = {2, 11, 4, 8, 4, 5, 2, 8}
Answer = {4, 4, 5} // order doesn't matter
A couple of remarks:
- If the values would have been unique in each array, we could have used a hashtable to store all the values, then count the number of appearances of each of them, if it number of appearances was equal to the number of arrays in
S
, it would enter the answer. - If we would want the unique values set (i.e. in the example the answer would have been
{4, 5}
), we could have used a hashtable for each array to check if a value was already added to the "big" hashtable (from the previous bullet) to prevent double insertion of a value that appeared more than once in an array (or any other prevention of duplicate insertions).
Any suggestions of an efficient (time and memory) algorithm to perform this task?