0

I'm solving the problem of finding the common numbers in two arrays.

My first approach was:

    const findCommonNumbers = (array1,array2) => {

    const map = array1.reduce((accum, element)=>{
        accum[element] = true;
        return accum;
    } , {});

    const result = [];
    array2.forEach((e)=>{
        if (map[e]) {
            result.push(e);
        }
    });
    return result;
}

const array1 = [4, 2, 3, 6, 23];
const array2 = [3, 2, 4, 7 ,9];



const result = findCommonNumbers(array1, array2 );
console.log('result', result);

But then, looking at the solution, the most efficient way seems to be:

const findCommonNumbers = (array1, array2) => {

    const result = [];
    let p1 = 0;
    let p2 = 0;
    while (p1 < array1.length  && p2 < array2.length) {
        const e1 = array1[p1];
        const e2 = array2[p2];

        if (e1 === e2) {
            result.push(e1);
            p1++;
            p2++;
        }
        else if (e1 < e2) {
            p1++;
        }
        else {
            p2++;
        }
    }

    return result;
}

const array1 = [2, 3, 4, 6, 23];
const array2 = [2, 3, 4, 7, 9];


const result = findCommonNumbers(array1, array2);
console.log('result', result);

But, I see that in both cases, both arrays are iterated once, besides, my initial solution also works for unsorted arrays. What would be the complexity of both algorithms, and why? Is the proposed solution more efficient just by the fact that both arrays are iterated in the same loop, while in my case I iterate them sequentially?

fgonzalez
  • 3,787
  • 7
  • 45
  • 79
  • Hashtable operations are average O(1) but worst case O(n). This makes your algorithm worst case O(n²+m*n) while the other is O(n+m). Unless you are worried about collision attacks on your script it doesn't matter all that much though, working on unsorted arrays is a way bigger benefit than the weird case that you get worst case hashtable collisions. – ASDFGerte Mar 15 '18 at 19:03
  • Is one of the solutions causing a bottle neck in the data flow? If not, all you do here is waisting your time with microoptimization that doesn't help anyone – baao Mar 15 '18 at 19:22
  • 1
    How about `(a, b) => (s => a.filter(x => s.has(x)))(new Set(b))` ;) – georg Mar 15 '18 at 19:49
  • How is that different to array.includes? It is a different data structure, but even if it is a set, you are still iterating the Set for every element on a, to see if the element is contained in b. Or please tell me what's the difference – fgonzalez Mar 15 '18 at 20:18
  • @fgonzalez: unlike arrays, `Set` lookups are supposed to be O(1). – georg Mar 15 '18 at 20:37

1 Answers1

0

One line solution:

a1.filter(n => a2.includes(n));

Or wrapping in function:

const findCommonNumbers = (a1, a2) => a1.filter(n => a2.includes(n));

Or final solution when we want pass 'n' arrays and find common numbers

const findCommonNumbers = (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e)));

Performance

I run simple benchmark to compare performance (i5-7500 3,4GHz, 16GB ram, windows 10, NodeJS v9.8.0)

smallArray length = 10;
bigArray length = 10000;

and some results:

CASE1: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(smallArr,BigArr)
1: 0.892ms
CASE2: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(bigArr,smallArr)
2: 2.088ms
CASE3: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(smallArr,BigArr)
3: 0.968ms
CASE4: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(bigArr,smallArr)
4: 1.712ms
CASE5: fgonzalez code, INPUT(smallArr,BigArr)
5: 1.090ms
CASE6: fgonzalez code, INPUT(bigArr,smallArr)
6: 0.989ms


CASE1: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(smallArr,BigArr)
1: 0.908ms
CASE2: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(bigArr,smallArr)
2: 2.227ms
CASE3: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(smallArr,BigArr)
3: 0.968ms
CASE4: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(bigArr,smallArr)
4: 1.588ms
CASE5: fgonzalez code, INPUT(smallArr,BigArr)
5: 1.058ms
CASE6: fgonzalez code, INPUT(bigArr,smallArr)
6: 0.987ms


CASE1: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(smallArr,BigArr)
1: 1.014ms
CASE2: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(bigArr,smallArr)
2: 2.167ms
CASE3: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(smallArr,BigArr)
3: 0.991ms
CASE4: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(bigArr,smallArr)
4: 1.642ms
CASE5: fgonzalez code, INPUT(smallArr,BigArr)
5: 1.161ms
CASE6: fgonzalez code, INPUT(bigArr,smallArr)
6: 1.063ms


CASE1: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(smallArr,BigArr)
1: 0.941ms
CASE2: (a1, a2) => a1.filter(n => a2.includes(n)), INPUT(bigArr,smallArr)
2: 2.262ms
CASE3: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(smallArr,BigArr)
3: 0.989ms
CASE4: (...a) => [...a].reduce((p,c) => p.filter(e => c.includes(e))), INPUT(bigArr,smallArr)
4: 1.603ms
CASE5: fgonzalez code, INPUT(smallArr,BigArr)
5: 1.080ms
CASE6: fgonzalez code, INPUT(bigArr,smallArr)
6: 1.178ms

My code is slightly faster when smallArr is the first arg. @fgonzalez code is 2x faster when bigArr is the first arg.

Artur P.
  • 886
  • 4
  • 12
  • 2
    This might be the shortest solution, but it is not the most efficient one, right? This solutions implies for every element in a1, we need to fully iterate a2, to see if the element is included. In my solution, I only iterate both arrays only once. – fgonzalez Mar 15 '18 at 19:08
  • in my second function with arguments `(a1, a2)` we only fully iterate over a1 array and checking if a2 include a1 element. a2 array its not being iterated. – Artur P. Mar 15 '18 at 19:17
  • 2
    `includes` iterates every time. This should be O(n*m) as has already been noted. – ASDFGerte Mar 15 '18 at 19:19
  • 1
    I don't get it, how can you know if an element is included in an array if you don't fully iterate it? The fact that you use a function to do it, it does not mean that does not make the iteration – fgonzalez Mar 15 '18 at 19:20
  • On sorted arrays you could use e.g. binary search. For unsorted arrays you need to iterate every element (sidenote: [unless you are on a quantum computer](https://en.wikipedia.org/wiki/Grover%27s_algorithm), try to get your head around that...) – ASDFGerte Mar 15 '18 at 19:24
  • 1
    I was wrong, sorry. array.includes performs loop: https://developer.mozilla.org/pl/docs/Web/JavaScript/Referencje/Obiekty/Array/includes – Artur P. Mar 15 '18 at 19:28