The difference of two sets, A and B, is defined as the set of all those elements of A which are not in B. If we implement it naively, computing the difference of two sets of sizes m
and n
respectively would take O(m * n)
time. Not very efficient:
const john1 = { name: "John Smith", age: 23 };
const john2 = { name: "John Smith", age: 23 };
const mary = { name: "Mary Key", age: 18 };
const bob = { name: "Bob-small", age: 6 };
const people1 = [john1, mary, bob];
const people2 = [john2];
const eqPerson = (p, q) => p.name === q.name && p.age === q.age;
const result = people1.filter(p => people2.every(q => !eqPerson(p, q)));
console.log(result); // [mary, bob]
Fortunately, there's a faster way to compute the set difference for large sets using hashing.
const john1 = { name: "John Smith", age: 23 };
const john2 = { name: "John Smith", age: 23 };
const mary = { name: "Mary Key", age: 18 };
const bob = { name: "Bob-small", age: 6 };
const people1 = [john1, mary, bob];
const people2 = [john2];
const hashPerson = ({ name, age }) => `${name} ${age}`;
const hashSet = new Set(people2.map(hashPerson));
const result = people1.filter(p => !hashSet.has(hashPerson(p)));
console.log(result); // [mary, bob]
The advantage is that creating a hash set takes O(n)
time and calculating the difference takes O(m)
time. Hence in total it only takes O(m + n)
time instead of O(m * n)
time. In addition, you can reuse the hash set in the future.