I actually can imagine situations where I'd rather use @kontr0l approach than something else, but you have to understand that this approach is quadratic, so basically this code is an abstraction for naïve approach - iterate through all values in two arrays.
There are approaches better than quadratic, I won't use here any big O notation, but here are two main approaches, both are better then naïve one:
- iterate through one of the arrays and check for existence in sorted second array using binary search.
- put values into set/hash/dictionary/you name it.
As it've been already mentioned, first approach can be adopted for objects if you reimplement standard difference
method with using some more flexible analogue of indexOf
method.
With second approach we can hit the wall with the fact that, as of Feb'2015, only modern browsers are supporting Sets. As of hashes (well, objects) in javascript, they can have only string-type keys, so any object invoked as key first shoud be converted via toString
method. So, we need to provide some => correspondece. On practice in most cases it's pretty straightforward, for instance, for your particular example such correspondence can be just String(obj.id)
.
Having such correspondence, we also can use following lodas/undercore approach:
var idsA = _.pluck(a, 'id');
var idsB = _.pluck(b, 'id');
// actually here we can stop in some cases, because
// quite often we need to identify object, but not the object itself -
// for instance to send some ids through remote API.
var intersect = _.intersection(idsA, idsB);
//to be 100% sure you get the idea, here we assume that object having equal ids are treated as equal, so does not really matter which of arrays we'll iterate:
var dictA = _.object(idsA, a); // now we can find a by id faster then with _.find
var intersectObj = intersect.map(function(id) {return dictA[id})
But buy admitting slightly stricter restriction - that we can build correspondence between our set objects and natural numbers we can build even more efficent algorithm, i.e. all our ids are non-negative integers - we can use more efficient algorithm.
The trick is to implement set by introducing two helper arrays this way:
var naturalSet = function (arr) {
var sparse = [];
var dense = [];
var contains = function (i) {
var res = sparse[i] < dense.length && dense[sparse[i]] == i;
return res;
}
var add = function (v) {
if (!contains(v)) {
sparse[v] = dense.length;
dense.push(v);
}
}
arr.forEach(add);
return {
contains: contains,
toArray: function () {
return dense
},
_getDense: function () {
return dense
},
_getSparse: function () {
return sparse
}
}
}
Then we can introduce set with mapping to naturalSet:
var set = function (arr, valueOf) {
var natSet = naturalSet(arr.map(valueOf));
return {
contains: function (item) {
return natSet.contains(valueOf(item))
},
toArray: function () {
var sparse = natSet._getSparse();
var res = natSet._getDense().map(function (i) {
return arr[sparse[i]];
});
return res;
}
}
}
and finally, we can introduce intersection:
var intersection = function(arr1, arr2, valueOf) {
return set(arr2.filter(set(arr1, valueOf).contains), valueOf).toArray();
}
So, relying on the structure of data you are working can help you sometimes.