18

I have two arrays of objects:

var a = [  {'id': 20},   {'id': 15},   {'id': 10},   {'id': 17},   {'id': 23}  ];

var b = [ {'id': 90},   {'id': 15},    {'id': 17},   {'id': 23}  ];  

I'd like to get objects which are in a, but not in b. Results from this example would be:

{'id': 20} and {'id': 10}.

Because the arrays could be large, I need an efficient way to do this.

beatgammit
  • 19,817
  • 19
  • 86
  • 129
wong2
  • 34,358
  • 48
  • 134
  • 179
  • When you say you want the _objects_ in A but not in B, do you mean the object (references) themselves, or does value comparison work for you? Strictly speaking if your example were real JavaScript and not pseudo code, all of the objects in A are not in B! – Ray Toal Jul 16 '11 at 05:56
  • @Ray Toal If they have same "id" they are same object – wong2 Jul 16 '11 at 06:00
  • 1
    Ah, okay then a linear time solution exists. Put value of `a` in a set `x` (in JS, the keys of an object). Ditto for `b`. For each key in `y`, remove it from `x`. Then `x` will have all the integers that appear as values in `a` that are not in `b`. You may need another step to extract corresponding objects from `a`... Have fun implementing. :) – Ray Toal Jul 16 '11 at 06:07

5 Answers5

21
// Make hashtable of ids in B
var bIds = {}
b.forEach(function(obj){
    bIds[obj.id] = obj;
});

// Return all elements in A, unless in B
return a.filter(function(obj){
    return !(obj.id in bIds);
});

very minor addendum: If the lists are very large and you wish to avoid the factor of 2 extra memory, you could store the objects in a hashmap in the first place instead of using lists, assuming the ids are unique: a = {20:{etc:...}, 15:{etc:...}, 10:{etc:...}, 17:{etc:...}, 23:{etc:...}}. I'd personally do this. Alternatively: Secondly, javascript sorts lists in-place so it doesn't use more memory. e.g. a.sort((x,y)=>x.id-y.id) Sorting would be worse than the above because it's O(N log(N)). But if you had to sort it anyway, there is an O(N) algorithm that involves two sorted lists: namely, you consider both lists together, and repeatedly take the leftmost (smallest) element from the lists (that is examine, then increment a pointer/bookmark from the list you took). This is just like merge sort but with a little bit more care to find identical items... and maybe pesky to code. Thirdly, if the lists are legacy code and you want to convert it to a hashmap without memory overhead, you can also do so element-by-element by repeatedly popping the elements off of the lists and into hashmaps.

ninjagecko
  • 88,546
  • 24
  • 137
  • 145
  • @Ray: Unfortunately it's not truly entirely functional programming; I cheat and make mutations in the `forEach`. =( In order to sanely do real functional programming, we'd need a functional datastructure that allowed one to sum two key-sets/objects without making a copy of the whole thing, and to use `Array.reduce(function(obj,accum){...}, {})` or some variant. We could also abstract away the "evil" mutation into `Array.prototype.toObject = function(callback){var R={};this.forEach(function(elem){var r=callback(elem);R[r[0]]=r[1];});return R;}` and pretend it's a functional primitive. – ninjagecko Jul 16 '11 at 07:53
  • addendum demo: `[1,2,3].toObject(function(x){return [x,10*x]})` --> `{1:10, 2:20, 3:30}` – ninjagecko Jul 16 '11 at 07:56
  • True, you are using state, my bad. Here's my stateless solution: `let bIds = map(b, (k,v) => v) in filter(a, (x) => x not in bids)` – Ray Toal Jul 16 '11 at 08:03
  • You can improve a bit by replacing the first step with: `bIds = b.map(function(user) { return b.id });` Then in the second step, you can just compare the value of Ids. – Ethan Yang May 13 '16 at 07:35
  • @EthanYang: you can't; I assume you mean `user.id` rather than `b.id`. That will give a list `ids = [90, 15, 17, 23]`. Not only do lists not support `90 in ids` (is `false`; you would need `ids.includes(90)`), but such a lookup is O(N) time rather than O(1) time because you need to look through each element of the list. Using such a lookup would thus make the entire process take O(N^2) time, which would make things impossible for large lists. That is the point of using a js {} object as the standard and common 'hashmap'/'hashtable' data structure. see http://stackoverflow.com/a/6620071/711085 – ninjagecko May 29 '16 at 08:15
  • What is time complexity and space complexity of above code? Please let me know. – Sunny Aug 30 '17 at 16:19
14

With lodash 4.12.0 you can use _.differenceBy.

_.differenceBy(a, b, 'id');
Guillaume Vincent
  • 13,355
  • 13
  • 76
  • 103
Ethan Yang
  • 862
  • 9
  • 20
  • 4
    Because the original answer made no notion of lodash, so your answer does not provide any information to solve this problem. It would be like asking "How do I plant an apple tree?" and getting the answer "You can buy apples in a store." – Jonathan Oct 10 '16 at 12:36
  • Then why [the above answer](http://stackoverflow.com/a/6715844/3291674) can answer with `underscore.js`? – Ethan Yang Nov 09 '16 at 05:04
  • 1
    You have posted your answer too, and noone censored it. You asked why it was downvoted and I gave you a possible answer. As you see, the other answer has not gathered much votes too, nor was it accepted. – Jonathan Nov 09 '16 at 09:23
  • @Jonathan it's good to provide alternative because not all people coming here from google know about lodash, maybe even OP doesn't know about lodash. – cikatomo Dec 26 '22 at 17:15
2

A general way to do this would be:

  1. put all objects from b into a hashtable
  2. iterate over a, for each item check if it is in the hashtable

A lot of programming environments have set and/or HashSet implementations these days, which make it very simple to do this.

In special cases, other ways might be more efficient. If, for example, your elements were byte-sized values, and a and b both fairly big, then I would use a boolean array "flags" with 256 elements, initialize all to false. Then, for each element x of b, set flags[x] to true. Then iterate over a, and for each y in a, check if flags[y] is set.

user816098
  • 262
  • 3
  • 14
0

If you not adverse to including a library use underscore.js it has a good intersection function http://documentcloud.github.com/underscore/

Michal
  • 13,439
  • 3
  • 35
  • 33
0

The accepted answer works if we just want to compare the id. If you want to compare the whole object, we can use:

_.differenceBy(a, b, JSON.stringify);

Jianwu Chen
  • 5,336
  • 3
  • 30
  • 35