1

I wrote a couple of methods to add items to an array such that items already in the array are ignored. After doing some research on data structures, I realized I could get rid of the duplicates by simply putting them in a set (especially because I don't care about the order of the objects). However, after playing with JavaScript, I noticed that identical-looking objects are still added to a set. For example:

var mySet = new Set();
mySet.add({a:23})
console.log(mySet);
mySet.add({b:14})
console.log(mySet);
mySet.add({a:23})
console.log(mySet);

Output:

Set {Object {a: 23}}
Set {Object {a: 23}, Object {b: 14}}
Set {Object {a: 23}, Object {b: 14}, Object {a: 23}}

In this example, I add an object with key `a` and value `23`, then another one with a completely different key/value pair, and then the original one again. I would expect to have only `a->23` and `b->14` in my set.

What's the best way to get rid of duplicates in a list?

Derek 朕會功夫
H. Trujillo
    Simple example, type in the console `{a: 23} == {a: 23}`. – kind user May 01 '17 at 15:55
  • I like to utilize `JSON.stringify()` as the key in an "associative array", but I'm not sure if that's the "best" way – Patrick Barr May 01 '17 at 15:57
  • Here's your problem: you think that they are "identical objects", but in fact they are different references pointing to different objects. Before answering the question of how to remove duplicates, you first have to define what "duplicate" means. Objects with the same set of properties? What about objects with the same set of properties but with different prototypes? Are they "duplicates"? – Derek 朕會功夫 May 01 '17 at 16:02
  • related? http://stackoverflow.com/questions/2218999/remove-duplicates-from-an-array-of-objects-in-javascript – Kevin B May 01 '17 at 16:05
  • The point is that they are not *identical*, i.e. share the *same* identity. They might look equal, and have the same property values, but they are distinct objects. – Bergi May 01 '17 at 16:18
  • I think for my implementation using a Map would make the most sense as what I am trying to do is to get rid of duplicate data without having to write methods to clean it. Thanks @dashton! – H. Trujillo May 01 '17 at 17:22

2 Answers

4

Those objects, although having the same structure, are distinct. Each represents a separate space in memory and is a unique reference to that memory.

console.log(
  { a: 23 } === { a: 23 } // false
);

By contrast, if you add the same object reference to a set twice, the second addition is ignored:

var obj = { a: 23 };
var s = new Set();

// first addition
s.add(obj);
console.log(
  ...s.values()
);

// second addition of the same object is ignored
s.add(obj);
console.log(
  ...s.values()
);

If you want to support your use case of comparing objects based on the data they contain instead of reference equality, what you might want to look into is deep object comparison.

For example, here's one way of utilizing the deep-equal library to check if an object exists in an array before adding it:

// deep-equal library (its minified browser bundle is omitted here for brevity);
// in Node you would load it with:
// const deepEqual = require('deep-equal');

function addToSet(s, value) {
  for (const curr of s.values()) {
    if (deepEqual(curr, value)) {
      return;
    }
  }
  s.add(value);
}

var mySet = new Set();

// first addition works
addToSet(mySet, {a:23});
console.log(...mySet.values());

// second addition works
addToSet(mySet, {b:14});
console.log(...mySet.values());

// addition of an object of the same structure is ignored
addToSet(mySet, {a:23});
console.log(...mySet.values());

An important consideration here is the time/space complexity of the `deepEqual` call. With this approach, adding an element to the set is no longer (amortized) constant time: every addition scans the whole set and deep-compares the new value against each member, i.e. roughly O(setSize * numberOfPropsInLargestObject). And since most deep-comparison methods are written recursively (and not in a tail-recursive form a modern engine could optimize), space complexity also grows from constant to linear in the depth of the compared objects.
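A cheaper alternative, hinted at in the comments, is to key a `Map` by `JSON.stringify(value)`, so each addition costs roughly the serialization time instead of a scan of the whole set. A minimal sketch (`addUnique` and `byJson` are hypothetical names), with the caveats that values must be JSON-serializable and that property order matters:

```javascript
// Sketch: deduplicate by serialized form instead of deep comparison.
// Caveat: {a: 1, b: 2} and {b: 2, a: 1} serialize differently and
// would still count as distinct values.
function addUnique(map, value) {
  const key = JSON.stringify(value);
  if (!map.has(key)) {
    map.set(key, value);
  }
}

var byJson = new Map();
addUnique(byJson, { a: 23 });
addUnique(byJson, { b: 14 });
addUnique(byJson, { a: 23 }); // same serialized form, so it is ignored

console.log([...byJson.values()]); // [ { a: 23 }, { b: 14 } ]
```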

nem035
  • @Bergi not sure. It would've been useful if they at least gave feedback to whatever they think is wrong. – nem035 May 01 '17 at 16:26
  • Thanks @nem035, if someone HAD to use a Set, your answer would've been perfect, but for the sake of simplicity I think a Map makes more sense for my application. Also not sure why anyone would downvote your answer. – H. Trujillo May 01 '17 at 17:24
  • I've removed my downvote. The first revision just answered about time complexity. – mehulmpt May 02 '17 at 13:03
  • @MehulMohan Thanks. Btw I added time complexity as the last thing :). The [first revision](https://stackoverflow.com/revisions/43722287/1) just mentioned that `===` comparison is by reference, not by structure and that OP should use something like a deep-equal comparison. – nem035 May 02 '17 at 14:42
3

As everyone else has said, the objects aren't equal by reference, so the Set stores what look like duplicates. If you can specify a key (which I'm assuming is `a`, `b`, `c`, etc.), then you can use a Map instead: setting the same key again simply replaces the old value. It is iterable too.

var myMap = new Map();
myMap.set('a',23);
myMap.set('b',123);
myMap.set('a',33);

console.log(myMap)

Map {"a" => 33, "b" => 123}
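To show the iteration mentioned above, here's a small sketch reusing the same hypothetical data (`demoMap` is an illustrative name):

```javascript
// Sketch: a Map deduplicates by key and is directly iterable.
var demoMap = new Map();
demoMap.set('a', 23);
demoMap.set('b', 123);
demoMap.set('a', 33); // overwrites the earlier 'a' entry

for (const [key, value] of demoMap) {
  console.log(key, '=>', value); // a => 33, then b => 123
}

// The deduplicated values can be pulled back out into an array:
console.log([...demoMap.values()]); // [ 33, 123 ]
```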

dashton