
I am working on a scraping script and learned about the Set object, which is supposed to store only unique values and to perform well. So I tried it like this:

let scrapedMessages = new Set()

scrapedMessages.add({
  text,
  ...(images.length > 0 && { images }),
  senderID,
  timestamp,
})

But when I looked at the scraped data, I found duplicates like these:

  {
    "text": "Acne Fighting Facial Wash With Jojoba Beads",
    "senderID": "361571627329333",
    "timestamp": "1613017270619"
  },
  {
    "text": "Acne Fighting Facial Wash With Jojoba Beads",
    "senderID": "361571627329333",
    "timestamp": "1613017270619"
  }

Does this mean that a Set of objects is not guaranteed to be unique, or am I making a mistake? I was previously using a plain array, but switched to a Set for better performance. Can this be achieved? What is the best practice?

I am running the Puppeteer script on Node.js.

coolsaint
  • Objects are not compared by their contents, but by the identity of the object. – Barmar Aug 22 '21 at 14:37
  • This may help: https://stackoverflow.com/a/29759699/2358409 – uminder Aug 22 '21 at 14:42
  • Does this answer your question? [How to customize object equality for JavaScript Set](https://stackoverflow.com/questions/29759480/how-to-customize-object-equality-for-javascript-set) – ggorlen Aug 22 '21 at 18:39

1 Answer

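A Set compares each new element against the elements it already contains, using the SameValueZero algorithm (which behaves like === except that NaN is treated as equal to NaN). Based on that comparison it either adds the new element or ignores it.

The problem is that objects are compared by reference (their identity in memory), not by their contents: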

const a = {x: 1};
const b = {x: 1};
const c = a;

console.log(a===b); // false
console.log(a===c); // true
console.log(b===c); // false
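
The same rule can be observed directly on a Set. This is a minimal sketch; the variable names are only illustrative:

```javascript
const first = { x: 1 };
const second = { x: 1 }; // same contents, but a different object

const set = new Set();
set.add(first);
set.add(first);  // same reference: ignored
set.add(second); // different reference: stored as a second element

console.log(set.size);          // 2
console.log(set.has(first));    // true
console.log(set.has({ x: 1 })); // false: a freshly created object is never in the set
```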

So there are a couple of things you can do:

  1. Extend the Set class and override the add method to compare the objects' properties instead of using ===.

  2. NOT RECOMMENDED, but you can simply use JSON.stringify(object) before adding it.
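
A related approach, sketched here under assumptions, avoids subclassing Set: keep a Map keyed by a string derived from the fields that identify a message. The sketch assumes a message is uniquely identified by senderID, timestamp, and text (a guess based on the data in the question); keyOf and addMessage are hypothetical helper names, not part of any library.

```javascript
// Hypothetical helper: builds a string key from the fields assumed to identify a message.
// An array is serialized instead of the object itself, so the key does not
// depend on the order in which the object's properties were written.
function keyOf(message) {
  return JSON.stringify([message.senderID, message.timestamp, message.text]);
}

const scrapedMessages = new Map();

// Stores the message only if no message with the same key was seen before.
function addMessage(message) {
  const key = keyOf(message);
  if (!scrapedMessages.has(key)) {
    scrapedMessages.set(key, message);
  }
}

addMessage({
  text: "Acne Fighting Facial Wash With Jojoba Beads",
  senderID: "361571627329333",
  timestamp: "1613017270619",
});
addMessage({
  text: "Acne Fighting Facial Wash With Jojoba Beads",
  senderID: "361571627329333",
  timestamp: "1613017270619",
});

console.log(scrapedMessages.size); // 1

// The original objects are still available for output.
const uniqueMessages = [...scrapedMessages.values()];
```

Unlike storing JSON.stringify(object) in a Set, this keeps the original objects and lets you choose which fields count toward equality.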

Amir MB