
I want an array that contains the objects from the scrape array that are not present in the old array. The arrays I'm actually working with contain nearly 100 objects.

The code below works, but I wonder if there's a more efficient way of getting the same result?

var old = [
  {a: 6, b: 3},
  {a: 1, b: 1}, 
  {a: 3, b: 3}
]

var scrape = [
  {a: 1, b: 1}, 
  {a: 5, b:5}
]

var nogood = []
var good = []

// collect scrape items whose "a" value already exists in old
scrape.forEach(es => {
  old.forEach(e => {
    if (e.a == es.a) {
      nogood.push(es)
    }
  })
})
console.log(nogood)

// for each nogood entry, keep the scrape items with a different "a" value
nogood.forEach(main =>
  good = scrape.filter(e => e.a != main.a)
)
console.log(good)

This is what I expect and what I'm getting:

good = [{a: 5, b: 5}]
mac

4 Answers


Personally I would approach this with:

const old = [
  {a: 6, b: 3},
  {a: 1, b: 1}, 
  {a: 3, b: 3}
];

const scrape = [{a: 1, b: 1}, {a: 5, b:5}];

for (const item of old) {
  // iterate backwards so splicing doesn't skip elements
  for (let i = scrape.length - 1; i >= 0; i--) {
    if (JSON.stringify(item) === JSON.stringify(scrape[i])) {
      scrape.splice(i, 1); // delete the previously scraped item
    }
  }
}

console.log(scrape); // [{a: 5, b: 5}]

The benefits to this approach are:

  • You don't care what properties the objects you're comparing have, you just care about whether they're identical.
  • It's fast (comparing JSON is generally faster than traversing the objects to compare each property).
  • It's more succinct to splice the scrape array rather than adding the 'good' and 'nogood' arrays to arrive at a filtered scrape array.

A possible deal breaker is if the objects you're comparing contain methods, in which case comparing them via JSON is not the correct approach, since JSON.stringify drops function properties.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify
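
As a minimal sketch of that caveat (the object and method names here are just illustrative):

const withMethod = { a: 1, greet() { return 'hi'; } };
const plain = { a: 1 };

// JSON.stringify silently drops function properties,
// so these two different objects compare as identical
console.log(JSON.stringify(withMethod) === JSON.stringify(plain)); // true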

JohnnyFaldo

If arrays old and scrape have sizes M and N respectively, all traditional approaches have a complexity of O(M * N), because each entry in scrape has to be compared against the entries in old to find out whether it matches.

The second and more efficient approach is to create a hash table from the first array, typically the bigger one (old here), and iterate over the second one (scrape here), which has a complexity of O(M + N).

If M and N are large enough, the difference becomes significant. For example, if M = 100 and N = 200, the former approach needs 20,000 comparisons but the latter needs just 300.

Please take a look at this code:

const old = [
  {a: 6, b: 3},
  {a: 1, b: 1},
  {a: 3, b: 3}
]

const scrape = [
  {a: 1, b: 1},
  {a: 5, b:5}
]

// create a hash map using the built-in JavaScript Map,
// keyed by each object's JSON string
const pair = old.map(x => [JSON.stringify(x), true])
const map = new Map(pair)

// keep only the objects that do not exist in the hash map
const good = scrape.filter(x => !map.has(JSON.stringify(x)))
console.log(good)
agtabesh
  • thanks! I tried your example without the hash map, and it seems to work as well, unless I'm overlooking something? `var good; var old1 = old.map(e => JSON.stringify(e)); good = scrape.filter(e=> !old1.includes(JSON.stringify(e))); console.log(good);` – mac Jul 12 '19 at 21:31
  • @mac It's ok, but with large arrays it takes time because it has a complexity of `O(MN)`, as described above. Using a hash map is the efficient way. However, what you have done also works. – agtabesh Jul 13 '19 at 02:57

How about something like this?

const good = scrape.filter((sEl) => {
  return !old.some(oEl => oEl.a === sEl.a);
})

This avoids the nested forEach loops, and .some returns as soon as a single true condition is found, which cuts down on excess searching when a matching element appears early in the old array.
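
For reference, a self-contained run of the same filter/some approach against the arrays from the question:

const old = [{a: 6, b: 3}, {a: 1, b: 1}, {a: 3, b: 3}];
const scrape = [{a: 1, b: 1}, {a: 5, b: 5}];

const good = scrape.filter((sEl) => !old.some(oEl => oEl.a === sEl.a));
console.log(good); // [{a: 5, b: 5}]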

Jeff Hechler

Maybe something like:

var old = [
  {a: 6, b: 3},
  {a: 1, b: 1}, 
  {a: 3, b: 3}
]

var scrape = [
  {a: 1, b: 1}, 
  {a: 5, b:5}
]

// keep scrape items whose "a" value is not found in old
var result = scrape.filter(s => old.findIndex(o => o.a === s.a) === -1);
console.log(result); // [{a: 5, b: 5}]
Vishnu