0

I'm using the below code to intersect 2 very large arrays. Is it possible to improve the performance of this as it takes such a long time as it currently stands.

const arr1 = [  
    {  
       "number":"123",
       "firstName":"John",
       "lastName":"Smith",
       "email":"test1@test.com",
    },
    {  
       "number":"1234",
       "firstName":"Chad",
       "lastName":"Baker",
       "email":"test2@test.com",
    }
];
 
const arr2 = [  
    {  
        "number":"12345",
        "firstName":"Chad",
        "lastName":"Baker",
        "email":"test2@test.com",
    },
    {  
        "number":"123456",
        "firstName":"John",
        "lastName":"Smith",
        "email":"test1@test.com",
    }
]

 let arr3 = arr1.filter(a => arr2.some(b => { return a.firstName == b.firstName && a.lastName == b.lastName}));  
 
 console.log(arr3);
juicy89
  • 448
  • 5
  • 14

2 Answers2

5

Create an id based on your matching criterion and add it to a Set for one of your arrays. Then you have a constant lookup time

let createId = (x) => `F:${x.firstName};L:${x.lastName}`;

//ids is a Set<string> which will only contain values created by the createId function
let ids = new Set(arr2.map(x => createId(x)));

//now in the filter, you have a set-lookup which is O(1) instead of 
//a linear search which is O(n)
let intersection = arr1.filter(x => ids.has(createId(x)));

Of course, you can modify the createId however you want. By calling it also in the callback of filter, you make sure, all objects are treated the same.

derpirscher
  • 14,418
  • 3
  • 18
  • 35
2

You can hash the objects for uniqueness, build a Set of hashes (arr2), and check if the item in arr1—when hashed—exists in the set.

// Source: https://stackoverflow.com/a/15710692/1762224
const strHashCode = (s) =>
  s.split('').reduce((a, b) => {
    a = ((a << 5) - a) + b.charCodeAt(0);
    return a & a;
  }, 0);

// Based on Java hash function
const hashName = ({ firstName, lastName }) => {
  let result = 17;
  result = 31 * result + firstName != null ? strHashCode(firstName) : 0;
  result = 31 * result + lastName != null ? strHashCode(lastName) : 0;
  return result;
};

const
  arr1 = [
    { number: "123", firstName: "John", lastName: "Smith", email: "test1@test.com" },
    { number: "1234", firstName: "Chad", lastName: "Baker", email: "test2@test.com" }
  ],
  arr2 = [
    { number: "12345", firstName: "Chad", lastName: "Baker", email: "test2@test.com" },
    { number: "123456", firstName: "John", lastName: "Smith", email: "test1@test.com" }
  ];

const nameSet = new Set(arr2.map(hashName)); // References arr2

let arr3 = arr1.filter(a => nameSet.has(hashName(a))); // Filter items in arr1

console.log(arr3);
.as-console-wrapper { top: 0; max-height: 100% !important; }

Alternatively, you can just use JSON.stringify. This would be a little less performant, but it's easier to maintain and understand.

const hashName = ({ firstName, lastName }) => JSON.stringify({ firstName, lastName });

const
  arr1 = [
    { number: "123", firstName: "John", lastName: "Smith", email: "test1@test.com" },
    { number: "1234", firstName: "Chad", lastName: "Baker", email: "test2@test.com" }
  ],
  arr2 = [
    { number: "12345", firstName: "Chad", lastName: "Baker", email: "test2@test.com" },
    { number: "123456", firstName: "John", lastName: "Smith", email: "test1@test.com" }
  ];

const nameSet = new Set(arr2.map(hashName)); // References arr2

let arr3 = arr1.filter(a => nameSet.has(hashName(a))); // Filter items in arr1

console.log(arr3);
.as-console-wrapper { top: 0; max-height: 100% !important; }
Mr. Polywhirl
  • 42,981
  • 12
  • 84
  • 132