2

I would like to know if it is worth to convert an array into a set in order to search using NodeJS.

My use case is that this search is done lot of times, but not necessary on big sets of data (can go up to ~2000 items in the array from time to time).

Looking for a specific id in a list.

Which approach is better :

const isPresent = (myArray, id) => {
   return Boolean(myArray.some((arrayElement) => arrayElement.id === id);
}

or

const mySet = new Set(myArray)
const isPresent = (mySet, id) => {
   return mySet.has(id);
}

I know that theoretically the second approach is better as it is O(1) and O(n) for the first approach. But can the instantiation of the set offset the gain on small arrays?

Jaspreet kaur
  • 630
  • 5
  • 19
  • 2
    If search is too frequent than definitely IMO set will be better than some, also you don't need to change output of `some` to Boolean explicitly it will always be a `boolean` value only by default – Code Maniac Aug 27 '19 at 10:23
  • You can also look into hashMap. Search over an array requires loop where as search over `Set` or `HashMap` requires lookup. For an array of `n` items, worse case is `n` iteration but for `Set`/ `HashMap`, its `1`. You can refer: https://stackoverflow.com/questions/17295056/array-vs-object-efficiency-in-javascript – Rajesh Aug 27 '19 at 10:33
  • @CodeManiac thank you for your answer! Just to be clear, here, there are a lot of search done but each time on a different array. Causing, for each search, the instantiation of a new set/hashmap. I might try to release this diff and tells if I see any improvement of perf. – Pierre-Louis Lacorte Aug 27 '19 at 10:46
  • 1
    If you are constructing a new `Set` each time than it is definitely not better. `Set` construction walk through the whole array and then does a bit more but array search walks through a smaller part of the array. – gcasar Aug 27 '19 at 10:51
  • @Pierre-LouisLacorte well if you have new array every time you search than no sense in making a `Set` use `binary search or any other suitable algo`, but if the array changes are something let's say by adding or deleting one or more id in the previous array, than you can use `Set` on initial array and for adding a removing id you can use that `Set` itself and search values when needed – Code Maniac Aug 27 '19 at 11:37

2 Answers2

2

@jonrsharpe - particularly for your case, I found that converting an array of 2k to Set itself is taking ~1.15ms. No doubt searching Set is faster than an Array but in your case, this additional conversion can be little costly.

You can run below code in your browser console to check. new Set(arr) is taking almost ~1.2ms

var  arr = [], set = new Set(), n = 2000;
for (let i = 0; i < n; i++) {
  arr.push(i);
};

console.time('Set'); 
set = new Set(arr);
console.timeEnd('Set');

Adding element in the Set is always costly. Below code shows the time required to insert an item in array/set. Which shows Array insertion is faster than Set.

var  arr = [], set = new Set(), n = 2000;
console.time('Array'); 
for (let i = 0; i < n; i++) {
  arr.push(i);  
};
console.timeEnd('Array');

console.time('Set'); 
for (let i = 0; i < n; i++) {
  set.add(i);  
};
console.timeEnd('Set');

I run the following code to analyze the speed of locating an element in the array and set. Found that set is 8-10 time faster than the array.

You can copy-paste this code in your browser to analyze further

var arr = [], set = new Set(), n = 100000;
for (let i = 0; i < n; i++) {
  arr.push(i);
  set.add(i);
}

var result;
console.time('Array'); 
result = arr.indexOf(12313) !== -1; 
console.timeEnd('Array');
console.time('Set'); 
result = set.has(12313); 
console.timeEnd('Set');

So for your case array.some is better!

A. Todkar
  • 353
  • 2
  • 10
0

I will offer a different upside for using Set: your code is now more semantic, easier to know what it does.

Other than that this post has a nice comparison - Javascript Set vs. Array performance but make your own measurements if you really feel that this is your bottleneck. Don't optimise things that are not your bottleneck!

My own heuristic is a isPresent-like utility for nicer code but if the check is done in a loop I always construct a Set before.

gcasar
  • 729
  • 1
  • 9
  • 22