
I have a specific use case where some validation logic has to happen in the UI (for various business reasons[...]). The array may contain anywhere from a few dozen to a few hundred thousand items (1-400K). The frontend is Angular based.

The first step is to check for duplicates (and store them in another array[...]). This is accomplished with the code below:

validateTargets(targets: string[]): ValidationResultObject[] {

    let result: ValidationResultObject[];
    let dups: string[] = [];

    // keep the first occurrence of each value; record every later occurrence as a duplicate
    const uniques = targets.filter((item, index) => {
        if (targets.indexOf(item) === index) {
            return true;
        } else {
            dups.push(targets[index]);
            return false;
        }
    });

    //other validation logic goes here

    return result;
}

The problem is an obvious UI freeze when this runs against anything above 50K items. For the time being I've wrapped the above in a setTimeout callback inside another function, to at least let the UI show a spinner before the page hangs :)

I've seen several suggestions on how to structure code so the UI stays responsive (or at least redraws); however, my case is a bit tricky since I deal with duplicates.

I was thinking of breaking the array into chunks and running the Array.filter part above in a loop within a setTimeout (for the UI), BUT I later need to compare the chunks against each other anyway, so it just prolongs the logic (rough sketch of the idea below)! I do not feel comfortable experimenting with workers, as some browsers in the organization do not support them.
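For illustration, this is roughly the chunked approach I had in mind (a sketch only, not tested; chunkSize and the callback name onDone are placeholders of mine). The part I don't know how to do efficiently is the duplicate check itself, since each chunk would still have to be compared against everything processed before it:

validateTargetsChunked(targets: string[], onDone: (dups: string[]) => void): void {
    const chunkSize = 10000; // arbitrary
    const dups: string[] = [];
    let index = 0;

    const processChunk = () => {
        const end = Math.min(index + chunkSize, targets.length);
        for (; index < end; index++) {
            // duplicate check for targets[index] would go here,
            // but it still has to look at items from earlier chunks
        }
        if (index < targets.length) {
            setTimeout(processChunk, 0); // yield so the UI can redraw / show the spinner
        } else {
            onDone(dups);
        }
    };

    processChunk();
}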

Does anyone have an idea how to work this out? No, it is not possible to move this to the backend :(

Regards

dhex
  • Seems like the fundamental mistake is having 100,000 things sitting around in the UI. Your loop as it is does a tremendous amount of unnecessary work, however; doing an `indexOf` inside the `.filter` means you're doing work proportional to the *square* of the array length. – Pointy Jan 31 '20 at 17:19
  • Is it so vital that this problem be handled in the browser? – Mister Jojo Jan 31 '20 at 17:24
  • @Pointy unfortunately this is something that I inherited, and there is virtually no chance to rewrite/move some of the stuff to the backend. – dhex Jan 31 '20 at 17:40
  • @MisterJojo see above. My hands are tied. However, you are both correct, and I would push it to the backend if I had the chance. – dhex Jan 31 '20 at 17:41
  • I especially think that putting hundreds of thousands of items on a browser page is unproductive and unhealthy. – Mister Jojo Jan 31 '20 at 17:47

3 Answers


You can filter out duplicates much, much more efficiently:

// one pass: use the distinct values as property names
let filtered = targets.reduce((result, item) => {
  result[item] = 1;
  return result;
}, {});
let noDuplicates = Object.keys(filtered);

That makes one pass over the array and leverages the internal efficiency of property-name lookup over the sequential search of .indexOf(). For an initial array with the extremely large number of elements you have, this should run in a comparatively tiny amount of time.
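If you also need the duplicate values themselves (as the question does), the same single-pass idea can be extended to count occurrences. A rough sketch (counts and dups are just illustrative names):

// one pass: count how many times each value occurs
let counts = targets.reduce((result, item) => {
  result[item] = (result[item] || 0) + 1;
  return result;
}, {} as Record<string, number>);

let noDuplicates = Object.keys(counts);
let dups = noDuplicates.filter(item => counts[item] > 1);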

Pointy

You could use an asynchronous function to process this amount of data. When it's finished, call a callback with the result as an argument to continue the normal flow.

async validateTargets(targets: string[], callback: (result: ValidationResultObject[]) => void): Promise<void> {
  // ...logic that produces `result` goes here
  callback(result);
}
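For example, the call site could look roughly like this (a sketch; validationResult is a placeholder for whatever your component does with the outcome):

// resume the normal flow inside the callback once validation has finished
this.validateTargets(targets, (result) => {
  this.validationResult = result;
});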

Also, to remove duplicates you could use

[...new Set(items)]

Note: this will only work if the items array contains only primitive values.

Lucas Fabre

I have never faced this situation, but have you tried the Set type in JavaScript? It removes duplicates internally and is more efficient than filter (see JS Set vs Array). If you have browsers that do not support Set, you can still use a polyfill.
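For example, a single pass with a Set could both produce the unique values and collect the duplicates (a sketch, reusing the targets/dups names from the question):

const seen = new Set<string>();
const dups: string[] = [];

for (const item of targets) {
  if (seen.has(item)) {
    dups.push(item); // second or later occurrence
  } else {
    seen.add(item);
  }
}

const uniques = [...seen];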

pavan kumar