0

It goes something like this where I have a london array containing more than 10 million data

london = ['dwig7xmW','gIzbnHNI' ...]

And now I have a userTraveled which also contains millions of data

userTraveled = ['ntuJV09a' ...] 

Now what's the most efficient way to split userTraveled into inLondon and notInLondon.

My attempt.

inLondon = []
notInLondon = []

userTraveled.forEach((p) => london.includes(p) ? inLondon.push(p) : notInLondon.push(p))
Ibrahim Ali
  • 2,083
  • 2
  • 15
  • 36
  • Are you going to put that data in memory? If you have it in a file, better to stream it to avoid to store in memory. But no idea how to do it in Javascript. – angelcervera Aug 11 '22 at 08:29
  • You know you might be pushing used passwords in unique array right? You only check for each raw password, doesn't mean you don't have multiple times the same raw password – aymcg31 Aug 11 '22 at 08:29
  • _"what's the most efficient way"_ - This is opinion based. – icecub Aug 11 '22 at 08:30
  • 5
    Why do you have a `passwords` array with 10 million entries in plain text? Please tell me that's just test data, and not something used in production. – Emil Karlsson Aug 11 '22 at 08:30
  • 2
    I have a whole lot of bad feelings about why this needs to be accomplished, either about whether sensitive data is being treated in the wrong way, either whether some bad design decisions are being taken. In any case, it is most likely more efficient to store in lookup maps to have a O(1) lookup complexity to optimize every lookup, although it would have a pretty heavy memory usage. – briosheje Aug 11 '22 at 08:33
  • 1
    @EmilKarlsson I'm not working with passwords in any sort of way. My work is kinda complex and for the sake of simplicity, I have used passwords as an example. – Ibrahim Ali Aug 11 '22 at 08:38
  • In that case it was unfortunate that you chose passwords as an example, since passwords usually need to be handled very differently from other kinds of data (for security reasons). – Emil Karlsson Aug 11 '22 at 08:49

4 Answers4

2

london.includes(p) will do a linear search over the array. Doing that for every userTraveled is horribly inefficient. Use a Set instead:

const usersInLondon = [], usersNotInLondon = [];
const lookup = new Set(london);

for (const p of usersTraveled) {
  (lookup.has(p) ? usersInLondon : usersNotInLondon).push(p);
}
Bergi
  • 630,263
  • 148
  • 957
  • 1,375
1

I can offer a O(n*log(n)) solution instead of your O(n^2), first order the passwords and later use the binary search on it instead of the include to search for an item

Hope it helps =)

const london = ['dwig7xmW','gIzbnHNI']
const userTraveled = ['ntuJV09a', 'dwig7xmW']

let inLondon = []
let notInLondon = []

const sortedlondon=london.sort();
userTraveled.forEach((p) => (binarySearch(sortedlondon,p)!=-1 ? inLondon.push(p) : notInLondon.push(p)))

//https://www.htmlgoodies.com/javascript/how-to-search-a-javascript-string-array-using-a-binary-search/
function binarySearch(items, value){
    var startIndex  = 0,
        stopIndex   = items.length - 1,
        middle      = Math.floor((stopIndex + startIndex)/2);

    while(items[middle] != value && startIndex < stopIndex){

        //adjust search area
        if (value < items[middle]){
            stopIndex = middle - 1;
        } else if (value > items[middle]){
            startIndex = middle + 1;
        }

        //recalculate middle
        middle = Math.floor((stopIndex + startIndex)/2);
    }

    //make sure it's the right value
    return (items[middle] != value) ? -1 : middle;
}
Edoldin
  • 160
  • 6
  • This is not an answer. Provide the solution in the answer or don't mention that at all, this is not an offering site. – briosheje Aug 11 '22 at 08:37
  • 1
    Do you mean a copy-pastable solution? I think that "first order the passwords and later use the binary search on it instead of the include to search for an item" it's a solution – Edoldin Aug 11 '22 at 08:49
  • Yes, a copy-pastable solution is necessary for it to be considered an answer. Otherwise it is just a comment. – Emil Karlsson Aug 11 '22 at 08:54
  • Some code would be nice, doing binary search manually is non-obvious. – Adder Aug 11 '22 at 08:54
  • @EmilKarlsson [No, absolutely not](https://meta.stackoverflow.com/q/267256/1048572). Good answers do explanation that leads to understanding, not just provide copy pasta for the OP. – Bergi Aug 11 '22 at 09:01
  • @Edoldin the code you provided now imho makes the answer complete, thanks :) What I meant is that since the solution is pretty much code-related, providing at least a code sample makes it easier for the OP to understand the approach he should take. – briosheje Aug 11 '22 at 09:06
  • @briosheje Except no, because it doesn't even work. The code copy-pasted from probably the first google hit returns an index, whereas the `binarySearch(sortedlondon,p) ? inLondon.push(p) : notInLondon.push(p)` expects it to return a boolean. I don't think this improved the answer. – Bergi Aug 11 '22 at 09:08
  • 1
    @Bergi should be changed by handling -1 or any record found, but I'm personally against providing zero-code answers where at least some pseudocode can be provided like in this case. We don't need to agree on that though. – briosheje Aug 11 '22 at 09:11
0

I hope you are not using these data in a wrong way.

const passwords = ['a', 'b']
const rawPasswords = ['c', 'b'];
const setPasswords = new Set(passwords)

const uniquePassword = [];
const usedPassword = [];

rawPasswords.forEach(rp => {
    if (setPasswords.has(rp)) {
    usedPassword.push(rp)
  } else {
    uniquePassword.push(rp)
  }
})

console.log(uniquePassword, usedPassword)
Hedi Zitouni
  • 179
  • 1
  • 8
-1

Referring to this answer for performance tests: Get all unique values in a JavaScript array (remove duplicates) the best solution in your case would be to use an Object. Since you require to know about the duplicates and not just remove them.

function uniqueArray( ar ) {
  var j = {};
  var k = [];
  var unique;

  ar.forEach( function(v) {
    if(j.hasOwnProperty(v)){
        k.push(v);
    } else {
        j[v] = v;
    }
  });

  unique = Object.keys(j).map(function(v){
    return j[v];
  });
  
  return [unique, k];
}

var arr = [1, 1, 2, 3, 4, 5, 4, 3];

console.log(uniqueArray(arr));

First it loops through the input array and checks if the value is already existing as a key on the object. If that's not the case, it adds it. If it is, it pushes the value to another array. Since objects use a hash, the Javascript engine can work faster with it.

Secondly it goes through the object's keys to turn it back into an array and finally returns both. I didn't add this explanation because the provided reference already explained it.

The result will be an array containing 2 arrays. First the array with unique values, second the array with duplicates.

icecub
  • 8,615
  • 6
  • 41
  • 70