
I would like to randomize the email addresses that are being output, remove duplicates, and have them retain the original order. This works perfectly fine when I do not randomize: I generate the emails, remove dups, and output them with no issues. I also have no issues randomizing. The issue seems to be combining the two: generating the array, randomizing it, removing dups, AND retaining the original order. Below is what I have tried already; this is the closest I have gotten. Thanks for any help.

function randomize(arr) {
    var i, j, tmp;
    for (i = arr.length - 1; i > 0; i--) {
        j = Math.floor(Math.random() * (i + 1));
        tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
    return arr;
}
// we start with an empty source array
const sourceArray = [];

// note: this is a second reference to the same array, not a copy
var arr = sourceArray;

// the number of emails / 2
const numberOfEmails = 100000;

// first pass we add 100,000 emails
for (let index = 0; index < numberOfEmails; index++) {
  sourceArray.push(`test${index}@google.com`);
}

// second pass we create dupes for all of them
for (let index = 0; index < numberOfEmails; index++) {
  sourceArray.push(`test${index}@google.com`);
}

// throw in some extra dupes for fun
sourceArray.push(`test0@google.com`);
sourceArray.push(`test0@google.com`);
sourceArray.push(`test0@google.com`);
sourceArray.push(`test0@google.com`);
sourceArray.push(`test0@google.com`);
sourceArray.push(`test0@google.com`);
sourceArray.push(`test0@google.com`);

// this serves as a map of all email addresses that we want to keep
const map = {};

// an exact time before we run the algorithm
const before = Date.now();

// checks if the email is in the hash map
const isInHashmap = (email: string) => {
  return map[email];
};

// iterate through all emails, check if they are in the hashmap already, if they are we ignore them, if not we add them.
sourceArray.forEach((email) => {
  if (!isInHashmap(email)) {
    map[email] = true;
  }
});

// we fetch all keys from the hashmap
const result = Object.keys(map);

arr = randomize(arr);

console.log(`Randomized here: ${sourceArray}`);

console.log(`The count after deduplicating: ${result.length}`);

// gets the time expired between starting and completing deduping
const time = Date.now() - before;

console.log(`The time taken: ${time}ms`);

console.log(result);
Jo-Anne
  • Do you mean create a randomized copy? `var arr = sourceArray;` doesn't copy the array; it assigns `arr` a reference to the same array that `sourceArray` points to, so any change to one is reflected in the other. Instead, you can create a shallow copy by spreading it: `var arr = [...sourceArray];`. (Also, `randomize()` currently mutates the array in place, so the reassignment isn't necessary.) See: [Fastest way to duplicate an array in JavaScript - slice vs. 'for' loop](https://stackoverflow.com/questions/3978492/fastest-way-to-duplicate-an-array-in-javascript-slice-vs-for-loop) – pilchard May 24 '22 at 01:01
  • @pilchard A copy is fine. I need it created, randomized, de-duplicated, and output in the original order. I am not sure the best way to do it though. – Jo-Anne May 24 '22 at 01:07
  • What do you mean *'randomized and output in the original order'*? – pilchard May 24 '22 at 01:08
  • So if my array were [2,7,5,9,2,9,5,3,2,9] and that was my randomized array then it would remove the dups and output [2,7,5,9,3]. Same order. Hope that makes sense. – Jo-Anne May 24 '22 at 01:11
  • 'randomized' and 'output in original order' seem contradictory to each other? Do you want a random order or the original order? – Brett East May 24 '22 at 01:11
  • Ha! Yup, instead of test1@google.com and test2@google.com I want 100,000 randomized, yes. Then dups removed, and then that randomized order kept. – Jo-Anne May 24 '22 at 01:12
  • Sorry for the extra clarification, are you saying it needs to be 100,000 of test${some_random_number}@google.com? How big is that random number meant to be? – Brett East May 24 '22 at 01:20
  • 100,000 randomized emails, heck, could be 50,000, just x number of randomized emails. Right now in my code they are in order, test1, test2, test3 etc. Then after they are generated, I duplicate them all, then remove the dups, and then output them. That all works fine. I'd just like to randomize them instead of having them in order like test1@google.com, test2@google.com, test3@google.com. – Jo-Anne May 24 '22 at 01:26
  • Okay, I think my answer below will get you there, happy to explain in more detail or tweak if it's not what you want – Brett East May 24 '22 at 01:27
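
To illustrate pilchard's point about references versus copies, here is a small sketch. The variable names `original`, `alias`, and `copy` are made up for this example and do not appear in the question.

```javascript
const original = [1, 2, 3];

// Plain assignment: both names point at the same array object.
const alias = original;

// Spread syntax: a new array holding the same elements (a shallow copy).
const copy = [...original];

// reverse() mutates in place, so the change is visible through `original` too.
alias.reverse();

console.log(original); // [3, 2, 1]
console.log(copy);     // [1, 2, 3]
```

The same applies to `randomize()` in the question: shuffling `arr` also shuffles `sourceArray`, because they are one array.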

1 Answer


If I understand correctly, to get your random array of emails I would do the following:

const arrayOfEmails = [];
for (let i = 0; i < 100000; i++) {
  const randomInt = Math.floor(Math.random() * 100000); // random integer between 0 and 99,999
  arrayOfEmails.push(`test${randomInt}@google.com`);
}

Then hopefully this helps as far as removing the dupes and keeping the order.

You could do

const array = [2,7,5,9,2,9,5,3,2,9]; // your random array
const set = new Set(array); // {2,7,5,9,3}: a Set holds only unique values and keeps insertion order
const newArray = Array.from(set); // [2,7,5,9,3]

That's the easiest way I can think of.
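
Putting the pieces together for the question's case, the following is a minimal sketch rather than a definitive implementation. It shuffles a copy of the source array and then removes duplicates with a Set, which keeps the first occurrence of each value in the shuffled order. The helper names `shuffledCopy` and `dedupeKeepOrder` are illustrative and do not come from the question.

```javascript
// Fisher-Yates shuffle that returns a shuffled copy and leaves the input untouched.
function shuffledCopy(items) {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Remove duplicates, keeping the first occurrence of each value in order.
function dedupeKeepOrder(items) {
  return Array.from(new Set(items));
}

// Small stand-in for the question's data: every email appears twice.
const sourceEmails = [];
for (let i = 0; i < 5; i++) {
  sourceEmails.push(`test${i}@google.com`);
  sourceEmails.push(`test${i}@google.com`);
}

const shuffled = shuffledCopy(sourceEmails);    // random order, still with dupes
const uniqueEmails = dedupeKeepOrder(shuffled); // dupes removed, shuffled order kept

console.log(uniqueEmails);
```

The order of operations matters here: the deduplication has to run on the shuffled array, otherwise the result reflects the pre-shuffle order.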

If you didn't want to remove duplicates in a second step then you could also just write this:

const setOfEmails = new Set();
for (let i = 0; i < 100000; i++) {
  const randomInt = Math.floor(Math.random() * 100000); // random integer between 0 and 99,999
  setOfEmails.add(`test${randomInt}@google.com`); // will only add if the email is unique
}
const arrayOfEmails = Array.from(setOfEmails); // this array will be unique emails
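
One thing to be aware of with the loop above: random draws can collide, so the Set may end up holding fewer than 100,000 emails. If an exact count matters, a possible variation is to keep drawing until the Set is full. This is only a sketch; `generateUniqueEmails` and its `count` and `range` parameters are hypothetical names introduced for this example.

```javascript
// Hypothetical helper: draw random emails until `count` unique ones exist.
// `range` is the number of possible suffixes (0 .. range - 1); it must be at
// least `count`, otherwise the loop could never finish.
function generateUniqueEmails(count, range) {
  if (range < count) {
    throw new RangeError("range must be at least count");
  }
  const emails = new Set();
  while (emails.size < count) {
    const randomInt = Math.floor(Math.random() * range);
    emails.add(`test${randomInt}@google.com`); // duplicates are ignored by the Set
  }
  return Array.from(emails); // insertion (random draw) order is kept
}

const randomEmails = generateUniqueEmails(1000, 1000000);
console.log(randomEmails.length); // 1000
```

Choosing a `range` well above `count` keeps collisions rare, so the loop runs only slightly more than `count` times.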
Brett East