What is the fastest way to remove duplicates so that every UserId is unique? There are around 30 million user IDs to check.
Usage
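// Tracks every user_id seen so far (one key per unique ID).
const userIds = {};
const transform = csv.format({ headers: false }).transform((row) => {
  if (userIds[row.user_id]) {
    // Already seen: log it and return false instead of the row.
    console.log(`Found Duplicate ${row.user_id}`);
    return false;
  }
  // First occurrence: remember the ID and pass the row through.
  userIds[row.user_id] = 1;
  return row;
});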
The problem is that the script hangs after about 20 minutes. I am running the script from the CLI.
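One possible direction, offered as a rough sketch rather than a confirmed fix: keep the seen IDs in several Set objects instead of one plain object. A plain object also reports inherited names such as "constructor" as already present, and a single Set in V8 reportedly has an upper size limit (around 2^24 entries, as far as I know), so the IDs are spread across shards. The helper name createSeenTracker, the method markSeen, and the shard count of 64 are all hypothetical choices for illustration. Whether memory pressure is really what causes the hang is an assumption.

```javascript
// Hypothetical helper: spreads IDs across several Sets so that no single
// Set grows too large. The default of 64 shards is an arbitrary assumption.
function createSeenTracker(shardCount = 64) {
  const shards = Array.from({ length: shardCount }, () => new Set());

  // Simple string hash used only to pick a shard for a given ID.
  function shardFor(key) {
    let hash = 0;
    for (let i = 0; i < key.length; i++) {
      hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
    }
    return shards[hash % shardCount];
  }

  return {
    // Returns true the first time an ID is seen, false for every repeat.
    markSeen(id) {
      const key = String(id);
      const shard = shardFor(key);
      if (shard.has(key)) return false;
      shard.add(key);
      return true;
    },
  };
}

// Small in-memory demonstration with made-up rows.
const tracker = createSeenTracker();
const rows = [
  { user_id: '1' },
  { user_id: '2' },
  { user_id: '1' },
  { user_id: 'constructor' },
];
const uniqueRows = rows.filter((row) => tracker.markSeen(row.user_id));
console.log(uniqueRows.map((row) => row.user_id));
```

Inside the transform callback, tracker.markSeen(row.user_id) could stand in for the userIds lookup: a false result would mean the row is a duplicate. This still holds all 30 million IDs in memory, so it may not be enough on its own if the process is running out of heap.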