3

I have a space-delimited .txt file that contains dupes. I want to remove the dupes, but I am not finding it an easy task.

The file contains: orange orange apple apple pear

At first, I was getting an error with the .txt extension. I updated the main file to contain:

const fs = require('fs');
// Let require() load .txt files as plain UTF-8 strings
require.extensions['.txt'] = function (module, filename) {
    module.exports = fs.readFileSync(filename, 'utf8');
};

That helped with the errors and I was able to create a const after that.

const fruitList = require('../support/fruitList.txt');

However, I am still unable to remove dupes. I tried neek and that was not working either.

Laser Hawk

3 Answers

9

You can use a Set to remove duplicates from your array.

let fruitList = ["orange", "orange", "apple", "apple", "pear"];
let fruitSet = new Set(fruitList); // {"orange", "apple", "pear"}
// convert back to an array
const newArray = [...fruitSet]; // ["orange", "apple", "pear"]
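Note that the file in the question is read as one string, not an array, so it has to be split into words before it goes into the Set. A minimal sketch, assuming whitespace-delimited content (`uniqueWords` is a hypothetical helper name, and the inline string stands in for the file's contents):

```javascript
// Hypothetical helper: split on any run of whitespace, then drop duplicates.
// A Set keeps insertion order, so the first occurrence of each word wins.
function uniqueWords(text) {
    const words = text.trim().split(/\s+/).filter(Boolean); // filter(Boolean) drops '' for empty input
    return [...new Set(words)];
}

// Stand-in for the string you get from reading fruitList.txt
const text = 'orange orange apple apple pear';
console.log(uniqueWords(text).join(' ')); // orange apple pear
```

Passing the raw string straight to `new Set(...)` would give you a set of individual characters instead of words.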
Juan
4

It is important to catch any errors thrown by readFileSync so you can find out why your file isn't being read. Depending on how your data is formatted, you'll usually want to split on every kind of whitespace delimiter: tabs, spaces and newlines. The code below uses a regex in split to do that and puts all your values in an array. The following line then uses indexOf to throw out duplicates, keeping only the first occurrence of each value. Try this:

const fs = require('fs')

try {
    let data = fs.readFileSync('test.txt', 'utf8')

    // split data on any run of whitespace (spaces, tabs, newlines)
    data = data.trim().split(/\s+/)

    // this will remove duplicates from the array
    const result = data.filter((item, pos) => data.indexOf(item) === pos)

    console.log(result)

} catch (e) {
    console.log('Error:', e.stack)
}
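If the read itself is what fails, the error's `code` property usually tells you why. The sketch below is only an illustration of that idea: `readTextOrExplain` is a hypothetical helper name, and the missing file name is made up so that the error path runs.

```javascript
const fs = require('fs');

// Hypothetical helper: return the file's text, or null with a hint if it is missing.
function readTextOrExplain(filePath) {
    try {
        return fs.readFileSync(filePath, 'utf8');
    } catch (e) {
        if (e.code === 'ENOENT') {
            // ENOENT means the path does not exist (often a wrong relative path)
            console.error('File not found:', filePath);
            return null;
        }
        throw e; // anything else (permissions, etc.) is re-thrown unchanged
    }
}

console.log(readTextOrExplain('no-such-fruit-list.txt')); // null
```

Relative paths passed to readFileSync are resolved against the current working directory, not the script's folder, which is a common cause of ENOENT.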

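Spreading a Set, as shown in Juan's answer, is a considerably faster way to remove duplicates than filter with indexOf, because indexOf rescans the array from the start for every element. You can compare the two yourself: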

let data = 'orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear orange orange apple apple pear'

data = data.split(/\s+/)

console.time('method1')
const firstArr = data.filter((item, pos, arr) => arr.indexOf(item) === pos)

console.timeEnd('method1')

console.time('method2')
const secondArr = [...new Set(data)]

console.timeEnd('method2')

console.log('method1', firstArr, 'method2', secondArr)
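
With only a few dozen words the timings are mostly noise. A rough sketch with a larger synthetic input makes the gap easier to see; the array size and the `fruitN` names are assumptions chosen purely for illustration, and the actual numbers will vary by machine:

```javascript
// Synthetic input (an assumption for illustration): 20,000 words, 5,000 distinct.
const words = Array.from({ length: 20000 }, (_, i) => `fruit${i % 5000}`);

console.time('filter + indexOf');
const viaFilter = words.filter((item, pos, arr) => arr.indexOf(item) === pos);
console.timeEnd('filter + indexOf');

console.time('Set + spread');
const viaSet = [...new Set(words)];
console.timeEnd('Set + spread');

// Both approaches keep the first occurrence of each word, in the same order.
console.log(viaFilter.length, viaSet.length); // 5000 5000
```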
Emmanuel N K
3

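You can do it in a single line, as long as you split the string into words first (a Set built directly from a string would contain its individual characters):

const fruitList = [...new Set(require('../support/fruitList.txt').trim().split(/\s+/))];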
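
If the goal is to rewrite the file itself without the duplicates, something along these lines may work. This is a sketch, not a drop-in solution: `dedupeFile` is a hypothetical helper, and the demo uses a throwaway file in the OS temp directory so that nothing real is overwritten.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: rewrite a whitespace-delimited file with duplicates removed.
function dedupeFile(filePath) {
    const words = fs.readFileSync(filePath, 'utf8').trim().split(/\s+/).filter(Boolean);
    const unique = [...new Set(words)];
    fs.writeFileSync(filePath, unique.join(' ')); // written back space-delimited
    return unique;
}

// Demo on a temporary copy of the data from the question
const demoPath = path.join(os.tmpdir(), 'fruitList-demo.txt');
fs.writeFileSync(demoPath, 'orange orange apple apple pear');
dedupeFile(demoPath);
console.log(fs.readFileSync(demoPath, 'utf8')); // orange apple pear
```

This reads the file with readFileSync directly, so it does not need the require.extensions hook at all.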

See thorough discussion in this question