2

For some reason I end up with \r appended to the end of my array elements. This happens after reading data from a file and splitting it on \n. The file is read with Node's file system (fs) module.

data.txt

Samuel  20  male
Benjamin    10  male
Fortune 20  female

The code is shown below:

const fs = require('fs');

let data = fs.readFileSync('data.txt', 'utf8' );

let rawData = data => { return data.split( '\n' ) };

let objData = data => { return data.map( data => { return data.split( '\t' ) } ) }

console.log( objData( rawData( data ) ) );

Code output:

$ node reduce_example.js

[ [ 'Samuel', '20', 'male\r' ],
[ 'Benjamin', '10', 'male\r' ],
[ 'Fortune', '20', 'female' ] ]

I ran the code on node v9.5.0, v9.0.0 and v8.0.0

wokoro douye samuel

4 Answers

4

The text file uses Windows-style line endings, "\r\n", as opposed to Unix-style line endings, "\n" (or the old Mac style, "\r"). Splitting on "\n" alone leaves the "\r" attached to the end of each line. You can read more about this difference and how it came about here: What is the difference between \r and \n?

To account for this difference, change this line:

let rawData = data => { return data.split( '\n' ) };

to this:

let rawData = data => { return data.split( '\r\n' ) };

However, you may be asking yourself, "What happens if I run this code on a file with Unix-style endings?" It wouldn't work: the text contains no "\r\n" sequence, so split would return the whole file as a single element. To account for both styles of line endings, make the \r optional with a regular expression:

let rawData = data => { return data.split(/\r?\n/) };
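
As a quick check, here is a minimal sketch comparing the three separators on the same rows with both kinds of line endings. The sample strings and the `splitLines` name are made up for illustration; they are not read from the question's data.txt.

```javascript
// Hypothetical sample inputs: the same three rows, once with Windows (CRLF)
// line endings and once with Unix (LF) line endings.
const windowsText = 'Samuel\t20\tmale\r\nBenjamin\t10\tmale\r\nFortune\t20\tfemale';
const unixText = 'Samuel\t20\tmale\nBenjamin\t10\tmale\nFortune\t20\tfemale';

// Split on LF with an optional CR in front of it.
const splitLines = text => text.split(/\r?\n/);

// '\n' on CRLF text: every line except the last keeps a trailing '\r'.
const lfOnWindows = windowsText.split('\n');

// '\r\n' on LF text: no separator is found, so the result has one element.
const crlfOnUnix = unixText.split('\r\n');

// The regex gives the same three clean lines for both inputs.
const regexOnWindows = splitLines(windowsText);
const regexOnUnix = splitLines(unixText);

// JSON.stringify makes an otherwise invisible '\r' show up in the output.
console.log(lfOnWindows.map(line => JSON.stringify(line)));
console.log(crlfOnUnix.length);
console.log(regexOnWindows.map(line => JSON.stringify(line)));
console.log(regexOnUnix.map(line => JSON.stringify(line)));
```

Printing through `JSON.stringify` is only a debugging convenience here; it is an easy way to see whether a stray `\r` is still attached to a line.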
Matt C
1

You could split with an optional \r.

var data = 'Samuel\t20\tmale\r\nBenjamin\t10\tmale\r\nFortune\t20\tfemale';

let rawData = data => data.split(/\r?\n/);

let objData = data => data.map(data => data.split('\t'));

console.log(objData(rawData(data)));
Nina Scholz
0

Use /\r?\n/ to split on newlines, then trim() each line to remove the surrounding whitespace.

  • \r?\n will match both \r\n and \n
    • The \r is optional because it precedes a ?, which means zero or one
    • The \n must always be present
  • Once the split has executed, we call map, which performs a function on each item and returns a new array with the results.
    • For the callback we call trim on each item, which removes the whitespace from the beginning and end of each item (not in between)

let str = 'abc\r\ndefg\n\t123\n\t456\r\n789'

console.log(str.split(/\r?\n/g).map(i => i.trim()))
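
One caveat for tab-separated data like the question's: trim() also strips leading tabs, so a row whose first field is empty loses a column. The sketch below is one way to avoid that, assuming you only want to drop the line endings and any empty lines (for example the one produced by a trailing newline). The `parseRows` helper and the sample string are hypothetical, not part of the original code.

```javascript
// Hypothetical helper: split text into rows of tab-separated fields without
// trimming, so leading tabs (empty first fields) are preserved.
const parseRows = text =>
  text
    .split(/\r?\n/)                    // handles both CRLF and LF
    .filter(line => line.length > 0)   // drop empty lines, e.g. after a trailing newline
    .map(line => line.split('\t'));

// Made-up sample: the first row has an empty first field and the text ends
// with a line ending.
const sample = '\t20\tmale\r\nBenjamin\t10\tmale\r\n';

const rows = parseRows(sample);
console.log(rows);

// For comparison: trimming the first line removes the leading tab, so the
// row ends up with two fields instead of three.
const trimmedFirst = sample.split(/\r?\n/)[0].trim().split('\t');
console.log(trimmedFirst);
```

If leading and trailing whitespace is never meaningful in your file, trim() remains the simpler choice.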
Get Off My Lawn
  • Trimming is probably preferred anyway. Unless you actually care about a line starting with whitespace. For reference, an alternative to splitting then trimming is to do the two in reverse - `str.trim().split(/\s*\n\s*/g)` - because of the `\s*` any whitespace will be removed in the beginning and ending of each line, aside from the beginning of the first and the ending of the last. – VLAZ May 12 '18 at 21:09
  • That will only trim the beginning and end of the string, so if there is whitespace in the middle it won't trim it. That is why I added it to a map – Get Off My Lawn May 12 '18 at 21:10
  • I think this answer provides a great tip about `trim()`, although it could be improved by explaining why `/\r?\n/` is correct and why `\n` is not correct. – Matt C May 12 '18 at 21:13
  • @GetOffMyLawn `"many spaces in this string".split(/\s+/g)` although now I realise `\s*` was incorrect, as it also splits on zero-length whitespace, so the result contains each letter separately. EDIT: damn, SO decided to condense the spaces in my example. You can manually put more spaces, if you wish. – VLAZ May 12 '18 at 21:30
0

An alternative is to use readline and read the file line by line, instead of reading the whole file into memory and splitting on line endings. readline handles the line endings of different operating systems for you, so you only need to split on tabs. From the Node.js documentation:

The 'line' event is emitted whenever the input stream receives an end-of-line input (\n, \r, or \r\n). This usually occurs when the user presses the <Enter>, or <Return> keys.

const readline = require('readline');
const fs = require('fs');

const output = [];

const reader = readline.createInterface({
  input: fs.createReadStream('data.txt')
});

reader.on('line', line => {
    output.push(line.split('\t'));
});

reader.on('close', () => console.log(output));
Marcos Casagrande