Why is this code using so much memory?

Question

I started to solve some codeeval tests and even if the speed score of the problems in good in general the memory one is very low.

Can you please help me? Why is this code using so much memory (6392809 MEMORY, BYTES)?

var fs  = require("fs");
fs.readFileSync(process.argv[2]).toString().split('\n').forEach(function (line) {
    if (line != "") {
       console.log( line.split(' ').map(function(item){
            return item.substr(-1) + item.substr(1, item.length-2) + item[0];
        }).join(' '));
    }
});

SWAP NUMBERS CHALLENGE DESCRIPTION:

Write a program that, given a sentence where each word has a single digit positive integer as a prefix and suffix, swaps the numbers while retaining the word in between. Words in the sentence are delimited from each other by a space.

INPUT SAMPLE:

The first argument is a path to a file. Each line of the input file contains one test case represented by a sentence. Each word in the sentence begins and ends with a single digit positive integer i.e. 0 through 9. Assume all characters are ASCII.

4Always0 5look8 4on9 7the2 4bright8 9side7 3of8 5life5
5Nobody5 7expects3 5the4 6Spanish4 9inquisition0

OUTPUT SAMPLE:

For each test case, print to standard output the sentence obtained by swapping the numbers surrounding each word, one per line.

0Always4 8look5 9on4 2the7 8bright4 7side9 8of3 5life5
5Nobody5 3expects7 4the5 4Spanish6 0inquisition9

CONSTRAINTS:

The suffix and the prefix of each word may be equal. Sentences are form 1 to 17 words long. The number of test cases is 40.

I want to try and improve the memory score, any tips are welcomed.

Maybe file is just large and you load the whole file into memory. — Andrey, Aug 05 '15 at 09:10

score 5 · Answer 1 · edited May 23 '17 at 11:43

5

I think this will read in the whole file at once: fs.readFileSync(process.argv[2]).toString() thus using up at least the size of the file in RAM.

Try an event-based approach.

edited May 23 '17 at 11:43

Community

1
1

answered Aug 05 '15 at 09:11

Ulrich Thomas Gabor

6,584
4
27
41

the file is 104 bytes - it's the INPUT SAMPLE – Jaromanda X Aug 05 '15 at 09:13
@JaromandaX it is sample but usually all those autotests run against larger data once you submit. I guess the high memory consumption happen on larger inputs not on the sample. – Andrey Aug 05 '15 at 09:14
Asynchronicity has nothing to do with this. You can as well load whole file in memory asynchronously or you can avoid loading it synchronously with readSync and smaller chunks. – Andrey Aug 05 '15 at 09:16
@Andrey Although the line between asynchronous and event-based is not so strict in Javascript. – Ulrich Thomas Gabor Aug 05 '15 at 09:18
1

Remember that if the file is UTF8 the size in RAM will be doubled, since every character consists of two bytes. – Mouser Aug 05 '15 at 09:23
@Mouser false. ASCII chars occupy 1 byte in UTF**8** (hence the 8 in the name) – Andrey Aug 05 '15 at 09:27
@Andrey Strings in memory are saved as UTF16. – Ismael Miguel Aug 05 '15 at 09:28
@IsmaelMiguel yes, but this has nothing to do with file encoding and UTF8. – Andrey Aug 05 '15 at 09:31
@Mouser yes, chars take 2 bytes in memory in JS. But file encoding is irrelevant. – Andrey Aug 05 '15 at 09:34
@Andrey I am not entirelly sure, but I think the strings are saved in memory as UTF16 if they are loaded in Unicode (UTF8, UTF16, UTF32). Using ASCII, ANSI, ISO-8859-1(5) or windows-1252 or similar will still save as 1 byte. But I am not entirelly sure of this. I may be wrong. I know that strings are stored in UTF16 in some situations (or all). – Ismael Miguel Aug 05 '15 at 09:34
1

@Andrey From experience I know that UTF8 saved XML files will be twice the size in memory. ANSI files I don't know. – Mouser Aug 05 '15 at 09:36
@IsmaelMiguel you should check out this http://www.joelonsoftware.com/articles/Unicode.html. In JS strings are stored in UTF-16 encoding. Where and how they come from doesn't matter. – Andrey Aug 05 '15 at 09:38
@Mouser you should really check out what is UTF8, how it works and how strings are encoded in JS (UTF-16). If you read ascii file in JS into memory it will use exactly twice memory. If you read UTF-8 file results may vary in both directions depending on content. XML has absolutely nothing to do with it. – Andrey Aug 05 '15 at 09:41
@Andrey That only explains UTF8, nothing else. – Ismael Miguel Aug 05 '15 at 09:45
1

@IsmaelMiguel everything is very simple. JS uses UTF-16 internally which almost always is 1 char = 2 bytes. For ASCII, latin1, and w1252 it is always 1 byte per char (so UTF16 doubles). UTF-8 has varying byte length per char. For chars 0-128 the ratio will be also 1-2. Later it varries. € sign takes 3 bytes in UTF-8 and 2 in UTF-16. – Andrey Aug 05 '15 at 10:03
1

Thanks for your answers guys. I will test this and get back to you with the results :) – biancamihai Aug 05 '15 at 10:35
@Andrey I was saying that the link only says what UTF8 is, and has nothing to do with Javascript. But that information is still important. – Ismael Miguel Aug 05 '15 at 11:21

score 3 · Accepted Answer · answered Aug 05 '15 at 09:27

I would bet that your problem is that you load whole file into memory. Other suggest that you read line by line in some form, but this may not help. The testers can feed gigantic file with just 1 line and you will still end up using a lot of memory. Your best approach is to read file in chunks. Use fs.readSync (no real need for async stuff here). You should process words so you should read chunk, swap numbers for words in it and go on. Of course it is possible that they put 1 gigantic word in file but this is some next level of complexity.

Thanks for your answer. I will test this and get back to you with the results. — biancamihai, Aug 05 '15 at 10:34

score 2 · Answer 3 · answered Aug 05 '15 at 09:21

Instead of all that code, I suggest to use Regular Expressions.

Doesn't matter if you have a small or huge file.

Simply remove all the code and use this:

.replace(/(\d)([^\d]+)(\d)/g,'$3$2$1');

Example:

alert("4Always0 5look8 4on9 7the2 4bright8 9side7 3of8 5life5\n\
5Nobody5 7expects3 5the4 6Spanish4 9inquisition0".replace(/(\d)([^\d]+)(\d)/g,'$3$2$1'));

That simple!

Your final code:

var fs  = require("fs");
var text = fs
    .readFileSync(process.argv[2])
    .toString() //is this really needed?
    .replace(/(\d)([^\d]+)(\d)/g,'$3$2$1');

This will reduce the memory usage, since you only touch on the string. Notice that this may load the whole file into memory, which may still show high usage.

Why is this code using so much memory?

3 Answers3