93

I have some javascript code which looks like this:

var myClass = {
  ids: {},
  myFunc: function(huge_string) {
    var id = huge_string.substr(0, 2);
    this.ids[id] = true;
  }
};

Later the function gets called with some large strings (100 MB+). I only want to save a short id which I find in each string. However, Google Chrome's substring function (actually a regex in my code) only returns a "sliced string" object, which references the original. So after a series of calls to myFunc, my Chrome tab runs out of memory because the temporary huge_string objects cannot be garbage collected.

How can I make a copy of the string id so that a reference to the huge_string is not maintained, and the huge_string can be garbage collected?

AffluentOwl
  • `"" + slice` does not seem to work, nor does `"" + slice + ""`. Trying other approaches. – AffluentOwl Jul 29 '15 at 23:34
  • 3
    *"substring function (actually regex in my code) only returns a "sliced string" object, which references the original"* - Huh? `.substr()`, `.substring()`, `.slice()`, and the relevant regex functions all return a *new* string. Is the other code that calls `myClass.myFunc()` keeping a reference to your huge string? If your real code is more complex, is it accidentally keeping the huge strings around in closures? – nnnnnn Jul 29 '15 at 23:38
  • 3
    @nnnnnn It is impossible to tell if it is "new" string *data* from JavaScript; implementations *can* share the underlying data without violating any part of ECMAScript. Firefox has half a dozen [different string implementations](https://blog.mozilla.org/ejpbruel/2012/02/06/how-strings-are-implemented-in-spidermonkey-2/) (see JSDependentString in particular) and I'm not surprised if Chrome has similar optimizations (which may be acting undesirably in some edge cases). That being said .. I would not be terribly surprised if it's a red herring. – user2864740 Jul 29 '15 at 23:48
  • @AffluentOwl What about `slice.reverse().reverse()`? If that also fails to resolve the behavior then I'm more likely to side with nnnnnn on the cause being something else. – user2864740 Jul 29 '15 at 23:48
  • 2
    Reference for readers: http://stackoverflow.com/questions/20536662/is-javascript-substring-virtual – user2864740 Jul 29 '15 at 23:57
  • 4
    This [bug report #2869](https://code.google.com/p/v8/issues/detail?id=2869) contains a work-about: `(' ' + src).slice(1)`. There is no official resolution. – user2864740 Jul 29 '15 at 23:59
  • @user2864740 that workaround works, thanks. If you write that as an answer, I'll mark it as the answer. – AffluentOwl Jul 30 '15 at 00:32
  • @AffluentOwl The double-reverse doesn't work? In any case, this would be a good place for a self-answer, with the problem and solution, IMOHO. – user2864740 Jul 30 '15 at 04:53
  • There is no reverse() function on strings in javascript. The string would have to be split into an array first. Also, I feel the concatenate with a single character method is quite efficient. http://stackoverflow.com/questions/958908/ – AffluentOwl Jul 30 '15 at 20:15
  • 1
    I ran into this while converting a script to "use strict;" where we were writing to a now read-only string literal and getting a "Cannot assign to read only property '0' of string". – Alex Dixon Aug 04 '16 at 22:42

11 Answers

91

ECMAScript implementations vary from browser to browser. In Chrome, many string operations (substr, slice, regex, etc.) simply retain references to the original string rather than making copies of it. This is a known issue in Chrome (Bug #2869). To force a copy of the string, the following code works:

var string_copy = (' ' + original_string).slice(1);

This code works by prepending a space to the string. In Chrome's implementation, this concatenation results in a copy of the string. The substring after the space is then taken with slice(1).
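
One way to package the workaround is a small helper, sketched below. The names forceCopy and rememberId are hypothetical (they do not appear in the original code), and whether the result really detaches from the parent string depends on engine internals, so it is worth confirming with a heap snapshot.

```javascript
// Hypothetical helper: forceCopy is an illustrative name, not a built-in.
// Prepending a character and slicing it off again is the workaround from
// the V8 bug report; it relies on engine behavior, not on the spec.
function forceCopy(str) {
  return (' ' + str).slice(1);
}

var ids = {};

function rememberId(hugeString) {
  // Copy the short id so that only the id itself is stored.
  var id = forceCopy(hugeString.substr(0, 2));
  ids[id] = true;
}

// Stand-in for a large input: 'ab' followed by 1,000 filler characters.
rememberId('ab' + new Array(1001).join('x'));
console.log(Object.keys(ids));
```

The helper keeps the trick in one place, so it can be swapped out if a future engine version changes how concatenation is optimized.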

The problem and the solution are reproduced here: http://jsfiddle.net/ouvv4kbs/1/

WARNING: takes a long time to load, open Chrome debug console to see a progress printout.

// We would expect this program to use ~1 MB of memory, however taking
// a Heap Snapshot will show that this program uses ~100 MB of memory.
// If the processed data size is increased to ~1 GB, the Chrome tab
// will crash due to running out of memory.

function randomString(length) {
  var alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  var result = '';
  for (var i = 0; i < length; i++) {
    // Math.floor gives every letter the same probability of being picked.
    result += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return result;
}

var substrings = [];
var extractSubstring = function(huge_string) {
  var substring = huge_string.substr(0, 100 * 1000 /* 100 KB */);
  // Uncommenting this line will force a copy of the string and allow
  // the unused memory to be garbage collected
  // substring = (' ' + substring).slice(1);
  substrings.push(substring);
};

// Process 100 MB of data, but only keep 1 MB.
for (var i = 0; i < 10; i++) {
  console.log(10 * (i + 1) + 'MB processed');
  var huge_string = randomString(10 * 1000 * 1000 /* 10 MB */);
  extractSubstring(huge_string);
}

// Do something which will keep a reference to substrings around and
// prevent it from being garbage collected.
setInterval(function() {
  var i = Math.round(Math.random() * (substrings.length - 1));
  document.body.innerHTML = substrings[i].substr(0, 10);
}, 2000);

AffluentOwl
  • var string_copy = original_string.slice(0); – Wesley Stam Nov 02 '17 at 20:00
  • @WesleyStam I think the reason why AffluentOwl's post works is that he is prepending a character to the string, which causes the string to be copied, as the slice operator doesn't actually copy the string like it should. – Joran Dox Jan 31 '18 at 12:34
  • Thanks for this - var string_copy = (' ' + original_string).slice(1); I'm copying the text from an html editor and writing it next to it, then auto-saving it in a loop. I was wondering why copying the text then changing the copy changed the original - then it occurred to me - it's a reference! – rodmclaughlin Jan 24 '21 at 07:45
  • 1
    I tried to take your code and produce a JS Benchmark with it to compare the various operations suggested here. https://jsben.ch/aYDBc It seems like this is probably the best solution. Based on the benchmarks, it seems unlikely that other solutions proposed here (interpolation/repeat/etc) actually copy across all browsers. – ProdigySim Nov 21 '21 at 17:25
  • 1
    Nice benchmark, but in my testing, the interpolation and repeat(1) approaches do not actually free up the retained memory. – AffluentOwl Dec 06 '21 at 05:10
46

Not sure how to test this, but does using string interpolation to create a new string variable work?

newString = `${oldString}`
Pirijan
  • @Qwertiy Why do you say this doesn't work? Seems to work for me. After running the command from above, I changed `oldString` and it did not change newString. In addition, `typeof` returned primitive string type for both. – 425nesp Jun 13 '20 at 00:35
  • 2
    This absolutely works and is extremely performant. Tested on a 4K length string, the average performance was about 0.004 milliseconds. Quite a few times, it was about 0.001 ms to execute. Here is the test I ran in the console: `!function () { const outputArr = []; const chars = 'ABC'; while(outputArr.length < 4000) { outputArr.push( chars[Math.floor(Math.random() * chars.length)])} const output = outputArr.join(''); console.time('interpolation'); const newVariable = \`${output}\`; console.timeEnd('interpolation'); }(); ` – Marcus Parsons Sep 08 '21 at 23:14
  • 1
    From what I see, this is not working in Chrome, in that it is not actually freeing up the retained memory. https://imgur.com/a/xAg8ORK – AffluentOwl Dec 06 '21 at 05:03
  • Requires extra check for null/undefined – Alex Povolotsky Jun 06 '22 at 09:51
  • I did some testing of string interpolation in Node 19, using `process.memoryUsage().heapUsed` to monitor whether the string was copied or reused the original storage. There is no difference between using slice, substring, or string interpolation. Using `(" " + str).slice(1)` works. – some Oct 27 '22 at 01:50
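
The kind of measurement described in the last comment can be sketched roughly as follows. This assumes Node.js (it uses process.memoryUsage) and works best when started with `node --expose-gc` so that a collection can be forced; retainedAfter is a hypothetical helper name, and heap numbers are noisy, so treat the output as an indication only.

```javascript
// Assumptions: Node.js, ideally run as `node --expose-gc script.js`.
// retainedAfter is a hypothetical helper, not part of any library.
function retainedAfter(copyFn, rounds, size) {
  var kept = [];
  if (typeof globalThis.gc === 'function') globalThis.gc();
  var before = process.memoryUsage().heapUsed;
  for (var i = 0; i < rounds; i++) {
    // Build a fresh large string, then keep only a 1,000-character piece.
    var huge = new Array(size + 1).join(String.fromCharCode(65 + i));
    kept.push(copyFn(huge.substr(0, 1000)));
  }
  huge = null; // drop our own reference to the last large string
  if (typeof globalThis.gc === 'function') globalThis.gc();
  var after = process.memoryUsage().heapUsed;
  // kept is returned so the short strings stay alive during measurement.
  return { kept: kept, bytes: after - before };
}

var plain = retainedAfter(function (s) { return s; }, 5, 2 * 1000 * 1000);
var forced = retainedAfter(function (s) {
  return (' ' + s).slice(1);
}, 5, 2 * 1000 * 1000);

console.log('heap growth without copy:', plain.bytes);
console.log('heap growth with forced copy:', forced.bytes);
```

If the engine keeps the parent strings alive, the first number should be much larger than the second; if both are small, the substrings were not sharing storage in the first place.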
16

I use the Object.assign() method for strings, objects, arrays, etc.:

const newStr = Object.assign("", myStr);
const newObj = Object.assign({}, myObj);
const newArr = Object.assign([], myArr);

Note that Object.assign only copies an object's own keys and their property values (one level only). For deep cloning a nested object, refer to the following example:

let obj100 = { a:0, b:{ c:0 } };
let obj200 = JSON.parse(JSON.stringify(obj100));
obj100.a = 99; obj100.b.c = 99; // No effect on obj200
alexandroid
Daniel C. Deng
  • 1
    It doesn't look like desired result: https://i.stack.imgur.com/1hsxF.png – Qwertiy Jan 31 '20 at 15:22
  • When I do `Object.assign("", "abc");`, I get an empty String object. – 425nesp Jun 13 '20 at 00:41
  • 3
    `const newStr = Object.assign("", myStr); console.log(newStr);` This will print an Array: ```[String: ''] {'0': 'H','1': 'e',...}]```. Unfortunately doesn't work for string copy. – George Mylonas Jul 01 '20 at 12:50
  • 1
    I like the look of this solution the best, but unfortunately it didn't work for me in Chrome, and I resorted to the 'hackier' looking string copy & slice solution – arhnee Jan 02 '21 at 07:46
  • same here, it didn't work for me. It must be outdated now. – Felipe Centeno Jan 19 '23 at 23:32
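
The behavior reported in the comments above can be checked with a short snippet. This is only an illustration of what Object.assign returns when the target is a string primitive; the variable names are arbitrary.

```javascript
var myStr = 'abc';
var result = Object.assign('', myStr);

// The '' target is converted to a String wrapper object, and the
// characters of myStr are copied onto it as indexed properties.
console.log(typeof result);   // object
console.log(String(result));  // prints an empty line
console.log(result[0]);       // a
```

So the result is a String object wrapping the empty string, not a primitive copy of myStr.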
16

You can use:

var string_copy = original_string.repeat(1);

It seems to work well. Refer to the MDN documentation on repeat.

Bhargav Rao
nuttybrewer
  • 2
    `var a = "hi"; var b = a.repeat(1);` works for me. I tried changing `a` and `b` stayed the same. – 425nesp Jun 13 '20 at 00:37
  • Easiest solution – binarygiant Sep 18 '20 at 18:40
  • In my testing, Chrome doesn't actually make a copy on `repeat` currently. – Patrick Linskey Dec 29 '20 at 15:25
  • From what I see, like interpolation, this is not working in Chrome, in that it is not actually freeing up the retained memory. https://imgur.com/a/uitL8Dv – AffluentOwl Dec 06 '21 at 05:08
  • In Node 19, using `a.repeat(1)` does not copy the string. – some Oct 27 '22 at 01:55
  • @425nesp That's not what it's about. In Javascript, strings are immutable. The V8 engine used in Chrome and Node uses this to save only one copy of the string in memory. If you have a large string, and create new strings from it with slice/substring and so on, the engine just creates a reference to the location in memory. Even if you create a million copies of the string, the contents are not copied, only pointers to where in memory the string is located. This is the problem: if you load a large string and only want to keep a few characters, then the now unused memory will not be freed. – some Oct 27 '22 at 02:08
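
A small sketch of the point made in the last comment: because strings are immutable, "changing `a` and seeing that `b` stayed the same" succeeds even with a plain assignment, so it cannot show whether the characters were copied. tryToMutate is a hypothetical helper name.

```javascript
// tryToMutate is a hypothetical helper name used for illustration.
function tryToMutate(str) {
  'use strict';
  try {
    str[0] = 'X'; // strings are immutable: this throws in strict mode
    return 'no error';
  } catch (e) {
    return e.name;
  }
}

var a = 'hi';
var b = a;      // plain assignment, no copying method involved
a = 'changed';  // rebinds the variable a; the string 'hi' is untouched

console.log(b);                  // hi
console.log(tryToMutate('abc')); // TypeError
```

Whether two strings share storage is only visible through memory tooling such as heap snapshots, not through the values themselves.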
7

Edit: These tests were run in Google Chrome back in September 2021 and not in NodeJS.

It's interesting to see some of the responses here. If you're not worried about legacy browser support (IE6+), skip on down to the interpolation method because it is extremely performant.

One of the most backwards compatible (back to IE6), and still very performant ways to duplicate a string by value is to split it into a new array and immediately rejoin that new array as a string:

let str = 'abc';
let copiedStr = str.split('').join('');
console.log('copiedStr', copiedStr);

Behind the scenes

The code above splits the string using the empty string as the separator, which places each individual character into its own element of a newly created array. This means that, for a brief moment, the intermediate value looks like this:

['a', 'b', 'c']

Then, immediately, that array is rejoined using the empty string as the separator between elements, which means that each element of the newly created array is pushed into a brand-new string, effectively copying the string.

At the end of the execution, copiedStr is its own variable, which outputs to the console:

abc

Performance

On average, this takes around 0.007 ms - 0.01 ms on my machine, but your mileage may vary. Tested on a string with 4,000 characters, this method took a maximum of 0.2 ms and an average of about 0.14 ms to copy the string, so it still performs well.

Who cares about Legacy support anyways?/Interpolation Method

If you're not worried about legacy browser support, however, the interpolation method offered in one of the answers here, by Pirijan, is a very performant and easy way to copy a string:

let str = 'abc';
let copiedStr = `${str}`;

Testing the performance of interpolation on the same 4,000 character length string, I saw an average of 0.004 ms, with a max of 0.1 ms and a min of an astonishing 0.001 ms (quite frequently).
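
For anyone who wants to repeat timings like these, a rough harness might look like the sketch below. timeCopy and now are hypothetical helpers, results vary by machine and engine, and a fast time says nothing about whether the engine actually copied the characters or released the original memory.

```javascript
// Hypothetical helpers; fall back to Date.now where performance is missing.
function now() {
  return (typeof performance !== 'undefined') ? performance.now() : Date.now();
}

function timeCopy(label, copyFn, input, runs) {
  var start = now();
  var out;
  for (var i = 0; i < runs; i++) {
    out = copyFn(input);
  }
  var avg = (now() - start) / runs;
  console.log(label + ': ' + avg.toFixed(4) + ' ms per call');
  return out;
}

var input = new Array(4001).join('A'); // 4,000-character test string

var viaSplitJoin = timeCopy('split/join', function (s) {
  return s.split('').join('');
}, input, 1000);
var viaInterpolation = timeCopy('interpolation', function (s) {
  return `${s}`;
}, input, 1000);
var viaSlice = timeCopy('concat + slice', function (s) {
  return (' ' + s).slice(1);
}, input, 1000);
```

Averaging over many runs smooths out timer resolution, which matters when a single call takes only microseconds.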

Marcus Parsons
  • Is there any reason to believe this is more performant than the .slice(1) approach in the marked answer to this question? Or are you just advocating for this approach because you like the syntactic sugar of it? – AffluentOwl Sep 20 '21 at 21:32
  • The split join method is a tad slower than the .slice(1) method by about 0.05 ms. I never said it was more performant than that method; I just gave another method and gave performance tests for it. But interpolation has them both beat, anyways =] – Marcus Parsons Sep 21 '21 at 00:21
  • I tested this with Node 19. Using string interpolation is optimized away, and doesn't help. Using `(" " + str).slice(1)` does work. split/join also works, but is 90% slower. (tested with 20 MB csv data which resulted in many substrings) – some Oct 27 '22 at 02:34
  • Very interesting! Thank you for that, some. My tests were not run in Node. They were run in the browser. I'll edit my answer to mention that. Thanks again! – Marcus Parsons Nov 11 '22 at 06:15
3

I was getting an issue when pushing into an array. Every entry would end up as the same string because it was referencing a value on an object that changed as I iterated over results via a .next() function. Here is what allowed me to copy the string and get unique values in my array results:

while (results.next()) {
  var locationName = String(results.name);
  myArray.push(locationName);
}
Kyle s
2

Using String.prototype.slice():

const str = 'The quick brown fox jumps over the lazy dog.';

// creates a new string without modifying the original string
const new_str = str.slice();

console.log( new_str );
Muhammad Adeel
1

In my opinion this is the cleanest and the most self-documenting solution:

const strClone = String(strOrigin);
JJ Pell
0

I typically use strCopy = new String(originalStr); is this not recommended for some reason?

Aragorn
  • Try running `typeof` on that. It gives you an instance of type String rather than the String primitive, which provides more functionality. That being said, running it as a function like `strCopy = String(originalStr);` might work. Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/String – Levi Muniz Jun 07 '20 at 20:17
  • Also, just tested it, try doing `strCopy = String(originalStr);` then modify the original string by doing `strCopy[0] = "X"`. Both copies will be modified. – Levi Muniz Jun 07 '20 at 20:24
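
The difference between the two call forms mentioned in the first comment can be seen with typeof and strict equality. This is just an illustrative check with arbitrary variable names.

```javascript
var originalStr = 'abc';
var viaNew = new String(originalStr);  // String wrapper object
var viaCall = String(originalStr);     // string primitive

console.log(typeof viaNew);            // object
console.log(typeof viaCall);           // string
console.log(viaNew === originalStr);   // false
console.log(viaCall === originalStr);  // true
```

Because the wrapper is an object, it fails strict comparisons against primitives, which is one common reason to avoid `new String(...)`.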
0

I would use string interpolation and check if undefined or empty.

`${huge_string || ''}`

Keep in mind that with this solution, you will have the following result.

'' => ''
undefined => ''
null => ''
'test' => 'test'
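
The mapping above can be reproduced with a short sketch. toStringOrEmpty is a hypothetical helper name; note that the `||` fallback also turns other falsy values such as 0 and false into an empty string.

```javascript
// Hypothetical helper wrapping the interpolation-with-fallback pattern.
function toStringOrEmpty(value) {
  return `${value || ''}`;
}

var results = ['', undefined, null, 'test'].map(toStringOrEmpty);
console.log(results);            // [ '', '', '', 'test' ]

// Without the fallback, undefined becomes the text "undefined".
console.log(`${undefined}`);     // undefined
console.log(toStringOrEmpty(0)); // prints an empty line
```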
M07
-1

I have run into this problem and this was how I coped with it:

let copy_string = [];
copy_string.splice(0, 0, str);

I believe this would deep copy str to copy_string.

Tamás Sengel
  • While technically the `str` variable would be pushed into `copy_string`, `copy_string` is an array so you'd have to finish this with something like this: `const copiedVariable = copy_string.join('')` to pull the array together back into a string. – Marcus Parsons Sep 08 '21 at 23:53
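
Following the comment above, a minimal sketch of the splice approach with the join step added. It only shows the resulting types and values; whether the joined string gets its own storage is up to the engine and is not demonstrated here.

```javascript
var str = 'abc';
var copy_string = [];
copy_string.splice(0, 0, str);  // inserts str as the single array element

console.log(Array.isArray(copy_string)); // true

// Join the array to get back a string, as suggested in the comment.
var copiedVariable = copy_string.join('');
console.log(typeof copiedVariable);      // string
console.log(copiedVariable === str);     // true
```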