
I'm embedding a large array in <script> tags in my HTML, like this (nothing surprising):

<script>
    var largeArray = [/* lots of stuff in here */];
</script>

In this particular example, the array has 210,000 elements. That's well below the theoretical maximum of 2³¹ - by 4 orders of magnitude. Here's the fun part: if I save the JS source for the array to a file, that file is over 44 megabytes (46,573,399 bytes, to be exact).

If you want to see for yourself, you can download it from GitHub. (All the data in there is canned, so much of it is repeated. This will not be the case in production.)

Now, I'm really not concerned about serving that much data. My server gzips its responses, so it really doesn't take all that long to get the data over the wire. However, there is a really nasty tendency for the page, once loaded, to crash the browser. I'm not testing at all in IE (this is an internal tool). My primary targets are Chrome 8 and Firefox 3.6.

In Firefox, I can see a reasonably useful error in the console:

Error: script stack space quota is exhausted

In Chrome, I simply get the sad-tab page:


Cut to the chase, already

  • Is this really too much data for our modern, "high-performance" browsers to handle?
  • Is there anything I can do* to gracefully handle this much data?

Incidentally, I was able to get this to work (read: not crash the tab) on-and-off in Chrome. I really thought that Chrome, at least, was made of tougher stuff, but apparently I was wrong...


Edit 1

@Crayon: I wasn't looking to justify why I'd like to dump this much data into the browser at once. Short version: either I solve this one (admittedly not-that-easy) problem, or I have to solve a whole slew of other problems. I'm opting for the simpler approach for now.

@various: right now, I'm not especially looking for ways to actually reduce the number of elements in the array. I know I could implement Ajax paging or what-have-you, but that introduces its own set of problems for me in other regards.

@Phrogz: each element looks something like this:

{dateTime:new Date(1296176400000),
 terminalId:'terminal999',
 'General___BuildVersion':'10.05a_V110119_Beta',
 'SSM___ExtId':26680,
 'MD_CDMA_NETLOADER_NO_BCAST___Valid':'false',
 'MD_CDMA_NETLOADER_NO_BCAST___PngAttempt':0}

@Will: but I have a computer with a 4-core processor, 6 gigabytes of RAM, over half a terabyte of disk space ...and I'm not even asking for the browser to do this quickly - I'm just asking for it to work at all!


Edit 2

Mission accomplished!

With the spot-on suggestions from Juan as well as Guffa, I was able to get this to work! It would appear that the problem was just in parsing the source code, not in actually working with the data in memory.

To summarize the comment quagmire on Juan's answer: I had to split up my big array into a series of smaller ones, and then Array#concat() them, but that wasn't enough. I also had to put them into separate var statements. Like this:

var arr0 = [...];
var arr1 = [...];
var arr2 = [...];
/* ... */
var bigArray = arr0.concat(arr1, arr2, ...);

To everyone who contributed to solving this: thank you. The first round is on me!


*other than the obvious: sending less data to the browser

Matt Ball
  • IMO the first step to solving the problem is giving good reason why you really need to do something like this in the first place... just sayin'... – CrayonViolent Jan 28 '11 at 22:05
  • What are these 210,000 'elements'? Integers? Multi-dimensional arrays? Objects with many named properties representing the results of a database query? – Phrogz Jan 28 '11 at 22:10
  • @MattBall download the chromium source. Compile it, run it locally and turn the debugger on. – Raynos Jan 28 '11 at 22:10
  • The obvious answer to your first question is "yes; if you are crashing the browser, it is too much data". Perhaps you mean to ask the overly-subjective question "_should_ this be too much data"? – Phrogz Jan 28 '11 at 22:12
  • 44 megs of file. In a browser. Yeah, not so much. –  Jan 28 '11 at 22:13
  • @Crayon @Phrogz @Will and others - see my edits. – Matt Ball Jan 28 '11 at 22:19
  • @MattBall Thanks (my connection prevented me from seeing your data). Your limitations are a) you must load all the data, and b) you must load it all with the page. Given these limitations, I would _try_ `var all = [ [...first 100...], [...second 100...], ... ];` and see if that loads. If that works, see if you can merge them with `concat()`. If that doesn't work... quit whining about how you think the browser _ought_ to be able to handle this and 1. file bugs with the browsers and 2. change your approach. – Phrogz Jan 28 '11 at 22:26
  • @Phrogz: "quit whining" - that's good advice right there :) – Matt Ball Jan 28 '11 at 22:31
  • @MattBall I'm glad you took it in the spirit it was intended. I forgot the smiley at the end. :) – Phrogz Jan 28 '11 at 22:36
  • @Matt Ball: First round is on you, huh? I'll remember that when I go visit my family in Boston. – Ruan Mendes Mar 10 '11 at 20:21
  • Watch the ReactJS conf video; they handle gigs of data in the browser. 44MB is like nothing in comparison: https://www.youtube.com/watch?v=2ii1lEkIv1s&index=15&list=PLb0IAmt7-GS1cbw4qonlQztYV1TAW0sCr – Lukas Liesis Jan 17 '16 at 20:49
  • @LukasLiesis thanks, I will. Do note this question is 5 years old at this point – a very different time for browsers. React didn't even exist at that time. – Matt Ball Jan 19 '16 at 15:53
  • @MattBall I know, but it's still useful, and the answer changes over time :) – Lukas Liesis Jan 20 '16 at 22:29

6 Answers


Here's what I would try: you said it's a 44MB file. That surely takes more than 44MB of memory once parsed; I'm guessing it could take well over 44MB of RAM, maybe half a gig. Could you cut down the data until the browser doesn't crash, and see how much memory the browser uses?

Even apps that run only on the server would be well served not to read a 44MB file and keep it in memory. Having said all that, I believe the browser should be able to handle it, so let me run some tests.

(Using Windows 7, 4GB of memory)

First Test: I cut the array in half and there were no problems; it uses 80MB, no crash.

Second Test: I split the array into two separate arrays that together still contain all the data; uses 160MB, no crash.

Third Test: Since Firefox said it ran out of stack, the problem is probably that it can't parse the array all at once. I created two separate arrays, arr1 and arr2, then did arr3 = arr1.concat(arr2); it ran fine and uses only slightly more memory, around 165MB.

Fourth Test: I created 7 of those arrays (22MB each) and concatenated them to test browser limits. It takes about 10 seconds for the page to finish loading. Memory goes up to 1.3GB, then drops back down to 500MB. So yes, Chrome can handle it; it just can't parse it all at once, because the parser apparently uses some kind of recursion, as the console's error message suggests.

Answer: Create separate arrays (less than 20MB each) and then concat them. Each array should be in its own var statement, instead of doing multiple declarations with a single var.

I would still consider fetching only the necessary part, since keeping all of it may make the browser sluggish. However, since it's an internal tool, this should be fine.

Last point: you're not hitting maximum memory levels, just maximum parsing levels.

Ruan Mendes
  • This looks quite promising. Must try ASAP. – Matt Ball Jan 31 '11 at 15:20
  • I'm about to start playing with this. Before I head down the wrong track: how exactly were you declaring and concatenating the arrays? **(1)** `var arr1 = [...], arr2 = [...], arr3 = [...], bigArr = arr1.concat(arr2).concat(arr3);` or **(2)** `bigArr = arr1.concat(arr2, arr3);` or **(3)** `var bigArr = [...].concat([...], [...]);`? Or do I need separate `var` statements for each sub-array, or does none of this matter? – Matt Ball Jan 31 '11 at 20:24
  • Here's what I would try: `var arr7 = arr0.concat(arr1,arr2,arr3,arr4,arr5,arr6)`. I don't think separate var statements will make any difference. – Ruan Mendes Jan 31 '11 at 21:11
  • Test 1: splitting the array into 4 subarrays (2^15 elts each) and using `bigArr = arr0.concat(arr1).concat(arr2).concat(arr3)` still made Chrome barf. Let me try what you just suggested... – Matt Ball Jan 31 '11 at 21:39
  • I would pass all the arrays into concat() instead of calling concat multiple times. I tried your data, splitting it into two arrays worked. – Ruan Mendes Jan 31 '11 at 21:56
  • Awesome, awesome, awesome. I can't believe this actually worked, thank you. The final fix was to declare each sub-array its own `var` statement, [as Guffa suggested](http://stackoverflow.com/questions/4833480/is-this-asking-too-much-of-a-browser/4833631#4833631) (see his comment). – Matt Ball Jan 31 '11 at 22:55
  • use a web worker for heavy tasks and you will have no UI lag. – Lukas Liesis Jun 01 '16 at 13:52

Yes, it's too much to ask of a browser.

That amount of data would be manageable if it were already data, but it isn't data yet. Consider that the browser has to parse that huge block of source code while checking that the syntax all adds up. Once parsed into valid code, the code has to run to produce the actual array.

So all of the data will exist in (at least) two or three versions at once, each with a certain amount of overhead. As the array literal is a single statement, each step will have to handle all of the data at once.

Dividing the data into several smaller arrays would possibly make it easier on the browser.

Guffa
  • With your last sentence, do you literally mean doing something like `var chunk1 = [/* first 21k elts */], chunk2 = [/* next 21k elts */], ... chunk10 = [/* last 21k elts*/];`? – Matt Ball Jan 28 '11 at 22:28
  • @Matt Ball: Yes. I would even put them in separate statements, i.e. `var chunk1 = [...]; var chunk2 = [...]; ...`. – Guffa Jan 28 '11 at 22:59
  • Yes, break them up, but you can concat them together and the browser is OK. The problem is trying to parse it all at once; it runs out of stack. See my answer – Ruan Mendes Jan 28 '11 at 23:00
  • Which means it's not too much to ask, if you ask politely :) – Ruan Mendes Jan 28 '11 at 23:46

Do you really need all the data? Can't you stream just the data currently needed using Ajax? Similar to Google Maps: you can't fit all the map data into the browser's memory, so they display just the part you are currently seeing.

Remember that 40 megs of hard data can be inflated to much more in the browser's internal representation. For example, the JS interpreter may use a hashtable to implement the array, which would add additional memory overhead. Also, I expect that the browser stores both the source code and the resulting JS objects; that alone doubles the amount of data.

JS is designed to provide client-side UI interaction, not handle loads of data.

EDIT:

Btw, do you really think users will like downloading 40 megabytes worth of code? There are still many users with less than broadband internet access. And execution of the script will be suspended until all the data is downloaded.

EDIT2:

I had a look at the data. That array will definitely be represented as a hashtable. Also, many of the items are objects, which will require reference tracking... that is additional memory.

I guess the performance would be better if it were a simple vector of primitive data.

EDIT3: The data could certainly be simplified. The bulk of it is repeated strings, which could be encoded in some way, as integers or something. Also, my Opera is having trouble just displaying the text, let alone interpreting it.

EDIT4: Forget the DateTime objects! Use Unix epoch timestamps or strings, but not objects!

EDIT5: Your processor doesn't matter, because JS is single-threaded. And your RAM doesn't matter either; most browsers are 32-bit, so they can't use much of that memory.

EDIT6: Try changing the object keys to sequential integer indices (0, 1, 2, 3...). That might make the browser use a more efficient array data structure. You can define constants to access the items by index efficiently. This would cut down the data size by a huge chunk.
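
For example, combining EDIT4 and EDIT6, a record might be flattened like this (just a sketch; the constant names are invented):

// Each record becomes a plain array of primitives: no property names
// to store and no Date objects to construct at parse time.
var DATE_TIME = 0, TERMINAL_ID = 1, BUILD_VERSION = 2, EXT_ID = 3;

var row = [1296176400000, 'terminal999', '10.05a_V110119_Beta', 26680];

// Construct a Date only when one is actually needed:
var when = new Date(row[DATE_TIME]);
var id = row[TERMINAL_ID];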

Matěj Zábský
  • +1 for Edit4; bring the data over in as compressed a format as you can, and then client-side expand it into objects. – Phrogz Jan 28 '11 at 22:27
  • ...And don't expand it all at once, just a few records at a time. Though I'm not sure how good browsers' memory management is, so that might not help. – Matěj Zábský Jan 28 '11 at 22:30
  • Playing around with the Dates - changing to timestamps (milliseconds since the epoch) - didn't help. Maybe it alleviates a small part of the problem, though, and would be useful in concert with other tweaks. – Matt Ball Jan 28 '11 at 22:41
  • Even with broadband, I don't want to load and keep about 40MB (raw!) of junk :P I wonder what the *real* memory usage is. – Jan 28 '11 at 22:43
  • @Matt: I wonder if the fact that eliminating Date objects didn't make much difference is evidence that @Guffa's answer may have merit. – user113716 Jan 28 '11 at 22:44
  • @Matt: So would I, but TextMate has spent the last 10 minutes trying to open your file that I downloaded. Wait... I see a scrollbar! Maybe it's close. – user113716 Jan 28 '11 at 22:48
  • @patrick: yeah, I tried opening it in Notepad++. Bad idea. But guess what - [there's an SO question for that](http://stackoverflow.com/questions/159521/)! 010 Editor worked for me. – Matt Ball Jan 28 '11 at 22:50

Try retrieving the data with Ajax as a JSON page. I don't know the exact size limits, but I've been able to pull large amounts of data into Google Chrome that way.
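
A minimal sketch of what that might look like (the URL is made up for illustration; JSON.parse is native in both Chrome 8 and Firefox 3.6):

// Load the same data as a separate JSON document instead of an inline
// script block, so it goes through the native JSON parser rather than
// the JS source parser.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/data/large-array.json', true); // hypothetical endpoint
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
        var largeArray = JSON.parse(xhr.responseText);
        // JSON has no Date type, so dateTime would have to travel as a
        // timestamp and be revived with new Date(...) on demand.
    }
};
xhr.send();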

Bitsplitter
  • Like I said, though maybe not clearly enough, I'm not looking for suggestions about how to dump less data into the page at once. – Matt Ball Jan 28 '11 at 22:07
  • I'm not suggesting to load less data, I'm suggesting to load it in a different way: not as part of your script but as a separate JSON document. It depends on how the browser implements JSON parsing, but Chrome at least does that without 'eval'-ing script; it parses JSON more efficiently. – Bitsplitter Jan 28 '11 at 22:14
  • So basically, load the same exact amount of information, but using Ajax rather than with the rest of the page? This may be worth a shot. – Matt Ball Jan 28 '11 at 23:26

Use lazy loading. Have pointers to the data and get it when the user asks.

This technique is used in various places to manage millions of records of data.
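
A rough sketch of the idea (the page size and endpoint are invented for illustration):

// Keep only pages of records in memory; prefetch the next page in the
// background so the user rarely sees a loading delay.
var PAGE_SIZE = 5000;
var pages = {};

function fetchPage(n, callback) {
    if (pages[n]) { callback(pages[n]); return; }
    var xhr = new XMLHttpRequest();
    xhr.open('GET', '/data/page/' + n, true); // hypothetical endpoint
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
            pages[n] = JSON.parse(xhr.responseText);
            callback(pages[n]);
        }
    };
    xhr.send();
}

function getRecord(i, callback) {
    var page = Math.floor(i / PAGE_SIZE);
    fetchPage(page, function (records) {
        callback(records[i % PAGE_SIZE]);
        fetchPage(page + 1, function () {}); // predictive prefetch
    });
}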

[Edit]

I found what I was looking for: virtual scrolling in jqGrid. That's 500k records being lazy loaded.

Raynos
  • Would you mind expanding on this? Are you referring to lazy loading using Ajax? – Matt Ball Jan 28 '11 at 22:10
  • @MattBall there's an important distinction between paging data and loading it piece by piece, and doing predictive lazy loading where you sneakily load in the background before a user needs the data and pretend that the user has no loading time, when it's actually just unobtrusive. – Raynos Jan 28 '11 at 22:34

I would try having it as one big string with a separator between each "item", then use split, something like:

var largeString = "item1,item2,.......";
var largeArray = largeString.split(",");

Hopefully the string won't exhaust the stack as fast.

Edit: in order to test it, I created a dummy array with 200,000 simple items (each item one number) and Chrome loaded it in an instant. 2,000,000 items? A couple of seconds, but no crash. A 6,000,000-item array (50MB file) made Chrome load for about 10 seconds, but still no crash either way.

So this leads me to believe the problem is not with the array itself but rather with its contents... optimize the contents to simple items, then parse them "on the fly", and it should work.
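
A sketch of what that might look like (the separators and field layout are invented for illustration):

// One big string: items separated by ';', fields by '|'.
var largeString = '1296176400000|terminal999|10.05a_V110119_Beta|26680;...';
var rawItems = largeString.split(';');

// Materialize an item only when it's actually requested.
function getItem(i) {
    var f = rawItems[i].split('|');
    return {
        dateTime: new Date(+f[0]), // numeric timestamp -> Date
        terminalId: f[1],
        'General___BuildVersion': f[2],
        'SSM___ExtId': +f[3]
    };
}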

Shadow The GPT Wizard
  • How do you think that would help? I would end up with the exact same end result, only with _more_ processing time to get there. – Matt Ball Jan 28 '11 at 22:09
  • My theory is that JS can handle big strings, and having the array created at "runtime" rather than predefined will take more CPU but less stack memory. Only a theory I'm afraid, but IMO something worth trying. – Shadow The GPT Wizard Jan 28 '11 at 22:19
  • Your theory doesn't make sense to me – Ruan Mendes Jan 28 '11 at 22:27
  • @Juan I hope to put this into test soon and either prove or disprove my theory. :) – Shadow The GPT Wizard Jan 28 '11 at 22:30
  • @MattBall it makes perfect sense that a browser can handle a string literal far better since it doesn't require anywhere near as much parsing as an object. – Raynos Jan 28 '11 at 22:36
  • @Shadow Wizard, see my answer. After blasting your approach, it may actually work, since string parsing may not use a recursive (stack-intensive) approach, and parsing a JSON array apparently uses recursion (FF's error message is about the stack size). I feel humbled now. – Ruan Mendes Jan 28 '11 at 23:09
  • @Shadow: though your ideas make sense, the OP needs objects inside the array, not strings. – Ruan Mendes Jan 28 '11 at 23:30
  • @Juan Mendes "lazy loading" on prepared data (indexed, heap, etc) –  Jan 29 '11 at 02:21
  • @Juan fair enough and good points.. for what it's worth I believe your answer *is* the best approach. BTW you can type just @Sha, no need for the full name. :) – Shadow The GPT Wizard Jan 29 '11 at 06:03