
I have checked every question and article about this that I could find, but nothing really answers it. I make it sound as if I have just one question; there are actually several.

Maybe I should first explain what I am trying to do. I am building a File Manager entirely in JavaScript. The File Manager receives files from the backend (PHP, Twig) and stores them in an array of Folders; each Folder has its own Files, which are stored in an array as well. These Files are also shown on the page, and the user can select, copy, cut, and paste them (among other operations). I am still writing these operations, because here lies the problem.
Files on the page have a `data-id` assigned to them, so that I can easily match them between the File Manager and the backend. Because I know the ID of each and every file, I could completely eliminate traversing arrays of Files in search of a particular File, if only I could create an array that takes the ID of a File as its index. And because this is JavaScript, I can do that! But there are some problems with it.
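To make this concrete, here is a minimal sketch of the idea (the `.file` selector, the `filesById` name, and the record shape are made up for illustration, not my actual File Manager code):

```js
// Build an ID-indexed lookup table from the DOM. The selector and the
// record shape are illustrative only.
var nodes = document.querySelectorAll('.file[data-id]');
var filesById = [];

for (var i = 0; i < nodes.length; i++) {
    var el = nodes[i];
    filesById[Number(el.getAttribute('data-id'))] = { element: el };
}

// Direct access by ID, no traversal of the Folders/Files arrays:
var file = filesById[42];
```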

Trying to use an Object for this task just does not work, because it is way slower than an Array. (Now, I understand that the differences in speed, even if measured in millions of ops/sec, are not that big, but why not dig for ultimate performance, right?)

So here is a list of questions that I cannot seem to find a reliable answer to:

  1. Why is Object so much faster on Chrome than on Firefox?
  2. Why is accessing an undefined index so much slower on Chrome?
  3. Why is Array so much faster on Firefox than on Chrome?
  4. I know why Chrome loses performance at higher indexes (at around 100k elements it transforms the array into a dictionary-like structure, I think; this was answered with a link to the source in another question on SO), but Firefox loses performance gradually, as if it traversed to the higher indexes sequentially even when they are accessed directly. Why is that?
  5. Why is an Array extremely slow when there is only one element in it, but that element is at a very high index?

I probably have some more questions, but I cannot think of them right now.


The fastest configuration I found that supports IDs as indexes, holes between them, and starting at a high index, uses one Array that holds the data you want to store and another Array that holds just the indexes. That way, when you are searching for an object with ID 10, you touch the index Array instead of the Array with the data (a sketch follows below).
You can see an example of this in http://jsperf.com/array-management-performance/2.
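A minimal sketch of that two-array layout (the `ids`/`data` names here are mine, not taken from the jsperf):

```js
var ids = [];   // holds only the numeric IDs, indexed by ID
var data = [];  // holds the actual records, at the same indexes

function put(id, record) {
    ids[id] = id;
    data[id] = record;
}

function get(id) {
    // Probe the lightweight index array first; touch the data
    // array only when the ID actually exists.
    return ids[id] !== undefined ? data[id] : undefined;
}

put(10, { name: 'report.pdf' });
get(10); // { name: 'report.pdf' }
get(99); // undefined, after reading only the index array
```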

EDIT: If you want to see the performance degradation, please "review" this jsperf and change `minId` and `maxId` to some big numbers.


Here are some stats:

Object

| Operation | Firefox (mln ops/sec) | Chrome (mln ops/sec) |
| --- | --- | --- |
| Read Defined | 165.173 | 351.699 |
| Read Undefined | 98.582 | 54.666 |
| Write Close | 7.599 | 291.244 |
| Write Far | 5.599 | 93.733 |
| Write Overwrite | 7.599 | 291.244 |

Array

| Operation | Firefox (mln ops/sec) | Chrome (mln ops/sec) |
| --- | --- | --- |
| Read Defined | 681.206 | 401.522 |
| Read Undefined | 681.206 | 62.827 |
| Write Close | 400.234 | 121.519 |
| Write Far | 348.560 | 121.519 |
| Write Overwrite | 400.234 | 234.337 |

P.S. Did you know that Read Defined is faster on Chrome on mobile than on Chrome on desktop?

I jsperf-ed every single question I have here, so either my tests (and others') were incorrectly written, or this is some really funky stuff.

http://jsperf.com/array-management-performance/2
http://jsperf.com/array-in-the-middle-of-nowhere
http://jsperf.com/object-holey-performance-better
http://jsperf.com/object-performance-better
http://jsperf.com/array-vs-object-mine-v2/5

P.S.2 I know that one of the Array tests is actually testing Object, and one of the Object tests is actually testing Array; please do not point that out, I am aware. I wrote them, and the error crept in because jsperf is very poorly made and I was trying to check a lot of different settings relatively quickly.

P.S.3 I am sorry that some of these tests are really messy; I did not think I would actually need to show them to anyone, but I still think they are sufficient.

---------------------------------------------------------------------------

EDIT: (An answer, in the hope that the question gets re-opened so that I can post it properly.)

Here is an answer to all of our (my) problems. Now that I read my questions again with the answer in hand, I can see how they could not be answered easily, or at all really, without digging into the source of each browser's engine.

I myself still do not know the exact answers to 1, 2, 3, 4, or even 5, but the answer that is correct is "because of implementation differences"; it is as simple as that. Obviously, I have not returned here just to write that: I have a solution to the problem I am showing in this question.

Simplified Problem: You have a set of IDs that are assigned to data, most likely coming from a database. You want that data in an array in JS so that you can work with it easily, but you DO NOT WANT to lose performance when putting 100k elements into a JS array. How does one put N elements into an array in JS without losing read/write performance? (I did not test writes thoroughly, because they are not that important to me.)

Solution and Explanation: While studying this problem, I checked a multitude of potential solutions, like splines, Fourier transforms, hash tables, binary search, and a twist on maps/hash tables that actually performed VERY well, but not well enough for my taste (it would also perform worse as its size increased).

The final solution seems dead simple and obvious, but somehow (I guess I'm dumb) it took me 10 days to figure out. It scales perfectly, and access to any element is extremely fast: O(1).

To be able to write this solution, I rely on a property of JS engines shown here: http://jsperf.com/array-of-arrays-with-high-indexes/2. It would seem that access to the first 1000 elements IS SUPER FAST (even in arrays of arrays), and because I know my data is keyed by unsigned integers, I can easily map the integers from 1D to 3D, so that the FastArray can hold up to a billion elements.
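Roughly, the mapping looks like this (a simplified sketch of the idea; the real implementation is in the repo linked below):

```js
// Map a 1D unsigned integer ID onto three nested indexes, each below
// 1000, so every lookup stays in the fast low-index range.
function FastArray() {
    this.root = [];
}

FastArray.prototype.set = function (id, value) {
    var a = Math.floor(id / 1000000);     // 0..999 for IDs below a billion
    var b = Math.floor(id / 1000) % 1000; // 0..999
    var c = id % 1000;                    // 0..999
    var l1 = this.root[a] || (this.root[a] = []);
    var l2 = l1[b] || (l1[b] = []);
    l2[c] = value;
};

FastArray.prototype.get = function (id) {
    var l1 = this.root[Math.floor(id / 1000000)];
    if (l1 === undefined) return undefined;
    var l2 = l1[Math.floor(id / 1000) % 1000];
    if (l2 === undefined) return undefined;
    return l2[id % 1000];
};

var fa = new FastArray();
fa.set(123456789, 'file.txt');
fa.get(123456789); // 'file.txt'
```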

You can see this solution in action right here: http://jsperf.com/map-check-v2/5. The first test is slow because I read the sample data from a plain array, and that is very slow... EDIT: That does not seem to be the case; I really do not know what is causing it. I think jsperf is misbehaving, because running just that test on its own yields pretty good results. Very weird.

A cleaner implementation is here: https://github.com/ImJustAskingDude/JavascriptFastArray, with a lot of references to every page I read while trying to figure this out.

P.S.4 Moderators, please re-open this question, so that people searching for ultimate Array performance can see this as a proper, actual Answer. Thanks.

P.S.5 I do not know why, but Chrome performs relatively poorly with my final code; I guess there is a bug in it, which is probably easily fixable.

user1494173
  • Answers to 1, 2, 3 and 4: they perform differently because the underlying (JVM?) code is different. You won't get a more complete answer to questions 1-4 than that. Q5 answer: because it is unusual to have a single element in an array at a high index, so that edge case is not optimized at the expense of more normal array usage. – Jaromanda X Oct 02 '15 at 01:35
  • Re #5: that depends very much on what you're doing with the array. Looping over it from 0 to length, yes, slow, obviously: the compiler can't skip members just because they don't exist, since you might be about to assign to them. However, native iterators like *forEach* and *reduce* skip missing members, so they are not as slow on sparse arrays as plain loops. – RobG Oct 02 '15 at 02:48
  • *why not try to dig for ultimate performance, right ?* Because it's a diversion, and it takes your attention away from writing clean, readable, writeable, maintainable, provably correct code. – torazaburo Oct 02 '15 at 04:40
  • @JaromandaX Of course I understand that the underlying code is different, but then again, arrays and objects are pretty much the same thing; as MDN points out, they are based on arrays, and if they are both based on arrays, then the underlying code should be pretty much the same, right? About accessing a high index: I cannot imagine why that would be slower than accessing a low index. I know it is, but why? Unless the array transforms into a list for a single high-index element, there is absolutely no reason for it to be slow, don't you think? – user1494173 Oct 02 '15 at 10:49
  • @RobG You would be right, if it wasn't for the fact that looping is mainly done by the programmer, at least I think it is. I write the loop, and I tell JavaScript which elements to access: if I loop from 0, yes, it is going to loop from 0, but if I start the loop at some index and go to that index + 10, I am pretty sure it is going to access just those. But who knows, maybe you are right. Could you conjure an example where JavaScript would just outright ignore the programmer and access those elements (not in a list)? (I am not being sarcastic, I actually want to know.) – user1494173 Oct 02 '15 at 11:07
  • @torazaburo And another yes, you are absolutely right, torazaburo, but 1. I just enclose the ugly details in a nice Object, and 2. I don't think that choosing the fastest method to store and access data would be considered ugly code. Maybe I was a bit unspecific about the ultimate performance: for that I am going to use emscripten, but for now, I just want the fastest configuration, or a trick, that could keep reads and writes at 100 mln ops/sec or more. – user1494173 Oct 02 '15 at 11:10
  • @user1494173 Objects based on arrays? Not necessarily at all: http://stackoverflow.com/questions/5288320/why-is-stdmap-implemented-as-a-red-black-tree. This kind of underlying implementation detail in the browser's js engine can have huge impacts like what you're seeing in your tests. – Jared Smith Oct 02 '15 at 14:34
  • I really don't understand the question or the responses to comments. The statement "*Why is an Array extremely slow*" makes no sense without also answering "slow at what?". I assumed you meant in regard to looping over the members, but perhaps not. So voting to close, as the actual question is not well defined. – RobG Oct 03 '15 at 03:57
  • @RobG I am really tired and want to go to sleep, but I am still going to answer your ridiculous comment right now. Ask yourself: what can a person do with an array? They can either read from it or write to it; those are THE ONLY actions you can perform on an array. Looping over it means either writing to it or reading from it, so saying that an array is extremely slow means that reads from and writes to it are slow. It is pretty simple, I thought. But just for you: I am asking about writes and reads. – user1494173 Oct 03 '15 at 04:04
  • What do "close" and "far" mean? – torazaburo Oct 03 '15 at 05:44
  • @torazaburo Yeah, I'm sorry, I should have explained: close means close to the data, far means far from the data. If you had `var a = [1, 2, 3]`, close would be `a[3] = 4;`, far would be `a[1003] = 1004;`. Also, for the array-management-performance example on jsperf, please edit it and change the minID and maxID at the top to some huge values, to be able to see the difference in performance yourself. – user1494173 Oct 04 '15 at 13:37
  • @user1494173 There are many operations that can be performed on an array other than reading and writing, using built-in methods or emulated equivalents. These can be emulated on other types of Object, some of which have their own equivalent built-in methods (e.g. Strings have slice, many DOM objects have length). So when you say an array is "slow", it's reasonable to ask "in comparison to what" or "at what kind of operation". The jsperf stuff has no comments, no explanation of what it's for or trying to prove, or even how you came to the conclusion that the results are "slow". – RobG Oct 05 '15 at 00:53
  • @user1494173 E.g. your paragraph starting "*The fastest configuration I found…*" seems to indicate you've discovered that an indexed search is faster than an incremental search, especially on sparse arrays. Congratulations, but that has nothing to do with the underlying performance of read/write operations and much more to do with doing as few operations (of any kind) as possible. – RobG Oct 05 '15 at 00:59
  • I didn't realize the question got closed. So no one is able to answer my question; that's fine, I'm working on it myself and will post an answer when I gather enough information. Now then, @RobG, your comment is my answer. You wrote "using built-in METHODS": these are not operations, they are methods, and methods rely on operations. Operations are basic; methods build upon them. Let us take array.prototype.concat as an example: it could read from the first array and write into another array A, then read from the second array and write into A. Operations != Methods. – user1494173 Oct 11 '15 at 21:48

1 Answer


First of all, JS engines differ between browsers, so they do not perform equally in every case.

Also, you have to keep in mind that arrays in JavaScript are Objects too; the only difference is that they use integers as their keys to simulate indexes.

When you create an array with just one element at a high index, the Object will be created with all the other "spaces" filled with undefined values, so the JS engine has to traverse the whole structure to find your value.

As @RobG says, the native array methods like reduce, map, and forEach will perform a lot better in this case.
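For example (a quick sketch with plain arrays, nothing specific to your File Manager):

```js
var sparse = [];
sparse[0] = 'a';
sparse[1000] = 'b';

// forEach skips the holes: the callback runs only twice.
var visited = 0;
sparse.forEach(function () { visited++; });
console.log(visited); // 2

// A plain for loop reads every index, holes included.
var reads = 0;
for (var i = 0; i < sparse.length; i++) { reads++; }
console.log(reads); // 1001
```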

Also, you can use functional libraries like Ramda or Lodash to help you traverse and flatten your query results.

David Gomez
  • *the Object will be created with all the other "spaces" filled with undefined values* I do not believe this is correct. – torazaburo Oct 02 '15 at 06:16
  • @torazaburo is absolutely right; there is even another question on SO that debunks this belief. You can check it yourself by creating an array and console.log-ing it: for one element, there is going to be just that one element and no more, but if you create two elements that are far apart, the "elements" in between are only going to be marked by console.log. I don't think they will actually be filled with anything; of course, accessing them yields undefined, but that's just like accessing any variable with no assigned value. – user1494173 Oct 02 '15 at 11:16
  • I looked at Ramda and Lodash, but I can't seem to understand how they would help me with something as relatively simple as this. I just want to have an array that performs reads and writes amazingly fast; I almost never have to traverse this array, because having the index means I can just access elements directly. Maybe I should put this in my question? – user1494173 Oct 02 '15 at 11:25
  • You don't mean JVM. JVM stands for Java Virtual Machine, and this post is about JavaScript, not Java. You mean a JavaScript engine, not a JVM; browsers do not use a JVM to execute JavaScript. – bhspencer Oct 02 '15 at 14:39
  • @user1494173 They will make your life easier in terms of data manipulation. BTW, why don't you provide a code example of your data, so that maybe we can play with it and see how we can help you improve the performance? – David Gomez Oct 03 '15 at 01:49