I wanted to share a simple experiment I ran using Node.js v6.11.0 on Windows 10.
Goal. Compare arrays vs. objects in terms of memory occupied.
Code. Each of the functions reference, twoArrays, matrix and objects stores the same amount of data: two sets of random numbers of the same size. They just organize the data a bit differently.
reference: creates two arrays of fixed size and fills them with numbers.
twoArrays: fills two arrays via push (so the interpreter doesn't know the final size).
objects: creates one array via push; each element is an object containing two numbers.
matrix: creates a two-column matrix, also using push; each element of the outer array is a two-element array.
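    const SIZE = 5000000;

    // Module-level references, so the data survives garbage collection.
    let s = [];
    let q = [];

    // Random integer in the range 0..9.
    function rand () {
        return Math.floor(Math.random() * 10);
    }

    // Two arrays whose final size is known up front.
    function reference (size = SIZE) {
        s = new Array(size).fill(0).map(a => rand());
        q = new Array(size).fill(0).map(a => rand());
    }

    // Two arrays grown one element at a time.
    function twoArrays (size = SIZE) {
        s = [];
        q = [];
        let i = 0;
        while (i++ < size) {
            s.push(rand());
            q.push(rand());
        }
    }

    // One array of two-element arrays.
    function matrix (size = SIZE) {
        s = [];
        let i = 0;
        while (i++ < size) s.push([rand(), rand()]);
    }

    // One array of objects with two numeric properties.
    function objects (size = SIZE) {
        s = [];
        let i = 0;
        while (i++ < size) s.push({s: rand(), q: rand()});
    }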
Result. After running each function separately in a fresh environment and calling global.gc() a few times (global.gc is only available when Node is started with --expose-gc), the Node.js process was occupying the following amounts of memory:
reference: 84 MB
twoArrays: 101 MB
objects: 249 MB
matrix: 365 MB
theoretical: assuming that each number takes 8 bytes, the size should be 5*10^6 * 2 * 8 bytes = 80 MB
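For anyone who wants to reproduce numbers like these, here is a minimal sketch of one way to take the measurement from inside the script. It is an assumption that heapUsed from process.memoryUsage() is the right figure to read; the numbers above may well have been taken differently (for example from the OS process list). The helper names heapUsedMB and measure are made up for this example.

```javascript
// Hypothetical helper: current heap usage in MB, after forcing a GC if possible.
function heapUsedMB() {
  // global.gc only exists when Node is started with --expose-gc.
  if (typeof global.gc === 'function') {
    global.gc();
    global.gc();
  }
  return process.memoryUsage().heapUsed / (1024 * 1024);
}

// Hypothetical helper: run a builder and report how much the heap grew.
function measure(build) {
  const before = heapUsedMB();
  const data = build(); // keep a reference so the data survives the GC
  const after = heapUsedMB();
  return { data: data, deltaMB: after - before };
}

// Small demo size so it runs quickly; use 5000000 to mirror the experiment.
const result = measure(() => {
  const a = [];
  for (let i = 0; i < 100000; i++) a.push(Math.floor(Math.random() * 10));
  return a;
});
console.log('approx. heap growth: ' + result.deltaMB.toFixed(2) + ' MB');
```

Run it as node --expose-gc script.js; without the flag the sketch still works, but the reading includes garbage that has not been collected yet.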
We see that reference resulted in the lightest memory structure, which is kind of obvious.
twoArrays takes a bit more memory. I think this is because the arrays there are dynamic and the interpreter allocates memory in chunks, whenever the next push operation exceeds the preallocated space. Hence the final allocation has room for more than 5*10^6 numbers.
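To make the chunked-allocation idea concrete, here is a rough model of it. The growth rule used (new capacity = old capacity + half of it + 16) is an assumption for illustration; it resembles what V8 has used for growing arrays, but the exact policy depends on the engine and version. The function names are made up for this sketch.

```javascript
// Assumed growth policy (illustrative only): grow by 50% plus a constant.
function grownCapacity(oldCapacity) {
  return oldCapacity + (oldCapacity >> 1) + 16;
}

// Capacity the backing store would end up with after `count` pushes,
// if it is regrown each time it runs out of room.
function capacityAfterPushes(count) {
  let capacity = 0;
  while (capacity < count) capacity = grownCapacity(capacity);
  return capacity;
}

const count = 5000000;
const capacity = capacityAfterPushes(count);
// Unused slots at 8 bytes each, in MB, for one array under this model.
const unusedMB = (capacity - count) * 8 / 1e6;
console.log('capacity: ' + capacity + ', unused: ' + unusedMB.toFixed(1) + ' MB');
```

Under any rule of this kind the final capacity can overshoot the element count by up to roughly half, which is at least consistent with twoArrays being somewhat heavier than reference.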
objects is interesting. Although each object has a fixed shape, it seems that the interpreter doesn't treat it that way, and allocates much more space for each object than necessary.
matrix is also quite interesting: apparently, when an array is defined explicitly in the code, the interpreter allocates more memory than required.
Conclusion. If your aim is a high-performance application, try to use arrays. They are also fast, with just O(1) time for random access. If the nature of your project requires objects, you can quite often simulate them with arrays as well (provided the number of properties in each object is fixed).
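As a minimal sketch of what simulating objects with arrays could look like: the records {s, q} from the experiment are stored in one flat array, two slots per record. The names FIELDS, S, Q, addRecord, getField and recordCount are made up for this example, and this is just one possible layout (two parallel arrays, as in twoArrays, is another).

```javascript
// Each record occupies FIELDS consecutive slots in one flat array.
const FIELDS = 2;
const S = 0; // offset of the "s" property within a record
const Q = 1; // offset of the "q" property within a record

// Append a record and return its index.
function addRecord(store, s, q) {
  store.push(s, q);
  return store.length / FIELDS - 1;
}

// Read one property of the record at `index`.
function getField(store, index, field) {
  return store[index * FIELDS + field];
}

// Number of records currently stored.
function recordCount(store) {
  return store.length / FIELDS;
}

const store = [];
addRecord(store, 3, 7);
const second = addRecord(store, 1, 9);
console.log(getField(store, second, Q), recordCount(store));
```

The trade-off is readability: store[index * FIELDS + Q] is less obvious than item.q, so small accessor functions like these help keep the rest of the code clean.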
Hope this is useful. I would like to hear what people think, or maybe there are links to some more thorough experiments...