3

I was reading about data locality and want to use it to improve my game engine that I'm writing.

Let's say that I have created five objects at different times that are now all in different places in the memory not next to each other. If I add them all to an array, will that array only hold pointers to those objects and they will stay in the same place in the memory or will adding them all to an array rearrange them and make them contiguous.

I ask this because I thought that using arrays would be a good way to make them contiguous, but I don't know if an array will fix my problem!

Joza100
  • 338
  • 4
  • 16
  • An array `T[]` only holds object references. – Ousmane D. Jul 30 '18 at 16:12
  • 1
    This has a 99.99% probability to be a premature optimization. Strive to make your code readable and solid, know the time complexity of your data structures, and beware of IO. Only optimize if you have a performance problem, and you have proven that locality was the problem. For 5 objects, it won't be. – JB Nizet Jul 30 '18 at 16:17
  • @JBNizet I didn't say that i have only 5 objects there, that was just an example to demonstrate what I mean, but thanks for the answer. – Joza100 Jul 30 '18 at 16:19
  • You are creating a game engine but don't have any idea how JVM manages memory? – Bhesh Gurung Jul 30 '18 at 16:20
  • @BheshGurung just because I don't know exactly how things are stored, doesn't mean I don't know java very well -_- A helpful answer, thanks. – Joza100 Jul 30 '18 at 16:22

3 Answers3

9

tl;dr

Manipulating an array of references to objects has no effect on the objects, and has no effect on the objects’ location in memory.

Objects

An array of objects is really an array of references (pointers) to objects. A pointer is an address to another location in memory.

We speak of the array as holding objects, but that is not technically accurate. Because Java does not expose pointers themselves to us as programmers, we are generally unaware of their presence. When we access an element in the array, we are actually retrieving a pointer, but Java immediately follows that pointer to locate the object elsewhere in memory.

This automatic look-up, following the pointer to the object, makes the array of pointers feel like an array of objects. The Java programmer thinks of her array as holding her objects when in reality the objects are a hop-skip-and-a-jump away.

Arrays in Java are implemented as contiguous blocks of memory. For an array of objects, the pointers to those objects are being stored in contiguous memory. But when we access the elements, we are jumping to another location in memory to access the actual object that we want.

enter image description here

Adding elements may be “cheap” in that if memory happens to be available next door in memory, it can be allocated to the array to make room for more elements. In practice this is unlikely. Chances are a new array must be built elsewhere in memory, with all the pointers being copied over to the new array and then discarding the original array.

Such a new-array-and-copy-over is “expensive”. When feasible, we want to avoid this operation. If you know the likely maximum size of your array, specify that size when declaring the array. The entire block of contiguous memory is claimed immediately, with empty content in the array until you later assign a pointer to the elements.

Inserting into the middle of an array is also expensive. Either a new array is built and elements copied over, or all the elements after the insertion point must be moved down into their neighboring position.

None of these operations to the array affect the objects. The objects are floating around in the ether of memory. The objects know nothing of the array. Operations on the array do not affect the objects nor their position in memory. The only relationship is that if the reference held in the array is the last reference still pointing to the object, then when that array element is cleared or deleted, the object becomes a candidate for garbage-collection.

Primitives

In Java, the eight primitive types (byte, short, int, long, float, double, boolean, and char) are not objects/classes and are not Object-Oriented Programming. One advantage is that they are fast and take little memory, compared to objects.

An array of primitives hold the values within the array itself. So these values are stored next to one another, contiguous in memory. No references/pointers. No jumping around in memory.

As for adding or inserting, the same behavior discussed above applies. Except that instead of pointers being shuffled around, the actual primitive values are being shuffled around.

enter image description here

Tips

In business apps, it is generally best to use objects.

That means using the wrapper classes instead of primitives. For example, Integer instead of int. The auto-boxing facility in Java makes this easier by automatically converting between primitive values and their object wrapper.

And preferring objects means using a Collection instead of arrays, usually a List, specifically a ArrayList. Or for immutable use, a List implementation returned from the new List.of method.

In contrast to business apps, in extreme situations where speed and memory usage are paramount, such as your game engine, then make the most of arrays and primitives.

In the future, the distinction between objects and primitives may blur if the work done in Project Valhalla comes to fruition.

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
  • Beautiful explanation. – the_prole Mar 22 '22 at 05:18
  • How does Java deal with the cache-miss problem when storing the references, not the objects? The cache can only be used for the pointer in that case. After grabbing the pointer from the array, it "has to go" to the main memory, not the CPU cache. I wonder whether Java solves or not. – Ali Berat Çetin Jun 14 '22 at 11:28
  • @AliBeratÇetin If your question is about the consistency of state accessed through pointers, that is only an issue across threads. Among the solutions is using the `AtomicReference` class, or using the keyword `volatile`. Mandatory reading on the matter: [*Java Concurrency in Practice*](https://jcip.net/) by Brian Goetz, et al. – Basil Bourque Jun 14 '22 at 14:19
  • @BasilBourque mmm no. I mean if the java arrays consist of pointers, the real objects that we want to access are in an arbitrary order in the memory. So this leads huge cache miss rate. – Ali Berat Çetin Jun 14 '22 at 15:32
  • @AliBeratÇetin Object-oriented programming is not intended for maximum performance. If you want maximum performance with a small set of data, say a set of integers or floating-point numbers, then use an array of primitive values rather than objects. An `int[]` or `float[]` may fit into a CPU cache. But if you really care about managing cache-misses, perhaps you should be using a [systems programming language](https://en.m.wikipedia.org/wiki/System_programming_language) such as C, Rust, or Swift rather than Java. – Basil Bourque Jun 14 '22 at 15:54
0

Java deals with references to objects only. As such, there's no guarantee that the elements of an array will be contiguous in memory.

Edit: Guess this answer wasn't that clear. My bad. I meant that there's no guarantee that the objects themselves will be contiguous, in spite of the fact that the references will be, as 1-D arrays are stored contiguously. Still, Basil Bourque's answer perfectly explains how this works.

Ricardo Costeira
  • 3,171
  • 2
  • 23
  • 23
0

The data or the values are stored in the objects and the values are retrieved using the references of the objects. lemme clear one more thing arrays in Java are stored in the form of objects. so there is no doubt that objects stores values and accessed using reference variable of that particular object. Hope you got it.

  • If I got it, that means that all elements of the array actually will be contiguous in memory? I got another answer telling me that they won't be contiguous... I'm confused. – Joza100 Jul 30 '18 at 17:52
  • @Joza100 If it is array of reference type, IOW Object, array stores pointers. Pointers will be continuous. Actual Objects will be where ever. – hyde Jul 31 '18 at 17:33