2

I am trying to implement a data structure with Key, Value pairs and was looking at array implementations.

One way of achieving this is to declare seperate 1-D arrays for Key and Values.

  private int[] keys = new int[N];
private int[] values = new int[N];

But can the same be achieved by declaring a 2-D array as follows and not compromise on data locality?

private int[][] keysAndValues = new int[2][N];

It seems important here that Java implements multidimensional arrays in row-major order? Are there any performance advantages in declaring the array this way, or does this make the code less readable?

deepak
  • 375
  • 1
  • 3
  • 8
  • 1
    `int keysAndValues[2][N]` is C syntax, not Java. Java array types do not include array bounds. Also, Java allocates no storage for arrays on declaration; you need an explicit initializer. – Mike Samuel Dec 14 '14 at 01:28
  • Java does not actually have multi-dimension arrays. Rather, you have arrays of references to other arrays. Each array is uni-dimensional. – Hot Licks Dec 14 '14 at 04:32

3 Answers3

5

A 2D array in Java is actually an array of object references, each of which points to a 1D array. The 2D array, and each of the 1D arrays are all separate heap objects, and could (in theory) be anywhere in the heap.

(For a discussion of why, see: Why doesn't Java have true multidimensional arrays?)


But can the same be achieved by declaring a 2-D array as follows and not compromise on data locality?

Yes it can.

There is minimal difference in data locality between the two versions, especially if we can assume that N is large compared with 2. (And if we can't, then data locality is most likely irrelevant; i.e. the performance difference will be too small to be significant.)

It seems important here that Java implements multidimensional arrays in row-major order?

Is that a question? If yes, then I guess so. It is certainly relevant ... though if Java implemented them column-major, then you would simply flip the rows and columns and get an equivalent solution

Are there any performance advantages in declaring the array this way, or does this make the code less readable?

The performance issues are likely to be insignificant. But if it really, really matters then the best advice is to profile and optimize the code for yourself ... on REAL input datasets.

As for readability, that is up to you to judge. I can't predict what your code is going to look like.


If you really want to control the memory locality, then the best way is to use a single 1D array, and map the indexes in a way that gives you the best locality overall. (That will depend on your application and how it references the data in the array.)

Community
  • 1
  • 1
Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
1

to summarize from a quick google

"A single-dimension array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then array data (1 byte for boolean, 4 bytes for reference or however much the primitive type uses)

In Java a multidimensional array is actually a set of nested arrays. This means that every row of a two-dimensional array has the overhead of an object, since it actually is a separate object!" (edited/paraphrased)

So in essence, int[2][10] has overhead for int[2] (12 bytes) and then each int[10] inside. (2 * 12 bytes). this gives you 12 bytes more use than if you'd use: int[10] a; int[10] b;

This is likely never going to be an issue unless you plan to use a lot of arrays. I would personally go for readability as this is more likely going to be an issue. Optimize after the fact as optimizing beforehand is not likely going to yield the results you expect.

Joeblade
  • 1,735
  • 14
  • 22
0

You can achieve a better OOP solution using one of the many mapping types in java.util.

However, to create an array of unknown length in java, you need to use.

private int[][] keysAndValues;

then in your constructor

this.keysAndValues = new int[2][N];
twinlakes
  • 9,438
  • 6
  • 31
  • 42
  • 1
    Thanks @twinlakes . How is this represented in the memory ? Is there an advantage to using 2-d vs 1d arrays ? – deepak Dec 14 '14 at 01:33
  • 2
    If you looked at the bytes stored in memory of a 2D array and its equivalent 1D array, they would look the same. The only difference is ease of indexing. But like I said, if you're looking for the most clear/concise approach, you should look at the mapping types in `java.util` – twinlakes Dec 14 '14 at 01:37