Efficient implementation of multi-dimensional arrays in Java?

Question

As far as I understand (from answers such as this), java has no native multi-dimensional continuous memory arrays (unlike C#, for example).

While the jagged array syntax (arrays of arrays) might be good for most applications, I would still like to know what's the best practice if you do want the raw efficiency of a continuous-memory array (avoiding unneeded memory reads)

I could of course use a single-dimensional array that maps to a 2D one, but I prefer something more structured.

I feel like you are micro optimizing. Mapping a 1d array to a 2d array only removes 1 array lookup. I can't imagine it will save you much/any time. — jjnguy, Feb 10 '11 at 15:37
If you need every bit of performance, you may be better suited using C or C++. — helpermethod, Feb 10 '11 at 15:44
In what circumstances do you think that having each row of a two dimensional array being contiguous with the next in memory be *significantly* more efficient than java's normal arrray of arrays? — Highland Mark, Feb 10 '11 at 15:45
I'm not saying it's a _critical_ optimization, but it would be nice to know what the best practice is. — ripper234, Feb 10 '11 at 15:53

irreputable · Answer 1 · 2011-02-10T15:46:09.677

6

it's not difficult to do it manually:

int[] matrix = new int[ROWS * COLS];

int x_i_j = matrix[ i*COLS + j ];

now, is it really faster than java's multi dimension array?

int x_i_j = matrix[i][j];

for random access, maybe. for continuous access, probably not - matrix[i] is almost certainly in L1 cache, if not in register cache. in best scenario, matrix[i][j] requires one addition and one memory read; while matrix[i*COLS + j] may cost 2 additions, one multiply, one memory read. but who's counting?

edited Feb 10 '11 at 15:46

answered Feb 10 '11 at 15:40

irreputable

44,725
9
65
93

1

With a single array, however, you would only do bounds-checking once instead of twice. – Eric Giguere Feb 10 '11 at 16:01
if optimization caches matrix[i], it's one less bound check. – irreputable Feb 10 '11 at 17:09

score 4 · Answer 2 · answered Jun 04 '17 at 12:40

Let's say you have a 2D array int[][] a = new int[height][width], so by convention you have the indices a[y][x]. Depending on how you represent the data and how you access them, the performance varies in a factor of 20 :

The code:

public class ObjectArrayPerformance {
    public int width;
    public int height;
    public int m[];

    public ObjectArrayPerformance(int w, int h) {
            this.width = w;
            this.height = h;
            this.m = new int[w * h];
    }

    public int get(int x, int y) {
            return this.m[y * width + x];
    }

    public void set(int x, int y, int value) {
            this.m[y * width + x] = value;
    }

    public static void main (String[] args) {
            int w = 1000, h = 2000, passes = 400;

            int matrix[][] = new int[h][];

            for (int i = 0; i < h; ++i) {
                    matrix[i] = new int[w];
            }

            long start;
            long duration;

            System.out.println("duration[ms]\tmethod");

            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int y = 0; y < h; y++) {
                        for (int x = 0; x < w; x++) {
                                    matrix[y][x] = matrix[y][x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\t2D array, loop on x then y");

            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int x = 0; x < w; x++) {
                            for (int y = 0; y < h; y++) {
                                    matrix[y][x] = matrix[y][x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\t2D array, loop on y then x");

            //

            ObjectArrayPerformance mt = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int x = 0; x < w; x++) {
                            for (int y = 0; y < h; y++) {
                                    mt.set(x, y, mt.get(x, y) + 1);
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access trough getter/setter");

            //

            ObjectArrayPerformance mt2 = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int x = 0; x < w; x++) {
                            for (int y = 0; y < h; y++) {
                                    mt2.m[y * w + x] = mt2.m[y * w + x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access through computed indexes, loop y then x");

            ObjectArrayPerformance mt3 = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int y = 0; y < h; y++) {
                        for (int x = 0; x < w; x++) {
                                    mt3.m[y * w + x] = mt3.m[y * w + x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access through computed indexes, loop x then y");

            ObjectArrayPerformance mt4 = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int y = 0; y < h; y++) {
                        int yIndex = y * w;
                        for (int x = 0; x < w; x++) {
                                    mt4.m[yIndex + x] = mt4.m[yIndex + x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access through computed indexes, loop x then y, yIndex optimized");
    }
}

We can conclude that linear access performance depends more on the way you process the array (lines then columns or the reverse?: performance gain = x10, much due to CPU caches) than the structure of the array itself (1D vs 2D : performance gain = x2).

If random access, the performance differences should be much lower, because the CPU caches have less effect.

score 4 · Answer 3 · answered Feb 10 '11 at 16:40

4

It depends on your access pattern. Using this simple program, comparing an int[][] with a 2D mapped over a 1D int[] array treated as a matrix, a native Java 2D matrix is:

25% faster when the row is on the cache, ie: accessing by rows:
100% slower when the row is not in the cache, ie: accessing by colums:

ie:

// Case #1
for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
        // Access item[y][x]

// Case #2
for (x = 0; x < w; x++)
    for (y = 0; y < h; y++)
        // Access item[y][x]

The 1D matrix is calculated as:

public int get(int x, int y) {
    return this.m[y * width + x];
}

answered Feb 10 '11 at 16:40

vz0

32,345
7
44
77

nice. so vm optimization isn't as smart as I thought. – irreputable Feb 10 '11 at 17:12
On the contrary, JIT optimizes method calls more than just inlining them. Calling the get(x, y) method is faster than accesing the matrix from the outside directly. – vz0 Feb 10 '11 at 18:10

Edwin Buck · Answer 4 · 2011-02-10T15:52:03.283

If you really want more structure with a continuous-memory array, wrap it in an object.

public class My2dArray<T> {

  int sizeX;

  private T[] items;

  public My2dArray(int x, int y) {
    sizeX = x;
    items = new T[x*y];
  }

  public T elementAt(int x, int y) {
    return items[x+y*sizeX];
  }

}

Not a perfect solution, and you probably already know it. So consider this confirmation of what you suspected to be true.

Java only provides certain constructs for organizing code, so eventually you'll have to reach for a class or interface. Since this also requires specific operations, you need a class.

The performance impacts include creating a JVM stack frame for each array access, and it would be ideal to avoid such a thing; however, a JVM stack frame is how the JVM implements it's scoping. Code organization requires appropriate scoping, so there's not really a way around that performance hit that I can imagine (without violating the spirit of "everything is an object").

You could also generalize and drop the "2D" and put the dimensions as a var-arg argument to the constructor (and let the `elementAt` take a var-arg for the indices) — aioobe, Feb 10 '11 at 15:48
True, but the example demonstrates the key idea. Odds are good he has a specific dimensional array in mind. — Edwin Buck, Feb 10 '11 at 15:53

AMDG · Answer 5 · 2017-08-07T13:56:28.250

The most efficient method of implementing multi-dimensional arrays is by utilizing one-dimensional arrays as multi-dimensional arrays. See this answer about mapping a 2D array into a 1D array.

// 2D data structure as 1D array
int[] array = new int[width * height];
// access the array 
array[x + y * width] = /*value*/;

I could of course use a single-dimensional array that maps to a 2D one, but I prefer something more structured.

If you want to access array in a more structured manner, create a class for it:

public class ArrayInt {

    private final int[] array;
    private final int width, height;

    public ArrayInt(int width, int height) {
        array = new int[width * height];
        this.width = width;
        this.height = height;
    }

    public int getWidth() {
        return width;
    }

    public int getHeight() {
        return height;
    }

    public int get(int x, int y) {
        return array[x + y * width];
    }

    public void set(int x, int y, int value) {
        array[x + y * width] = value;
    }

}

If you wanted arrays of objects, you could use generics and define class Array<T>, where T is the object stored in the array.

Performance-wise, this will, in most cases, be faster than a multi-dimensional array in Java. The reasons can be found in the answers to this question.

score 1 · Answer 6 · answered Feb 10 '11 at 15:53

Sample implementation, without a compiler. This is basically what C/C++ do behind the scenes when you access multidimensional arrays. You'll have to further define accessor behaviour when less than the actual dimensions are specified & so on. Overhead will be minimal and could be optimized further, but thats microoptimizing imho. Also, you never actually know what goes on under the hood after JIT kicks in.

class MultiDimentionalArray<T> {
//disclaimer: written within SO editor, might contain errors
    private T[] data;
    private int[] dimensions; //holds each dimensions' size

    public MultiDimensionalArray(int... dims) {
        dimensions = Arrays.copyOf(dims, dims.length);
        int size = 1;
        for(int dim : dims)
            size *= dim;
        data = new T[size];
    }

   public T access(int... dims) {
       int idx = 1;
       for(int i = 0; i < dims.length)
            idx += dims[i] * dimensions[i]; //size * offset
       return data[idx];
    }
}

score -1 · Answer 7 · answered Feb 10 '11 at 15:43

-1

If you cannot live without C constructs, there's always JNI.

Or you could develop your own Java-derived language (and VM and optimizing JIT compiler) that has a syntax for multidimensional continuous-memory arrays.

answered Feb 10 '11 at 15:43

Michael Borgwardt

342,105
78
482
720

Efficient implementation of multi-dimensional arrays in Java?

7 Answers7

Linked