0

I want to iterate over the neighbors of cells in an 2D array. so i use:

for (int xOffset = -1; xOffset <= 1; xOffset++) {
    for (int yOffset = -1; yOffset <= 1; yOffset++) {
        if (xOffset == 0 && yOffset == 0) {
            continue;
        }
        //My Code
    }
}

My question is if that is the most efficient way or is it better to just use if calls? And if there is a better way than use this if inside the loops to get rid of the zero zero situation?

Oleg
  • 6,124
  • 2
  • 23
  • 40
TheSorm
  • 137
  • 12
  • What do you think is the most readable and maintainable solution? Is there an actual performance problem to solve here? – tnw Sep 14 '17 at 15:18
  • 1
    Yes its kind of a performance probleme and i search for all what i can optimice. this is running in an update loop for millions of objects. – TheSorm Sep 14 '17 at 15:24
  • So, again, is there an actual problem with performance using your current solution? – tnw Sep 14 '17 at 15:51
  • 1
    Its about finding a few nano or milliseconds in things that are used million times every tick – TheSorm Sep 14 '17 at 16:11
  • I agree which is why I ask if there is an actual performance problem with your current solution. Surely you're asking because you've timed it out and it's taking too long? – tnw Sep 14 '17 at 16:13
  • no i just go over everything that i think could be done faste because its hard to find out or to say if something is taking too long – TheSorm Sep 14 '17 at 16:16
  • Usually when people like @tnw ask questions like this, it's because they don't understand that in the field of real-time financial market data, even if your code is blindingly fast, if your competitor's code is a nanosecond faster then your code is too slow. – DodgyCodeException Sep 14 '17 at 16:16
  • 1
    I understand that but OP is asking about a problem which may or may not even exist. Maybe this is already the fastest solution. They seem to already have a few different solutions in mind so my challenge to OP is to just race your horses. – tnw Sep 14 '17 at 16:33

3 Answers3

2

Haven't tested it but the following is based on @Zefick's answer and should be more performant:

int dirs[] = { -1, -1, -1, 0, -1, 1, 0, -1, 0, 1, 1, -1, 1, 0, 1, 1 };

for (int i = 0; i < dirs.length; i += 2) {
    int xOffset = dirs[i];
    int yOffset = dirs[i + 1];
    // my code
}       
Oleg
  • 6,124
  • 2
  • 23
  • 40
2

I ran a JMH Benchmark with various solutions to the problem. I always consume the 2 offset indizes in a black hole. Here are the results:

Benchmark                                  Mode  Cnt         Score         Error  Units
Offset2DBenchmark.doubleLoopWithIf        thrpt    3  35425373,827 ± 4242716,439  ops/s
Offset2DBenchmark.loopOverFlatIndex       thrpt    3  35929636,197 ±  935681,592  ops/s
Offset2DBenchmark.loopOverIndex           thrpt    3  31438051,279 ± 3286314,668  ops/s
Offset2DBenchmark.unrolledLoop            thrpt    3  40495297,238 ± 6423961,118  ops/s
Offset2DBenchmark.unrolledLoopWithLambda  thrpt    3  27526581,417 ± 1712524,060  ops/s


doubleLoopWithIf       = Nested Loops with If to filter 0,0 (TheSorm)
loopOverFlatIndex      = Single loop with flattend indizes (Oleg)
loopOverIndex          = Single Loop with 2d indizes (Zefick)
unrolledLoop           = Completely Unrolled Loop
unrolledLoopWithLambda = Unrolled Loop consuming a Bifunction<Integer, Integer>

So, the unrolled loop is the fastest. Slightly slower are the double loop with the if statement and the flattened array proposed by Oleg. The 2D array by Zefick is even slower than your solution.

Just as a demo, here is what a tests looks like:

@Fork(value = 1)
@Warmup(iterations = 3)
@Measurement(iterations = 3)
@Benchmark
public void unrolledLoopWithLambda(Blackhole bh) {
    outerOffsets((x, y) -> {
        bh.consume(x);
        bh.consume(y);
    });
}

private void outerOffsets(BiConsumer<Integer, Integer> consumer) {
    consumer.accept(-1, -1);
    consumer.accept(-1, 0);
    consumer.accept(-1, 1);
    consumer.accept(0, -1);
    consumer.accept(0, 1);
    consumer.accept(1, -1);
    consumer.accept(1, 0);
    consumer.accept(1, 1);
}

So unless you want to unroll your loop manually, there is not much you can do to improve performance.

Unfortunately you didn't tell use what your code inside the loop looks like. If it is even a bit time consuming, the question how you loop might be neglectible...

But your title states that you want this as offsets for a 2d array. I suspect you might use this in a bigger loop over x and y. You might be able to gain more performance by using a 1d array and calculate the indices yourself.

Instead of:

array[x][y]

use

array[xy] with xy = x + y * xsize 

You should be able to avoid doing the multiplication at all.

(This is the first time I ran a benchmark with JMH, so please tell me if I used it completely wrong...)

J. Fuchs
  • 21
  • 2
  • wow thank you for all that work :) With unrolled loop you mean just one for loop didnt you? – TheSorm Sep 14 '17 at 18:24
  • 1
    With unrolled loop i meant, that you don't use loops at all. You copy&paste your code till you have 8 copies. Then you hardcode the values of xOffset and yOffset. Should be ok, since you are after performance. In terms of maintainability its as ugly as it can get. – J. Fuchs Sep 14 '17 at 18:27
  • Well done, probably possible to decrease the error deviation by increasing the number of warmups and itterations but even as it is the results make sense and show that not much can be optimized here. – Oleg Sep 14 '17 at 19:23
1

I usually use the following approach:

int dirs[][] = {{-1, -1}, {-1, 0}, {-1, 1}, {0, -1}, {0, 1}, {1, -1}, {1, 0}, {1, 1}};

for (int dir[] : dirs) {
    int xOffset = dir[0];
    int yOffset = dir[1];
    // my code
}
Zefick
  • 2,014
  • 15
  • 19
  • ok would take the xOffset, yOffset declaration outside the loop but seams a good thing thank you :) – TheSorm Sep 14 '17 at 15:25
  • 2
    @TheSorm That will be optimized by the JIT compiler(or just the normal compiler), no need to do it. – Oleg Sep 14 '17 at 15:26
  • @Oleg oh ok cool i am never shure what the compiler do so i try to do things like that by my own to get shure :D – TheSorm Sep 14 '17 at 15:28
  • 1
    @TheSorm Many questions on StackOverflow about it according to this answer: https://stackoverflow.com/a/407323/1398418 declaring inside the loop will even be **faster**! In general I recommend focusing on writing readable code, do micro optimizations only after you identified a problem. – Oleg Sep 14 '17 at 15:33
  • @Zefick I found out your implementation is only halve as fast as mine even when i put the dirs as a final global variable. But thank you for your help aniways :) – TheSorm Sep 14 '17 at 15:45