0

I'm starting with Java, and I just came up with this huge difference: why is making references to an array so much worse than assigning the value to a variable and then calling the variable? (If that's not the problem of my code, then what's the problem of my code?)

I tried to implement selection sort in Java and here I show you the times.

In my mind an array was like a list of symbolic variables so I don't see how is that much worse to call array items directly.

Here's the code (times below)

public static int[] selectionSort(int ... numbers) {
  int pos, value;
  int[] result = numbers.clone();
  for (int i = 0; i < result.length - 1; i++) {
    value = result[i];
    pos = i;
    for (int j = i + 1; j < result.length; j++) {
      if (result[j] < value) {
        value = result[j];
        pos = j;
      }
    }
    result[pos] = result[i];
    result[i] = value;
  }
  return result;
}

public static int[] selectionSortWorse(int ... numbers) {
  int pos, value;
  int[] result = numbers.clone();
  for (int i = 0; i < result.length - 1; i++) {
    pos = i;
    for (int j = i + 1; j < result.length; j++) {
      if (result[j] < result[pos]) {
        pos = j;
      }
    }
    value = result[pos];
    result[pos] = result[i];
    result[i] = value;
  }
  return result;
}

Time comparison (average of 10) of selectionSort vs. selectionSortWorse

  • length of array: 10000; times 33.14 ms vs. 52.33 ms
  • length of array: 20000; times 127.13 ms vs. 207.46 ms
  • length of array: 50000; times 793.56 ms vs. 1302.48 ms
  • length of array: 100000; times 3096.43 ms vs. 5072.56 ms
Manuel
  • 301
  • 2
  • 11
  • 4
    Usually relevant: [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) – Andy Turner Dec 19 '18 at 14:37
  • Accessing an array element is a bit slower than accessing a simple variable and you do that a lot in this program. So you get a big difference. – Henry Dec 19 '18 at 14:41
  • 1
    The code of `result[pos]` loads two variables and does a more complicated address calculation, than the code of `value`. The condition means that the if-statement is not entered half of the time. With an other language like C one might expect optimized code generated. Typically a java compiler does not do this. – Joop Eggen Dec 19 '18 at 14:44
  • 1
    @JoopEggen The JIT might very much do such optimisations. But that depends on data size, and specific configuration parameters. – GhostCat Dec 19 '18 at 14:45
  • 1
    Accessing a variable: you have an address -> you access it. Accessing an array: you have an address of an array's beginning, need to calculate the data size, shift the address to the dataSize * elementNumber -> you access it. – J-Alex Dec 19 '18 at 14:47
  • 1
    @GhostCat yes, a good point. Though here 100_000 with a 2 s cost is a bit heavy. – Joop Eggen Dec 19 '18 at 14:47
  • I guess the results depends on how sorted the array on input is. If the condition `if (result[j] < value)` is true very often, the difference might decrease drastically. – Robert Kock Dec 19 '18 at 14:53
  • If you look at the generated bytecode, your second version has an extra `ALOAD` (_retrieve object reference from local variable_) and `IALOAD` (_retrieve integer from array_) compared to your first version. If `result[pos]` had been a loop invariant, the compiler could've lifted the extra loads out of the inner loop, but that `pos = j` statement prevents that. – Michael Dec 19 '18 at 15:12

1 Answers1

2

When accessing an array element, for example result[pos], there is an overhead compared to accessing a direct variable, for example value, at least 2 more things have to happen:

  • Verifying range, for the example, verifying that pos >= 0 && pos < result.length
  • The address of the element needs to be calculated, something like "result address + (pos * element_size)" instead of just "value address". This shouldn't be significant, but if repeated enough might have an impact.
Roee Gavirel
  • 18,955
  • 12
  • 67
  • 94