0

Why is using lastIndex in intArray slower than using size - 1? The implementation of the lastIndex method uses get() = size - 1, but then why does it take so long?

fun main() {
    val tar = 2
    var nums = intArrayOf(0, 1, 2, 2, 3, 0, 4, 2)
    var count = 0
    var i = 0

    var begin = System.nanoTime()
    var n = nums.size - 1
    while (i <= n) {
        if (nums[i] != tar) {
            nums[count++] = nums[i]
        }
        i++
    }

    var end = System.nanoTime()
    println("${end - begin}")
    nums = intArrayOf(0, 1, 2, 2, 3, 0, 4, 2)
    count = 0
    i = 0

    begin = System.nanoTime()
    n = nums.lastIndex
    while (i <= n) {
        if (nums[i] != tar) {
            nums[count++] = nums[i]
        }
        i++
    }

    end = System.nanoTime()
    println("${end - begin}")
}

I tried googling and found nothing

TblPK
  • 11
  • 2
  • 1
    A lot of common pitfalls for improper benchmarking: small test samples, very few runs, reliance on `#nanoTime` whilst including non-critical code sections. I would be very suspect of the results from this test case as conclusive evidence of one being faster than the other. – Rogue Jul 10 '23 at 15:50
  • My assumption is that _size_ is calculated at compile time since _intArrayOf_ returns an immutable IntArray, and _lastIndex_ which uses generics is calculated at runtime. – lukas.j Jul 10 '23 at 16:38
  • 1
    This isn't a valid benchmark. Use a benchmarking library. @lukas.j, `lastIndex` does not involve generics at all. – Tenfour04 Jul 10 '23 at 17:23
  • The only difference in each case is a _single statement_ `n = nums.size - 1` vs `n = nums.lastIndex`. This single statement is likely to take so little time that it can't be measured. The call to `System.nanoTime()` itself probably takes _much_ longer than this single statement. – k314159 Jul 10 '23 at 17:30
  • 1
    See https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java - although it's very old, and in Java, it's still relevant. – k314159 Jul 10 '23 at 17:32

2 Answers2

2

As many people already said in comments, measuring the performance of JVM by writing such simple benchmarks is pretty much meaningless. Proper benchmarking requires running the code multiple times, doing a warmup, forking, making sure JIT optimizations are in use, but at the same time these optimizations don't remove our code, etc. Benchmarking generally requires some knowledge and skills. It is better to use tools like JMH which decrease the risk we do something wrong.

In your specific case I believe the difference is not even due to nums.size - 1 vs nums.lastIndex. The real reason is: when we get to the lastIndex line, it has to load the ArraysKt class where this function is located. You actually measured the class loading process, not lastIndex execution time. Add another lastIndex anywhere before the measured line, even for a different object, and the performance immediately improves.

Your case is a good example why we shouldn't write such benchmarks. We may accidentally measure something entirely different than we planned.

broot
  • 21,588
  • 3
  • 30
  • 35
  • But it is interesting: When only timing the actual lines _var n = nums.size - 1_ and _var n = nums.lastIndex_ and removing all the other code the difference is still roughly 1 to 10.000 (yes: one to ten thousand). If done with _listOf_ it is 1 to 2. – lukas.j Jul 10 '23 at 20:23
  • 1
    @lukas.j Do you mean if we replace an array with a list? Using `listOf()` internally loads the `ArraysKt` class. So this observation also supports the hypothesis that the performance hit is due to class loading. – broot Jul 10 '23 at 20:32
  • Yes, by changing from _intArrayOf_ to _listOf_ – lukas.j Jul 10 '23 at 20:38
  • 1
    Some of the comments in this answer are applicable only to JVM applications running under the JVM. For example, warmup is not applicable to applications compiled as Graal Native Image. Also, sometimes it can be useful to know how long something will take in total, _including_ class loading times. For example, a small Java text application running thousands of times in a shell script's loop. So it's not appropriate to _always_ ignore class loading times, but it usually is. – k314159 Jul 11 '23 at 07:54
0

Indeed, using lastIndex is slower, but the order of magnitude is quite surprising . So, let's deep dive into it :)

I slightly change your test, to move a "critical sections" to functions, and not use IO, variables reassignment, allocation, etc. while testing:

// actually, we should remove time-calculation from those methods
// as they are not important in the test we perform here
// 
// btw. kotlin supports time measurment in its standard library :)
// https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.system/measure-time-millis.html 

fun bySizeMinusOne(): Long {
    val tar = 2
    val nums = intArrayOf(0, 1, 2, 2, 3, 0, 4, 2)
    var count = 0
    var i = 0

    val begin = System.nanoTime()
    val n = nums.size - 1
    while (i <= n) {
        if (nums[i] != tar) {
            nums[count++] = nums[i]
        }
        i++
    }

    val end = System.nanoTime()
    return end - begin
}

fun byLastIndex(): Long {
    val tar = 2
    val nums = intArrayOf(0, 1, 2, 2, 3, 0, 4, 2)
    var count = 0
    var i = 0

    val begin = System.nanoTime()
    val n = nums.lastIndex
    // val n = nums.size - 1
    while (i <= n) {
        if (nums[i] != tar) {
            nums[count++] = nums[i]
        }
        i++
    }

    val end = System.nanoTime()
    return end - begin
}

So, let's run our tests:

fun main() {
    val bySize = bySizeMinusOne()
    val byLastIndex = byLastIndex()

    println(bySize)
    println(byLastIndex)
}

Is printing:

1090
22942410

So, a next step to find out what's going on is to decompile our code. The only difference is

> size minus one
L5
 LINENUMBER 16 L5
 ALOAD 1
 ARRAYLENGTH
 ICONST_1
 ISUB
 ISTORE 6

> last index
L5  
 LINENUMBER 35 L5   
 ALOAD 1    
 INVOKESTATIC kotlin/collections/ArraysKt.getLastIndex ([I)I    
 ISTORE 6   

And here we are. It's not so simple to call another class, as JVM needs to load the class.

So, let's "improve" our testing by pre-loading methods/classes we need:

fun main() {

    // preload
    val array = intArrayOf() // load class
    array.size
    array.lastIndex

    val byLastIndex = byLastIndex()
    val bySize = bySizeMinusOne()

    println(bySize)
    println(byLastIndex)
}

Then we get something like:

420
1620

So the overhead is expected (we can see the difference in the bytecode).

There is another aspect - when compiling a production-ready jar, there can be a number of optimizations applied. Whenever testing such things, we need to remember to move the critical section to a function which is not including things we have mentioned here (IO, memory allocation etc.). And we should not rely on one result - we should run tests multiple time and check things like average, max/min, percentiles etc.

Cililing
  • 4,303
  • 1
  • 17
  • 35