How can max() sometimes be a lot faster on a sequence of the same length?

Question

I'm asking this question solely to better understand how kotlin sequences work. I thought I had a solid grasp, but I cannot explain what I observed in a short test by what I know, so obviously I have a misconception somewhere.

My goal was to do a quick benchmark to compare the performance of lists vs. sequences when filtering for a criteria and then taking the maximum value of the result. This is an operation that occurs fairly often in some code I have, and I'm trying to decide whether or not it's worth rewriting it to use sequences instead of lists. It seems it would be, as sequence is consistently faster, but that is not the question here.

Rather, I would ask you to explain to me how the below described "artifact" can come about.

First of all, here's the complete test I ran:

fun `just checking the performance of sequences`() {
    val log = logger()
    var totaldif = 0L
    var seqFasterThanList = 0
    (1..1000).forEach {
        val testList = (1..6000000).toList().shuffled()
        val testSequence = testList.asSequence()

        // filter and find max
        val listDuration = measureTimeMillis {
            testList.filter { it % 2 == 0 }.max()
        }
        val sequenceDuration = measureTimeMillis {
            testSequence.filter { it % 2 == 0 }.max()
        }

        log.info("List: {} ms; Sequence: {} ms;", listDuration, sequenceDuration)
        if (sequenceDuration < listDuration) {
            seqFasterThanList++
            totaldif += (listDuration - sequenceDuration)
        }
    }

    log.info("sequence was faster {} out of 1000 runs. average difference: {} ms",
            seqFasterThanList, totaldif / seqFasterThanList)
}

The results mostly looked like this:

List: 299 ms; Sequence: 217 ms;
List: 289 ms; Sequence: 211 ms;
List: 288 ms; Sequence: 220 ms;

Except, every once in a while, about 1 in 20, the result looked more like this:

List: 380 ms; Sequence: 63 ms

As you can see, in these cases the operation was vastly faster. This is the kind of behaviour I would expect on operations like find or first, which can terminate early once they find a match. But by its very nature, max has to traverse the entire sequence to guarantee the result. So how is it possible that some times it can find a result more than 3 times as fast as it usually requires, with the same number of elements to traverse?

Could be more computer resources became available, the result of `suffle` lead to better branch prediction, the JIT compiler optimized something, etc... — Slaw, May 08 '19 at 10:20
what might also be interesting: if you skip the `.shuffled()`, you may get several peaks instead... — Roland, May 08 '19 at 10:46
and I also assume that's the JIT compiler optimizing something under the hood... maybe you want to do a microbenchmark instead of using `measureTimeInMillis`? — Roland, May 08 '19 at 10:51
Possible duplicate of [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) — Michael, May 08 '19 at 12:12
@Michael, It's not a duplicate of that question, but the link is useful. — Ilya, May 08 '19 at 14:38
@Ilya every JVM performance question that is based on the false premise of a inaccurate measurement is a duplicate of that question — Michael, May 08 '19 at 14:41

Yoni Gibbs · Accepted Answer · 2019-05-08T12:08:29.880

Further down is my original answer which, as @Slaw pointed out, wasn't actually answering what you asked (it was explaining why Sequence.filter is faster than Iterable.filter, not why Sequence.filter seems to be intermittently faster than it normally is). However, I'm leaving it below as it's related to what I think might be the answer to your actual question.

My guess is this might be related to garbage collection. As you can see from my original answer, when you call Iterable.filter you are causing lots of arrays to be copied, i.e. you're putting lots of stuff in memory, which has to be cleaned up at certain points. I wonder if it's this cleanup of stuff in memory created by the List tests, which is actually causing the anomalies. I think what might be happening is that every so often the garbage collector kicks in and does a full collection: this is causing the List test to slow down to slower than normal. And after this runs the memory is all cleaned up, which might be why the Sequence test is faster that time.

And the reason I suspect it's related to garbage collection is because I replicated your anomalies, then made one change: instead of calling testList.filter I call testList.filterTo, passing in an ArrayList of the same size as the list. That means that no array copying has to happen, and also the creation of the ArrayList is now outside of the timing:

val arrayList = ArrayList<Int>(testList.size)
val listDuration = measureTimeMillis {
    testList.filterTo(arrayList) { it % 2 == 0 }.max()
}

As soon as I did that, the anomalies disappeared. Maybe you can check on your system and see if this makes the anomalies disappear there too. It's intermittent so a bit difficult to know for sure.

This doesn't prove that it's garbage collection, of course, but I think it makes it a possible culprit. You could turn on GC logging to see if you wanted to know for sure. If you do, let us know what you find: it would be interesting to hear your results.

Original answer below (explaining why Iterable.filter is slower than Sequence.filter)

If you look at the source code for Iterable<T>.filter you'll see it does this:

public inline fun <T> Iterable<T>.filter(predicate: (T) -> Boolean): List<T> {
    return filterTo(ArrayList<T>(), predicate)
}

It creates a new ArrayList then loops round the items, checking the predicate against each one, and adding them to that array list if they match the predicate. This means that every X items (whatever the array list's default size is), the array list has to resize itself to allow more items in (i.e. create a new copy of the underlying array in which it's storing all its data).

In a sequence, however, the code is different:

public fun <T> Sequence<T>.filter(predicate: (T) -> Boolean): Sequence<T> {
    return FilteringSequence(this, true, predicate)
}

Here there isn't some underlying array storing all the items, so no copying of arrays has to take place. Instead, there's just an Iterator which will return the next item which matches the predicate whenever next is called.

You can see the details of how this is implemented in the FilteringSequence class:

internal class FilteringSequence<T>(
    private val sequence: Sequence<T>,
    private val sendWhen: Boolean = true,
    private val predicate: (T) -> Boolean
) : Sequence<T> {

    override fun iterator(): Iterator<T> = object : Iterator<T> {
        val iterator = sequence.iterator()
        var nextState: Int = -1 // -1 for unknown, 0 for done, 1 for continue
        var nextItem: T? = null

        private fun calcNext() {
            while (iterator.hasNext()) {
                val item = iterator.next()
                if (predicate(item) == sendWhen) {
                    nextItem = item
                    nextState = 1
                    return
                }
            }
            nextState = 0
        }

        override fun next(): T {
            if (nextState == -1)
                calcNext()
            if (nextState == 0)
                throw NoSuchElementException()
            val result = nextItem
            nextItem = null
            nextState = -1
            @Suppress("UNCHECKED_CAST")
            return result as T
        }

This seems to explain why using a sequence is faster than not, but unless I'm missing something, how would this explain why some runs using a sequence are faster than other runs _also using a sequence_. I believe the question is about the latter. — Slaw, May 08 '19 at 10:17
A very thourough, clear and informative rundown. Thanks a lot! — UncleBob, May 08 '19 at 15:07

How can max() sometimes be a lot faster on a sequence of the same length?

1 Answers1