15

I wrote the following code:

val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)    

for (i in src)
{
    if (i % 2 == 0) dest.add(Math.sqrt(i.toDouble()))
}

IntellJ (in my case AndroidStudio) is asking me if I want to replace the for loop with operations from stdlib. This results in the following code:

val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
src.filter { it % 2 == 0 }
   .mapTo(dest) { Math.sqrt(it.toDouble()) }

Now I must say, I like the changed code. I find it easier to write than for loops when I come up with similar situations. However upon reading what filter function does, I realized that this is a lot slower code compared to the for loop. filter function creates a new list containing only the elements from src that match the predicate. So there is one more list created and one more loop in the stdlib version of the code. Ofc for small lists it might not be important, but in general this does not sound like a good alternative. Especially if one should chain more methods like this, you can get a lot of additional loops that could be avoided by writing a for loop.

My question is what is considered good practice in Kotlin. Should I stick to for loops or am I missing something and it does not work as I think it works.

Jayson Minard
  • 84,842
  • 38
  • 184
  • 227
rozina
  • 4,120
  • 27
  • 49

4 Answers4

16

If you are concerned about performance, what you need is Sequence. For example, your above code will be

val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
src.asSequence()
    .filter { it % 2 == 0 }
    .mapTo(dest) { Math.sqrt(it.toDouble()) }

In the above code, filter returns another Sequence, which represents an intermediate step. Nothing is really created yet, no object or array creation (except a new Sequence wrapper). Only when mapTo, a terminal operator, is called does the resulting collection is created.

If you have learned java 8 stream, you may found the above explaination somewhat familiar. Actually, Sequence is roughly the kotlin equivalent of java 8 Stream. They share similiar purpose and performance characteristic. The only difference is Sequence isn't designed to work with ForkJoinPool, thus a lot easier to implement.

When there is multiple steps involved or the collection may be large, it's suggested to use Sequence instead of plain .filter {...}.mapTo{...}. I also suggest you to use the Sequence form instead of your imperative form because it's easier to understand. Imperative form may become complex, thus hard to understand, when there are 5 or more steps involved in the data processing. If there is just one step, you don't need a Sequence, because it just creates garbage and gives you nothing useful.

glee8e
  • 6,180
  • 4
  • 31
  • 51
  • Makes sense, but when I tried it, it was even slower with sequence. – rozina May 24 '17 at 10:25
  • 1
    That is... amazing. Are you sure that your benchmark is correct? I mean, with enough warm up, no JIT removing loops, repeated enough time so no edge case etc.. At least on my end the `Seqence` is faster. (About 25% faster) – glee8e May 24 '17 at 10:30
  • Might be I do not understand the compiler and JVM enough to be benchmarking correctly. I measure time before and after the loop. No warmup - what kind of warmup are we talking about? my results: sequence: 110 ms filter_map: 57 ms forEach: 48 ms for loop: 35 ms filter_map is using non sequence operations. – rozina May 24 '17 at 10:32
  • 2
    Sometimes several loops. even if they copy the whole collection, show good performance because of good locality of reference. See: https://stackoverflow.com/questions/35629159/kotlins-iterable-and-sequence-look-exactly-same-why-are-two-types-required – hotkey May 24 '17 at 10:32
  • 2
    @rozina [This SO post](https://stackoverflow.com/a/513259/5818889) explains how to right a correct benchmark on JVM. Though it's designed to educate Java programmers, it works for kotlin, too, since kotlin has JVM backends. Here is a gist of my test code: https://gist.github.com/Glease/900fe08d757631e97e230d90a9b4faa2 – glee8e May 24 '17 at 10:36
3

You're missing something. :-)

In this particular case, you can use an IntProgression:

val progression = 0 until 1_000_000 step 2

You can then create your desired list of squares in various ways:

// may make the list larger than necessary
// its internal array is copied each time the list grows beyond its capacity
// code is very straight forward
progression.map { Math.sqrt(it.toDouble()) }

// will make the list the exact size needed
// no copies are made
// code is more complicated
progression.mapTo(ArrayList(progression.last / 2 + 1)) { Math.sqrt(it.toDouble()) }

// will make the list the exact size needed
// a single intermediate list is made
// code is minimal and makes sense
progression.toList().map { Math.sqrt(it.toDouble()) }
mfulton26
  • 29,956
  • 6
  • 64
  • 88
  • The code was just a made up example. Still a cool trick :) Although, doesn' t creating progression create a new list? Probably with a loop? – rozina May 25 '17 at 10:55
  • 1
    No, a progression generates its elements when iterating it and only stores things like first, last, and step. – mfulton26 May 25 '17 at 11:42
1

My advice would be to choose whichever coding style you prefer. Kotlin is both object-oriented and functional language, meaning both of your propositions are correct.

Usually, functional constructs favor readability over performance; however, in some cases, procedural code will also be more readable. You should try to stick with one style as much as possible, but don't be afraid to switch some code if you feel like it's better suited to your constraints, either readability, performance, or both.

0

The converted code does not need the manual creation of the destination list, and can be simplified to:

val src = (0 until 1000000).toList()

val dest = src.filter { it % 2 == 0 }
              .map { Math.sqrt(it.toDouble()) }

And as mentioned in the excellent answer by @glee8e you can use a sequence to do a lazy evaluation. The simplified code for using a sequence:

val src = (0 until 1000000).toList()

val dest = src.asSequence()                      // change to lazy
              .filter { it % 2 == 0 }
              .map { Math.sqrt(it.toDouble()) }
              .toList()                          // create the final list

Note the addition of the toList() at the end is to change from a sequence back to a final list which is the one copy made during the processing. You can omit that step to remain as a sequence.

It is important to highlight the comments by @hotkey saying that you should not always assume that another iteration or a copy of a list causes worse performance than lazy evaluation. @hotkey says:

Sometimes several loops. even if they copy the whole collection, show good performance because of good locality of reference. See: Kotlin's Iterable and Sequence look exactly same. Why are two types required?

And excerpted from that link:

... in most cases it has good locality of reference thus taking advantage of CPU cache, prediction, prefetching etc. so that even multiple copying of a collection still works good enough and performs better in simple cases with small collections.

@glee8e says that there are similarities between Kotlin sequences and Java 8 streams, for detailed comparisons see: What Java 8 Stream.collect equivalents are available in the standard Kotlin library?

Jayson Minard
  • 84,842
  • 38
  • 184
  • 227