Why does a small change to this Scala code make such a huge difference to performance?

Question

I'm running on a 32-bit Debian 6.0 (Squeeze) system (a 2.5 GHz Core 2 CPU), sun-java6 6.24-1 but with the Scala 2.8.1 packages from Wheezy.

This code, compiled with scalac -optimise, takes over 30 seconds to run:

object Performance {

  import scala.annotation.tailrec

  @tailrec def gcd(x:Int,y:Int):Int = {
    if (x == 0)
      y 
    else 
      gcd(y%x,x)
  }

  val p = 1009
  val q = 3643
  val t = (p-1)*(q-1)

  val es = (2 until t).filter(gcd(_,t) == 1)
  def main(args:Array[String]) {
    println(es.length)
  }
}

But if I make the trivial change of moving the `val es =` line down by one, into the scope of `main`, it runs in just 1 second, which is much closer to what I expected and comparable with the performance of equivalent C++. Interestingly, leaving `val es =` where it is but marking it `lazy` has the same accelerating effect.
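For reference, the two faster variants described above might look like the sketch below. The object names `PerformanceInMain` and `PerformanceLazy` are made up for illustration, and `main` is written with an explicit `: Unit =` rather than the procedure syntax used in the original; otherwise the code is unchanged.

```scala
import scala.annotation.tailrec

// Variant A: the filter runs inside main instead of the object initializer.
// (PerformanceInMain is a hypothetical name.)
object PerformanceInMain {
  @tailrec def gcd(x: Int, y: Int): Int =
    if (x == 0) y else gcd(y % x, x)

  val p = 1009
  val q = 3643
  val t = (p - 1) * (q - 1)

  def main(args: Array[String]): Unit = {
    val es = (2 until t).filter(gcd(_, t) == 1)
    println(es.length)
  }
}

// Variant B: es stays at object level but is lazy, so it is first
// evaluated when main touches it, after the object initializer has finished.
// (PerformanceLazy is a hypothetical name.)
object PerformanceLazy {
  @tailrec def gcd(x: Int, y: Int): Int =
    if (x == 0) y else gcd(y % x, x)

  val p = 1009
  val q = 3643
  val t = (p - 1) * (q - 1)

  lazy val es = (2 until t).filter(gcd(_, t) == 1)

  def main(args: Array[String]): Unit = {
    println(es.length)
  }
}
```

In both variants the expensive `filter` is executed from an ordinary method call rather than while the object is being initialized.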

What's going on here? Why is performing the calculation outside function scope so much slower?

I see the same thing in 2.8.1 (Sun Java 1.6.0_24-b07) without `-optimize`, FWIW. – overthink, Jun 07 '11 at 13:37
Mentioning optimize at all was a bit of a red herring, sorry; I've never actually seen using it (or not) make any significant difference to the performance of any Scala stuff I've worked on. – timday, Jun 07 '11 at 13:41
Interesting... the same thing happens on a similar system with Scala 2.9.0, with and without `-optimize`. – Kim Stebel, Jun 07 '11 at 13:45

Rex Kerr · Accepted Answer · 2011-06-07T13:43:55.737

53

The JVM doesn't optimize static initializers (which is what this is) to the same level that it optimizes method calls. Unfortunately, when you do a lot of work there, that hurts performance, and this is a perfect example. This is also one reason why the old `Application` trait was considered problematic, and why Scala 2.9 has a `DelayedInit` trait that gets a bit of compiler help to move code from the initializer into a method that is called later on.

(Edit: fixed "constructor" to "initializer". Rather lengthy typo!)
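As a rough sketch of the `DelayedInit` route: in Scala 2.9 and later 2.x releases the `App` trait is built on `DelayedInit`, so the body of an object extending it is deferred and run from the inherited `main` method. The object name `PerformanceApp` is hypothetical, and this assumes a Scala 2.x compiler (the mechanism was dropped in Scala 3). No timings were measured for this version.

```scala
import scala.annotation.tailrec

// Assumes Scala 2.9 or a later 2.x release, where App extends DelayedInit:
// the statements below are collected by the compiler and executed when the
// inherited main method runs, not in the object's static initializer.
// (PerformanceApp is a hypothetical name.)
object PerformanceApp extends App {
  @tailrec def gcd(x: Int, y: Int): Int =
    if (x == 0) y else gcd(y % x, x)

  val p = 1009
  val q = 3643
  val t = (p - 1) * (q - 1)

  val es = (2 until t).filter(gcd(_, t) == 1)
  println(es.length)
}
```

This keeps the code looking like the original script-style object while avoiding a per-`val` `lazy` annotation.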

edited Jun 07 '11 at 13:43

answered Jun 07 '11 at 13:38

Rex Kerr

Is that something Scala-specific or generally true? Because I don't see why the JIT wouldn't optimize long static initializers in normal Java code (or, to be more exact, why it would distinguish between static code blocks and normal code). – Voo Jun 07 '11 at 16:22
@Voo, this is true in general, and affects even Java static code. It seems to be a JVM design decision; one possible hypothesis: given that static initializers only run once, the JVM authors didn't want to invest as much time optimizing that use case. – notnoop Jun 07 '11 at 16:30
@Rex Kerr, would changing the code to `class Performance` and `object Performance extends Performance` also make it optimizable? – pedrofurla Jun 07 '11 at 19:24
@pedrofurla - No, it's still executed in a static context. What you are suggesting is what the old `Application` trait was doing. If you turned it into a `lazy val` it would probably work in this case, since it wouldn't be evaluated until it was needed in the `main` method. – Rex Kerr Jun 07 '11 at 20:23
Yes, simply changing the declaration to `lazy` speeds it up just as much as moving the declaration into `main` (and given that `lazy` does the job, I'm a little surprised that something like `DelayedInit` is considered necessary as well). – timday Jun 07 '11 at 21:20
@timday - Adding lazy to every single expensive computation gets tiring after a while. It's nice to have a mechanism to do it all at once. – Rex Kerr Jun 07 '11 at 22:13
@notnoop I don't see why. Sure, the static initializer itself isn't optimized, just as any `main` method that is only called once won't be. But if we have some complex computation inside, why wouldn't that be JITed according to the usual rules? – Voo Jun 08 '11 at 00:12
@notnoop. Static initializers, on the other hand, can really only be run once (per ClassLoader) and have non-trivial semantics: check JLS 12.4.2. The JVM designers basically decided it wasn't worth the effort. The `main` method, on the other hand, requires a lot of optimizations to actually make the code faster. Also, the `main` method, just like any other static method, can actually run multiple times (you can actually call the main method yourself... shocking!). – notnoop Jun 08 '11 at 18:20

score 41 · Answer 2 · answered Jun 07 '11 at 13:40

Code inside a top-level object block is translated to a static initializer on the object's class. The equivalent in Java would be

class Performance{
    static{
      //expensive calculation
    }
    public static void main(String[] args){
      //use result of expensive calculation
    }
}

The HotSpot JVM doesn't perform any optimizations on code encountered during static initializers, under the reasonable heuristic that such code will only be run once.
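One way to observe the difference on a given setup is to time the same computation once from the object body and once from `main`. The following is only a sketch: `InitTiming` and `timed` are made-up names, and the second measurement may also benefit from JIT warm-up caused by the first, so treat the numbers as indicative rather than as a proper benchmark.

```scala
import scala.annotation.tailrec

// Hypothetical timing harness; names are illustrative only.
object InitTiming {
  @tailrec def gcd(x: Int, y: Int): Int =
    if (x == 0) y else gcd(y % x, x)

  val t = (1009 - 1) * (3643 - 1)

  // Runs a block and returns its result together with elapsed milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  // Evaluated while the object is still being initialized.
  val (inInit, initMs) = timed((2 until t).count(gcd(_, t) == 1))

  def main(args: Array[String]): Unit = {
    // Evaluated from an ordinary method, after initialization has completed.
    val (inMain, mainMs) = timed((2 until t).count(gcd(_, t) == 1))
    println("initializer: " + inInit + " in " + initMs + " ms")
    println("main method: " + inMain + " in " + mainMs + " ms")
  }
}
```

Both measurements compute the same count, so any gap between the two timings comes from where the code runs rather than from what it does.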

Dave Griffith

Yes, this is fascinating. I come to this with the perspective of someone who last touched Java in the mid-90s, but was lured back to the JVM by Scala being a much more attractive language (to me anyway) than Java. So I do tend to get a nasty surprise when the realities of the underlying technology are unexpectedly revealed by things like this. – timday Jun 07 '11 at 14:10
Especially with immutability, I suspect it to be a common pattern to do much computation inside the initialization phase of a class. – ziggystar Jun 07 '11 at 14:37
@ziggystar - This specifically affects _static_ initializers. – Rex Kerr Jun 07 '11 at 16:42
And since it only affects static initializers, it only affects Scala singleton objects, not class objects – Dave Griffith Jun 07 '11 at 16:52
