
Scala's REPL is a wonderful playground for interactively testing pieces of code. Recently, I've been using it for performance comparisons: repeatedly executing an operation and comparing wall-clock times.

Here's an example I recently created to help answer an SO question [1][2]:

import java.lang.reflect.Method

// Figure out the performance difference between direct method invocation and reflection-based method.invoke

def invoke1[T, U](obj: Any, method: Method)(param: T): U =
  method.invoke(obj, Seq(param.asInstanceOf[java.lang.Object]): _*) match {
    case null => null.asInstanceOf[U] // invoke returns null for void methods or null results
    case x    => x.asInstanceOf[U]
  }

def time[T](b: => T):(T, Long) = {
    val t0 = System.nanoTime()
    val res = b
    val t = System.nanoTime() - t0
    (res, t)
}

class Test {
  def op(l:Long): Long = (2 until math.sqrt(l).toInt).filter(x=>l%x==0).sum
}

val t0 = new Test

val method = classOf[Test].getMethods.find(_.getName=="op").get

def timeDiff = {
  // time returns (result, nanos), so destructure in that order
  val (res, timeDirectCall) = time { (0 to 1000000).map(x => t0.op(x)) }
  val (res2, timeInvoke)    = time { (0 to 1000000).map(x => { val r: Long = invoke1(t0, method)(x); r }) }
  (timeInvoke - timeDirectCall).toDouble / timeDirectCall.toDouble
}


//scala> timeDiff
//res60: Double = 2.1428745665357445
//scala> timeDiff
//res61: Double = 2.1604176409796683

In another case I've been generating millions of random data points to compare concurrency models for an open source project. The REPL has been great for playing with different configurations without a code-compile-test cycle.

I'm aware of common benchmark pitfalls such as JIT optimizations and the need for warm-up.

My questions are:

  • Are there any REPL-specific elements to take into account when using it to perform comparative micro- or macro-benchmarks?

  • Are these measurements reliable relative to each other? i.e. can they answer the question: is A faster than B?

  • Are preliminary executions of the same code a good warm-up for the JIT compiler?

  • Any other issues to be aware of?

[1] Scala reflection: How to pass an object's method as parameter to another method

[2] https://gist.github.com/maasg/6808879

maasg
  • The REPL wraps your code into its own inner wrappings (so you can redefine vals/vars/functions/classes/objects and do other nasty things), so basically what you will be measuring is the time to compile your code, the time to wrap it, and finally the actual execution time, full of fluctuations due to a [pile of reasons](http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html) (but you said you're aware that that last component is unreliable). **Obviously such measurements are not reliable**. – om-nom-nom Oct 07 '13 at 21:16
  • @om-nom-nom wrap & compile are basically one-offs, which will account for some overhead, but that will be the same overhead for any options being tested, so relative scores should still be representative, or not? e.g. in the example above it shows a rough 2x slower, which is good enough info. – maasg Oct 07 '13 at 21:24

1 Answer


This is a great question. I can't imagine why anyone downvoted it.

The fact that one of the comments is totally wrong suggests that the REPL needs a place in scala-lang.org's FAQ or tutorial. I can't find the descriptive paper after a quick search.

The answer is yes, the REPL does what you expect.

Here is an old page on why the question is interesting: the REPL feels dynamic but is really statically compiled. It "straddles two worlds," as the extemporaneous comment on the linked page puts it.

The REPL compiles each line into its own wrapping object. Each such object imports symbols from the history of the interactive session, which is how code magically refers back to previous lines. Everything is compiled, so when it is run, it is run natively on the JVM, so to speak; there is not an extra layer of interpreter. That is the REPL's killer design feature.
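
This wrapping scheme can be pictured with a small sketch. The names below (line1, read, res0) are illustrative only; the actual generated names differ (the real REPL uses synthetic names along the lines of $line1.$read and adds more nesting):

```scala
// Illustrative sketch only: not the REPL's actual generated code,
// but the same shape -- each entered line becomes its own object,
// and later wrappers import the history of earlier ones.
object line1 {
  object read {
    val x = 2 // user enters: val x = 2
  }
}

object line2 {
  object read {
    import line1.read.x // prior history imported into the new wrapper
    val res0 = x + 1    // user enters: x + 1
  }
}
```

Because each wrapper is ordinary compiled bytecode, referring to `x` from a later line is just a field access on the earlier wrapper object, with no interpreter in between.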

That is why the answer to your question is yes, your code runs at the speed of compiled code. Invoking a method does not require recompiling all of history.

Here's another old link showing that other people have had the same question about timing and microbenchmarking.

There is currently an open issue to make it possible to customize how the REPL wraps lines of code. Microbenchmarking is an interesting use case, where code could be wrapped in an arbitrary framework for benchmarking. That will be coming soon.

The benchmark framework should take care of warm-ups. Since each expression submitted to the REPL is compiled separately (albeit by the same compiler), you would notice that a method could be invoked cold the first time and warm the second (modulo inlining by scalac).
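
As a minimal sketch of that warm-up idea, a wrapper like the following (reusing the question's `time` helper, redefined here so the snippet stands alone; the name `timeWarmed` and the iteration count are my own) discards the first few runs before measuring:

```scala
// Sketch: discard a few warm-up runs so the JIT has a chance to
// compile the hot path before the run we actually measure.
def time[T](b: => T): (T, Long) = {
  val t0 = System.nanoTime()
  val res = b
  (res, System.nanoTime() - t0)
}

def timeWarmed[T](warmups: Int)(b: => T): (T, Long) = {
  var i = 0
  while (i < warmups) { b; i += 1 } // warm-up runs, results discarded
  time(b)                           // the measured run
}

val (sum, nanos) = timeWarmed(5) { (1L to 100000L).sum }
```

A real framework such as JMH also handles forking, dead-code elimination, and statistical aggregation, which this sketch does not attempt.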

Caveat:

Use -Yrepl-class-based, or be careful not to put computations in the static initializer of the wrapping object.
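
To make the caveat concrete, here is a hedged sketch (object name `lineN` is illustrative) of why it matters: with object-based wrapping, a top-level statement executes inside the wrapper object's static initializer, which the JVM runs under the class-initialization lock.

```scala
// Sketch: with object wrapping, this val is computed during class
// initialization (the JVM's <clinit>), under the class-init lock.
// Heavy or blocking work there can skew timings or even deadlock
// if it touches other classes that are themselves initializing.
object lineN {
  val heavy: Long = (1L to 1000000L).sum
}
// -Yrepl-class-based wraps lines in classes instead, so the same
// code runs in instance context rather than in a static initializer.
```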

Here is some sample confusion and here is the same question, less concealed.

som-snytt
  • Thanks for the great answer and pointers. To your knowledge, is there a difference in how code is wrapped between `:paste`'d code and code entered line by line? And should I prefer one method over the other? – maasg Oct 08 '13 at 14:39
  • @maasg pasted code is wrapped in a single object and compilation unit (which is why companion objects must be pasted). In 2.11 :load file is line-by-line but :paste file is tout court. I just changed my -i init.script to :load imports.script, which is much faster than compiling each line. Referencing an object requires the usual $MODULE deref, but hardly a performance penalty. So there are a few compile-time edge cases, but no overhead at run-time at that level. – som-snytt Oct 09 '13 at 00:04
  • typo: s/:load imports.script/:paste imports.script, obviously. One way to speed up the repl startup is to reduce the number of runs at init to one. – som-snytt Oct 09 '13 at 06:41