
I have two almost identical pieces of code. One is Scala running on the JVM, the other is JavaScript. Both perform lots of atan and asin calls (this is extracted from a real application that converts quaternions to Euler angles). The JavaScript implementation runs an order of magnitude faster.

The JS version takes about 1000 ms on my machine. The Scala code takes about 10 000 ms on the JVM, but when compiled with Scala.js it again runs in about 1000 ms (see ScalaFiddle).

What is the reason for such a huge performance difference? What changes would I have to make for the JVM code to run this fast?

// JavaScript version
var size = 100000;
var input = new Array(size);

function fromQuaternion(a, b, c) {
    return Math.asin(a) + Math.atan2(a, b) + Math.atan2(b, c);
}

function convert() {
    var sum = 0;
    for (var i = 0; i < size * 3; i += 3) {
        sum += fromQuaternion(input[i], input[i+1], input[i+2]);
    }
    return sum;
}

for (var i = 0; i < size * 3; i += 3) {
    input[i + 0] = Math.random();
    input[i + 1] = Math.random();
    input[i + 2] = Math.random();
}

var total = 0;
for (var i = 0; i < 10; i++) total += convert();

var start = Date.now();
for (var i = 0; i < 100; i++) total += convert();
var end = Date.now();
console.log("Duration " + (end - start));
console.log("Result " + total);
document.write("Duration " + (end - start));

  // Scala version (runs on the JVM and, via Scala.js, in the browser)
  val input = Array.fill(100000) {
    val x = util.Random.nextDouble()
    val y = util.Random.nextDouble()
    val z = util.Random.nextDouble()
    (x, y, z)
  }

  def fromQuaternion(a: Double, b: Double, c: Double): Double = {
    Math.asin(a) + Math.atan2(a, b) + Math.atan2(b, c)
  }

  def convert = {
    input.foldLeft(0.0) { (sum, q) =>
      sum + fromQuaternion(q._1, q._2, q._3)
    }
  }

  // warmup
  var sum = 0.0
  for (_ <- 0 until 10) sum += convert

  val start = System.currentTimeMillis()
  for (_ <- 0 until 100) sum += convert
  val end = System.currentTimeMillis()

  println(s"Duration ${end - start} ms")
  println(f"Sum $sum%f")

When I measure asin and atan2 separately (with fromQuaternion containing only a single asin or a single atan2), I get the following results:

  • JS atan2: 453 ms
  • JS asin: 230 ms
  • Java Math atan2: 1000 ms
  • Java Math asin: 3800 ms
  • Apache FastMath atan2: 1020 ms
  • Apache FastMath asin: 1400 ms

I have tested Apache FastMath as well. While its asin is faster than Java's Math.asin (1400 ms vs. 3800 ms), it is still far slower than what I see in the browser.
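For reference, the per-function JavaScript numbers above could be taken with a small harness along these lines. This is only a sketch: the names `convertWith`, `measure`, `asinOnly` and `atan2Only` are illustrative and do not appear in the original code, and the warm-up and run counts simply mirror the ones used in the question.

```javascript
// Hypothetical harness; helper names are illustrative, not from the original benchmark.
var size = 100000;
var input = new Array(size * 3);
for (var i = 0; i < size * 3; i++) input[i] = Math.random();

// Sum f over every consecutive triple of the flat input array.
function convertWith(f) {
    var sum = 0;
    for (var i = 0; i < size * 3; i += 3) {
        sum += f(input[i], input[i + 1], input[i + 2]);
    }
    return sum;
}

// Run `warmup` untimed passes, then time `runs` passes of convertWith(f).
function measure(name, f, warmup, runs) {
    var total = 0;
    for (var i = 0; i < warmup; i++) total += convertWith(f);
    var start = Date.now();
    for (var j = 0; j < runs; j++) total += convertWith(f);
    var duration = Date.now() - start;
    // Printing the checksum keeps the engine from discarding the work.
    console.log(name + ": " + duration + " ms (checksum " + total + ")");
    return { duration: duration, total: total };
}

// Variants containing only a single call each; unused arguments are ignored.
function asinOnly(a, b, c) { return Math.asin(a); }
function atan2Only(a, b, c) { return Math.atan2(a, b); }

measure("JS asin", asinOnly, 10, 100);
measure("JS atan2", atan2Only, 10, 100);
```

Passing the function as a parameter keeps the loop identical for both variants, although the indirect call may itself affect how the engine optimizes the loop, so the numbers are best treated as rough.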

My measurements were done with Oracle Java 8 SDK 1.8.0.161 (JVM) and Chrome 78.0.3904.108 (browser), both on x64 Windows 10 on an Intel Core i7 @ 3.2 GHz with 12 GB RAM.

Suma
  • See [this](https://stackoverflow.com/questions/39360403/how-can-node-js-be-faster-than-c-and-java-benchmark-comparing-node-js-c-java) – Shawn LaFrance Dec 07 '19 at 10:35
  • I bet @ShawnLaFrance's link is a big part of the answer. Also beware that benchmarking stuff, particularly with the JVM, is non-trivial, so that could be a factor as well. More [here](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). – T.J. Crowder Dec 07 '19 at 10:38
  • See also [this article on the V8 blog](https://v8.dev/blog/react-cliff) about a more recent optimization of numbers since the 2017 answer above. – T.J. Crowder Dec 07 '19 at 10:48
  • The problem with this question is that unless you get down-and-dirty with both the Node.js bytecode (initially, from Ignition) and machine code (ultimately, from TurboFan), and you also get down-and-dirty with the JVM's bytecode and ultimately HotSpot's optimized machine code (assuming you're using HotSpot), it's only possible to answer with speculation/educated guesses like the above, which don't make proper *answers*. – T.J. Crowder Dec 07 '19 at 10:50
  • Also you compare the iteration over an array with numbers to an array of tuples (?) – Jonas Wilms Dec 07 '19 at 10:55
  • I am not talking about node.js at all. All my measurements were done in the browser (recent Chrome). As for comparing iteration, the Scala.js experiment uses exactly the same source code as the JVM one, showing it is not the iteration that makes the difference. – Suma Dec 07 '19 at 12:05
  • @ShawnLaFrance The question you have linked to seems to talk about integer processing (prime computation). No integers are present here, only random doubles, which should prevent using any less precision than double. – Suma Dec 07 '19 at 12:10
  • I would also note that the JVM doesn't optimize anything until, generally, a function has been called 10,000 times. A warmup of 10x won't actually warm anything up. – Levi Ramsey Dec 08 '19 at 16:51
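To check the iteration point raised in the comments on the JavaScript side as well, the benchmark could be restructured so that it mirrors the Scala data layout: an array of triples folded with `reduce` instead of a flat array walked by index. This is only a sketch; `convertTriples`, `convertFlat` and `time` are illustrative names that are not part of the original code.

```javascript
// Illustrative only: an array of [x, y, z] triples, mirroring the Scala Array of tuples.
var size = 100000;
var triples = Array.from({ length: size }, function () {
    return [Math.random(), Math.random(), Math.random()];
});
// The same values laid out as in the original flat JavaScript array.
var flat = triples.flat();

function fromQuaternion(a, b, c) {
    return Math.asin(a) + Math.atan2(a, b) + Math.atan2(b, c);
}

// Counterpart of input.foldLeft(0.0) { (sum, q) => ... } in the Scala version.
function convertTriples(data) {
    return data.reduce(function (sum, q) {
        return sum + fromQuaternion(q[0], q[1], q[2]);
    }, 0);
}

// Index-based loop over the flat array, as in the original JavaScript version.
function convertFlat(data) {
    var sum = 0;
    for (var i = 0; i < data.length; i += 3) {
        sum += fromQuaternion(data[i], data[i + 1], data[i + 2]);
    }
    return sum;
}

// 10 untimed warm-up passes, then 100 timed passes, as in the question.
function time(label, f) {
    var total = 0;
    for (var i = 0; i < 10; i++) total += f();
    var start = Date.now();
    for (var j = 0; j < 100; j++) total += f();
    console.log(label + ": " + (Date.now() - start) + " ms, result " + total);
    return total;
}

time("triples + reduce", function () { return convertTriples(triples); });
time("flat loop", function () { return convertFlat(flat); });
```

If both variants take roughly the same time, the data layout is unlikely to explain the gap, which would be consistent with the Scala.js observation in the question.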

0 Answers