
So I tried coming up with different implementations of pi estimation, based on the Monte Carlo method. There are 3 implementations -

  1. Sequential - Fastest results
  2. Scala parallel collections - Slowest results
  3. Scala futures - Faster than parallel collections, but slower than sequential

I ran these benchmarks on a fresh m4.xlarge instance on AWS, using ScalaMeter, from within the sbt shell.
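
For reference, the build pulls in roughly the following dependencies in build.sbt (exact versions may differ) -

// ScalaMeter for the measurements, plus the parallel collections module needed for .par on Scala 2.13
libraryDependencies ++= Seq(
  "com.storm-enroute" %% "scalameter-core" % "0.21",
  "org.scala-lang.modules" %% "scala-parallel-collections" % "1.0.4"
)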

This is the sequential runner -

package montecarlo.sequential

import java.lang.Math.random

object Runner extends App {

  val numberOfPoints = if (args.length > 0) {
    args(0).toInt
  } else {
    3 // Some default
  }

  import org.scalameter._

  // A point (x, y) in [0, 1) x [0, 1) lies inside the quarter circle iff x^2 + y^2 < 1
  def isWithinBounds(pair: (Double, Double)) = pair._1 * pair._1 + pair._2 * pair._2 < 1

  // Fraction of random points inside the quarter circle, times 4, estimates pi
  def piPredictor(numberOfPoints: Int) = {
    (1 to numberOfPoints).map(_ => (random(), random())).count(isWithinBounds) * 4.0 / numberOfPoints
  }

  def runExperiment(numberOfPoints: Int) = withWarmer(new Warmer.Default) measure {
    piPredictor(numberOfPoints)
  }

  def reporter(t: (Quantity[Double], Int)): Unit = println(s"It took ${t._1} for ${t._2} points")

  def raise10To(exponent: Int) = Math.pow(10, exponent).toInt

  // Run one experiment each for 10^1 .. 10^numberOfPoints points
  (1 to numberOfPoints).map(raise10To).map(numberOfPoints => (runExperiment(numberOfPoints), numberOfPoints)).foreach(reporter)

}

This is the parallel collections runner - note the only real difference is using the par method.

package montecarlo.parallelcollections

import java.lang.Math.random
import scala.collection.parallel.CollectionConverters._

object Runner extends App {

  val numberOfPoints = if (args.length > 0) {
    args(0).toInt
  } else {
    3 // Some default
  }

  import org.scalameter._

  def isWithinBounds(pair: (Double, Double)) = pair._1 * pair._1 + pair._2 * pair._2 < 1

  def piPredictor(numberOfPoints: Int) = {
    (1 to numberOfPoints).par.map(_ => (random(), random())).count(isWithinBounds) * 4.0 / numberOfPoints
  }

  def runExperiment(numberOfPoints: Int) = withWarmer(new Warmer.Default) measure {
    piPredictor(numberOfPoints)
  }

  def reporter(t: (Quantity[Double], Int)): Unit = println(s"It took ${t._1} for ${t._2} points")

  def raise10To(exponent: Int) = Math.pow(10, exponent).toInt

  (1 to numberOfPoints).map(raise10To).map(numberOfPoints => (runExperiment(numberOfPoints), numberOfPoints)).foreach(reporter)

}

And finally, this is the runner with Futures -

package montecarlo.fixedfutures

import java.lang.Math.random

import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, Future}

object Runner extends App {

  implicit val executionContext: ExecutionContext = scala.concurrent.ExecutionContext.Implicits.global

  val numberOfPoints = if (args.length > 0) {
    args(0).toInt
  } else {
    3 // Some default
  }

  val numberOfThreads = if (args.length > 1) {
    args(1).toInt
  } else {
    2 // 2 threads as default
  }

  import org.scalameter._

  def isWithinBounds(pair: (Double, Double)) = pair._1 * pair._1 + pair._2 * pair._2 < 1

  // Split the work evenly across numberOfThreads futures and sum the per-future hit counts
  def piPredictor(numberOfPoints: Int) = {
    def compute(n: Int) = (1 to n).map(_ => (random(), random())).count(isWithinBounds)
    val partsF: Seq[Future[Int]] = (1 to numberOfThreads).map(_ => Future(compute(numberOfPoints / numberOfThreads)))
    partsF.map(x => Await.result(x, Duration.Inf)).sum * 4.0 / numberOfPoints
  }

  def runExperiment(numberOfPoints: Int) = withWarmer(new Warmer.Default) measure {
    piPredictor(numberOfPoints)
  }

  def reporter(t: (Quantity[Double], Int)): Unit = println(s"It took ${t._1} for ${t._2} points")

  def raise10To(exponent: Int) = Math.pow(10, exponent).toInt

  (1 to numberOfPoints).map(raise10To).map(numberOfPoints => (runExperiment(numberOfPoints), numberOfPoints)).foreach(reporter)

}
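
As an aside (not what was benchmarked), the per-future Await.result calls above could be collapsed into a single blocking point with Future.sequence. A minimal sketch of that variant, assuming the same imports, implicit execution context and helpers as the runner above -

  // Hypothetical variant: block once on the combined future instead of once per part
  def piPredictorSequenced(numberOfPoints: Int): Double = {
    def compute(n: Int) = (1 to n).map(_ => (random(), random())).count(isWithinBounds)
    val partsF: Seq[Future[Int]] = (1 to numberOfThreads).map(_ => Future(compute(numberOfPoints / numberOfThreads)))
    Await.result(Future.sequence(partsF), Duration.Inf).sum * 4.0 / numberOfPoints
  }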

This is the time taken for each of those benchmarks -

Sequential run report on m4.xlarge -
sbt:scala-parallel-programs> run 7
[info] running montecarlo.sequential.Runner 7
It took 0.050859 ms for 10 points
It took 0.057532 ms for 100 points
It took 0.198189 ms for 1000 points
It took 1.391727 ms for 10000 points
It took 10.373825 ms for 100000 points
It took 75.328267 ms for 1000000 points
It took 1162.63124 ms for 10000000 points

Parallel collections report - very slow compared to the sequential version
[info] running montecarlo.parallelcollections.Runner 7
It took 1.677748 ms for 10 points
It took 1.061964 ms for 100 points
It took 0.562431 ms for 1000 points
It took 3.582251 ms for 10000 points
It took 30.981812 ms for 100000 points
It took 332.464702 ms for 1000000 points
It took 3252.623377 ms for 10000000 points

And lastly, the Future runner on m4.xlarge -

1 thread - on average, slower than sequential...
[info] running montecarlo.fixedfutures.Runner 7 1
It took 0.245685 ms for 10 points
It took 0.260486 ms for 100 points
It took 0.366152 ms for 1000 points
It took 0.799212 ms for 10000 points
It took 6.76789 ms for 100000 points
It took 94.150547 ms for 1000000 points
It took 1090.711087 ms for 10000000 points

2 threads - way slower than sequential... I don't know, it makes no sense...
[info] running montecarlo.fixedfutures.Runner 7 2
It took 0.226309 ms for 10 points
It took 0.192723 ms for 100 points
It took 0.241403 ms for 1000 points
It took 2.342587 ms for 10000 points
It took 22.278208 ms for 100000 points
It took 229.475656 ms for 1000000 points
It took 2400.402471 ms for 10000000 points

10 threads - what... this is weird...
[info] running montecarlo.fixedfutures.Runner 7 10
It took 0.193699 ms for 10 points
It took 0.31988 ms for 100 points
It took 0.62585 ms for 1000 points
It took 3.040552 ms for 10000 points
It took 24.241802 ms for 100000 points
It took 310.822098 ms for 1000000 points
It took 3088.061321 ms for 10000000 points

I was expecting performance gains, because this is a pleasantly parallel problem, and was mainly expecting a good performance gain in operations like count.

1 Answer

java.lang.Math.random() is synchronized (as stated in its docs), which means your map is mostly sequential: every worker thread contends for the same shared generator.

You may want to use java.util.concurrent.ThreadLocalRandom.current() instead.
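
A minimal sketch of that change, applied to the parallel collections predictor from the question (the rnd helper is just for readability; everything else is as in your code) -

import java.util.concurrent.ThreadLocalRandom
import scala.collection.parallel.CollectionConverters._

def isWithinBounds(pair: (Double, Double)) = pair._1 * pair._1 + pair._2 * pair._2 < 1

// Each worker thread draws from its own generator, so there is no contention on a shared one
def piPredictor(numberOfPoints: Int): Double = {
  def rnd() = ThreadLocalRandom.current().nextDouble()
  (1 to numberOfPoints).par.map(_ => (rnd(), rnd())).count(isWithinBounds) * 4.0 / numberOfPoints
}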