Parallelise a for that uses views (Scala)

Question

I'm very new to Scala! However, I have the following working solution to Euler Problem 4 that I would like to use par on, just to see if I can do it:

import scala.math

object Problem4 {
    def isPalindrome(x: Int): Boolean = {
        val string = x.toString
        string.reverseIterator.sameElements(string.iterator)
    }

    def getPairs(minimum: Int, maximum: Int) = {
        for (i <- minimum to maximum view;
             j <- minimum to maximum view)
             yield (i, j)
    }

    def getAnswer(numberOfDigits: Int): Int = {
        val maximum = math.pow(10, numberOfDigits).toInt
        val minimum = math.pow(10, numberOfDigits - 1).toInt
        val products = for {
                           pair <- getPairs(minimum, maximum)
                           product = pair match { case (i, j) => i * j } 
                           if isPalindrome(product)
                       } yield product
        products.par.max
    }

    def main(args: Array[String]) {
        val answer = getAnswer(4)
        println("Problem 4 answer: %s".format(answer))
    }
} // object Problem4

Project Euler 4 asks for 3-digit numbers, and I've noticed that finding the answer for 4-digit numbers takes 63 seconds on my PC and only uses one processor on my dual-core system. This is despite applying par to the end of the for expression.

How do I parallelise this using par? Ideally I'd like finding the answer for 4 digits to take 30-40 seconds. Thanks!

EDIT: I'm pretty sure getPairs returns a View:

    scala>     def getPairs(minimum: Int, maximum: Int) = {
         |         for (i <- minimum to maximum view;
         |              j <- minimum to maximum view)
         |              yield (i, j)
         |     }
    getPairs: (minimum: Int, maximum: Int)scala.collection.SeqView[(Int, Int),Seq[_]]

Moreover, adding par to the getPairs call returns a warning, still only uses one of my processors, and results in a java.lang.OutOfMemoryError: Java heap space exception:

[info] Loading project definition from M:\programming\testdriveneuler\src\problem4\project
[info] Set current project to euler (in build file:/M:/programming/testdriveneuler/src/problem4/)
[info] Compiling 1 Scala source to M:\programming\testdriveneuler\src\problem4\target\scala-2.9.2\classes...
[warn] M:\programming\testdriveneuler\src\problem4\src\main\scala\Problem4.scala:39: `withFilter' method does not yet exist on scala.collection.parallel.ParSeq[((Int, Int), Int)], using `filter' method instead
[warn]                            pair <- getPairs(minimum, maximum).par
[warn]                                 ^
[warn] one warning found

EDIT: I'm explicitly interested in calculating the answer to Euler problem 4 for the product of 2 4-digit numbers. For reference the answer is 99000099.

Don't use `view` except in the most simple cases. And, then, only use it if you can show you are gaining performance by doing so. The semantics of view can result in pessimization more easily than optimization, and, besides, it's buggy. — Daniel C. Sobral, May 10 '12 at 17:40
Daniel, thanks for your comment! Could you please provide references for the bugs in `view`? And in addition to `x to y view` would you also advise against using `x to y iterator`? — Asim Ihsan, May 10 '12 at 19:45
Search bug database for view bugs. Iterators are very reliable, and without the performance traps of views. — Daniel C. Sobral, May 10 '12 at 23:01

4e6 · Accepted Answer · 2012-05-10T15:47:39.853

3

So much complexity. It can be done with 2 functions

def isPalindrome(x: Int) = x.toString sameElements x.toString.reverse

def products(min: Int, max: Int) = for { 
  x <- min to max par; 
  y <- min to max par; 
  if isPalindrome(x * y) 
} yield (x, y, x * y)

scala> products(100, 999).maxBy(_._3)
res0: (Int, Int, Int) = (913,993,906609)

Update

(min to max).view returns SeqView, which represents lazy version of collection. (min to max).view.par returns ParSeq, parallel collection. In other words, calling par on lazy sequence will force it to evaluate. So, in this case you ought to choose between laziness and parallelism. It is hard to say what conversions are performed when you are moving from SeqView to ParSeq but this unnecessary complexity is causing OutOfMemoryError.

Update2

Yes, for is just a syntactic sugar over higher order operations on collections. Desugared version of for loop will be something like:

(min to max par) flatMap { x =>
  (min to max par)
    .filter(y => isPalindrome(x * y))
    .map(y => x * y)
}

edited May 10 '12 at 15:47

answered May 10 '12 at 14:17

4e6

10,696
4
52
62

This works for 4-digit numbers and doesn't result in a `OutOfMemory` exception and takes ~45 seconds, as opposed to ~63 seconds for the sequential case. @4e6, could you please explain to me why you didn't need to use `view` in your for loop? I don't understand that part. – Asim Ihsan May 10 '12 at 14:52
Wow, this answer only takes ~15 seconds when you use my optimised version of `isPalindrome`. I'm guessing because your version involves two strings, a reverse, and a comparison, whereas mine makes one string, two iterators, and a comparison. – Asim Ihsan May 10 '12 at 14:56
I think with this, http://stackoverflow.com/a/1059501/223301, I'm beginning to understand. `for` is just syntactic sugar, there's nothing special about it. When you break down the above into nested `flatMaps`s and a `withFilter`...I think my brain can just about figure out what's happening. – Asim Ihsan May 10 '12 at 15:27

score 1 · Answer 2 · answered May 10 '12 at 13:57

1

Add the .par at the call to getPairs;

pair <- getPairs(min, max).par

I suspect (but can't be sure on my phone) that the getPairs method is not returning a view; hence your for-loop performs your computation.

A simple way of verifying this is to leave out the last line (i.e. products.par.max - the evaluation of the for-loop) and see whether your program still performs the computation.

answered May 10 '12 at 13:57

oxbow_lakes

133,303
56
317
449

I've edited my answer; `getPairs` definitely returns a `View` and adding `par` to the `getPairs` call throws a warning, does not parallelise the operation, and results in a OutOfMemory "Java heap space" exception. – Asim Ihsan May 10 '12 at 14:07
Leaving out the last line as you've suggested results in the functional returning immediately. Does this imply that the for-loop does not perform the computation because its use of views makes it lazy? – Asim Ihsan May 10 '12 at 14:13

Luigi Plinge · Answer 3 · 2012-05-10T22:02:33.577

You can parallelize it like this:

  def isPalindrome (n: Int) = n.toString == n.toString.reverse
  val R = 100 until 1000
  val xs = for (i <- R; j <- R) yield i * j
  val pals = xs.par filter isPalindrome
  println (pals max)

(leave out the .par for non-parallel). However I found the parallel version is 3-4 times slower on my dual-core machine. Sometimes the overhead of parallelizing isn't worth it.

Edit: for giggles, here's a version using Akka (based on the Pi calculation tutorial). Its performance is slightly faster than using parallel collections as in 4e6's answer (on my machine 8.0s vs 9.1s), although that solution becomes almost identical in performance if you remove the .par on the second generator.

import akka.actor._
import akka.routing.RoundRobinRouter

sealed trait Euler4Message
case object Calculate extends Euler4Message
case class Work(range1: Seq[Int], range2: Seq[Int]) extends Euler4Message
case class Result(value: Int) extends Euler4Message
case class FinalResult(value: Int, duration: Long)

class Worker extends Actor {

  def calculate(r1: Seq[Int], r2: Seq[Int]): Int = {
    def isPalindrome(x: Int) = {
      val s = x.toString
      s.reverseIterator.sameElements(s.iterator)
    }
    val pals = for (i <- r1; j <- r2; if isPalindrome(i * j)) yield i * j
    pals.max
  } 

  def receive = { case Work(r1, r2) => sender ! Result(calculate(r1, r2)) }   
}

class Master(nrOfDigits: Int, nrOfWorkers: Int, chunkSize: Int) extends Actor {

  var nrOfResults: Int = 0
  var maxResult = 0
  var sentAll = false
  var nrMessages = 0

  val start: Long = System.currentTimeMillis
  val min = math.pow(10, nrOfDigits - 1).toInt
  val max = math.pow(10, nrOfDigits).toInt 
  val range = min until max
  val workerRouter = 
    context.actorOf(Props[Worker].withRouter(RoundRobinRouter(nrOfWorkers)))

  def receive = {
    case Calculate =>
      for (i <- range.grouped(chunkSize)) { 
        // grouped produces an Iterator, so is 'lazy'
        workerRouter ! Work(i, range)
        nrMessages += 1
      }
      sentAll = true

    case Result(value) =>
      if (value > maxResult) maxResult = value
      nrOfResults += 1
      if (sentAll && nrOfResults == nrMessages) {
         println("Result = "+ maxResult 
                +"\nTime in ms: "+ (System.currentTimeMillis - start))
         context.system.shutdown()
      }
  }
}

object Euler4 extends App {
  val master =  ActorSystem().actorOf(Props(new Master(4, 4, 50)))
  master ! Calculate
}

The good thing about actors is that you can use imperative code and still get parallelism. So, swap out the calculate method in the worker actor above for the below, and the whole thing completes in about 1.0s (an 8x improvement).

I found it works fastest with larger chunk sizes (try 1000), and make sure you have at least as many workers as processors.

  def calculate(r1: Seq[Int], r2: Seq[Int]): Int = {
    def isPalindrome(x: Int) = {
      val s = x.toString
      s.reverseIterator.sameElements(s.iterator)
    }
    var max = 0
    // count down so that we don't have to check if palindrome so often
    var i = r1.last
    while (i >= r1.head) {
      // for some reason counting down here increases the run-time :/
      var j = r2.head
      while (j <= r2.last) {
        val r = i * j
        if (r > max && isPalindrome(r)) max = r
        j += 1
      }
      i -= 1
    }
    max
  }

Are you sure you're not meant to use `view` for your `for` loop? Running this on my machine the `val xs` lines results in a `java.lang.OutOfMemory: Java heap space` exception. I think this is because `vs` is entirely in memory, instead of being lazy. Note that, as clarified in my question, I'm interested in Euler problem 4 for 4-digit numbers. — Asim Ihsan, May 10 '12 at 14:34
@Asim the above works fine on my machine, just using the default JVM settings. I get OutOfMemory if I try it with 4 digits. The trouble is, lazy and parallel don't really go well together - if you require both (and for it actually to be quicker), I think it's going to be more complicated than just adding a `.par`. — Luigi Plinge, May 10 '12 at 14:45
Do you know how complicated? I'm very interested in getting used to combining lazy and parallel expressions. If I can't do well I can't do it, but I thought the purpose of `par` was to drop it in anywhere and for expressions to magically parallelise. — Asim Ihsan, May 10 '12 at 14:48
@Asim I mean, you would have to roll your own parallelization, which wouldn't be too hard (just make a Worker class that processes a chunk, start a few in their own threads). This is what Actors do so using an actor library might be better. Note, 4e6's solution isn't lazy. It just gets around the memory problems by filtering **before** producing the output list, which is a great idea. — Luigi Plinge, May 10 '12 at 15:27
Haha you couldn't resist could you? :P. I'm a big fan of the Pi example in the Akka docs, I've been re-using it elsewhere as well. Thanks for your edit! — Asim Ihsan, May 11 '12 at 14:12
@Asim I've just been learning Akka so this is a case of "I have a hammer - oh, there's something I can hit!" — Luigi Plinge, May 11 '12 at 16:45

Parallelise a for that uses views (Scala)

3 Answers3

Update

Update2