I am learning Scala in my free time - and as a learning exercise, I translated some OCaml code that I wrote about in another StackOverflow question to Scala. Since I am new to Scala, I'd appreciate some advice...
But before asking my questions - here's the original OCaml code:
let visited = Hashtbl.create 200000
let rec walk xx yy =
let addDigits number =
let rec sumInner n soFar =
match n with
| x when x<10 -> soFar+x
| x -> sumInner (n/10) (soFar + n mod 10) in
sumInner number 0 in
let rec innerWalk (totalSoFar,listOfPointsToVisit) =
match listOfPointsToVisit with
| [] -> totalSoFar
| _ ->
innerWalk (
listOfPointsToVisit
(* remove points that we've already seen *)
|> List.filter (fun (x,y) ->
match Hashtbl.mem visited (x,y) with
| true -> false (* remove *)
| _ -> (Hashtbl.add visited (x,y) 1 ; true))
(* increase totalSoFar and add neighbours to list *)
|> List.fold_left
(fun (sum,newlist) (x,y) ->
match (addDigits x)+(addDigits y) with
| n when n<26 ->
(sum+1,(x+1,y)::(x-1,y)::(x,y+1)::(x,y-1)::newlist)
| n -> (sum,newlist))
(totalSoFar,[])) in
innerWalk (0,[(xx,yy)])
let _ =
Printf.printf "Points: %d\n" (walk 1000 1000)
...and here's the Scala code I translated it to:
import scala.collection.mutable.HashMap
val visited = new HashMap[(Int,Int), Int]
def addDigits(number:Int) = {
def sumInner(n:Int, soFar:Int):Int =
if (n<10)
soFar+n
else
sumInner(n/10, soFar+n%10)
sumInner(number, 0)
}
def walk(xx:Int, yy:Int) = {
def innerWalk(totalSoFar:Int, listOfPointsToVisit:List[(Int,Int)]):Int = {
if (listOfPointsToVisit.isEmpty) totalSoFar
else {
val newStep =
listOfPointsToVisit.
// remove points that we've already seen
filter(tupleCoords => {
if (visited.contains(tupleCoords))
false
else {
visited(tupleCoords)=1
true
}
}).
// increase totalSoFar and add neighbours to list
foldLeft( (totalSoFar,List[(Int,Int)]()) )( (state,coords) => {
val (sum,newlist) = state
val (x,y) = coords
if (addDigits(x)+addDigits(y) < 26)
(sum+1,(x+1,y)::(x-1,y)::(x,y+1)::(x,y-1)::newlist)
else
(sum,newlist)
});
innerWalk(newStep._1, newStep._2)
}
}
innerWalk(0, List((xx,yy)))
}
println("Points: " + walk(1000,1000))
The Scala code compiles and works correctly, reporting the proper result.
But...
Unless I missed something, I found no pipe operator in Scala (i.e. the
|>
of OCaml and F#) so I used the corresponding list methods (filter
andfold Left
). In this case the end result is pretty close to the original, but I am wondering - isn't the pipe operator a generally favorable - and more generic - approach for functional-style solutions? Why isn't Scala equipped with it?In Scala, I had to specifically initiate my folding state (which is a tuple of
(Int, List[Int,Int])
with a type-specific empty list. In plain words,List()
didn't cut it - I had to explicitly specifyList[(Int,Int)]()
, otherwise I got a... rather difficult error message. I deciphered it based on context - it complained aboutNothing
- and I realized the only place in this tiny code where a typeNothing
appeared could be my empty List. Still, the result is uglier, compared to OCaml... Is there anything better I can do?In the same vein, OCaml was able to pass the fold's resulting tuple as an argument to
innerWalk
. In Scala, I had to assign to a variable and invoke the tail-recursive call withinnerWalk(newStep._1, newStep._2)
. There appears to be no equivalence between tuples and function arguments - i.e. I can't pass a tuple of 2-arity in a function with two arguments - and similarly, I can't tuple-destructure the arguments of a function to variables (I had to explicitely assignstate
andcoords
and de-structure them inside the folding function's body. Am I missing something?
Overall, I am pleased with the result - I'd say that if we grade the OCaml code of this example at 100%, then Scala is at about 85-90% - it's a bit more verbose than OCaml, but it's much, much closer to OCaml than it is to Java. I am just wondering whether I used Scala to its full potential or whether I missed some constructs that would improve the code (more likely).
Note that I avoided mapping my original OCaml's pattern matching to Scala's, since in this case I think it was overkill - an if
expression is much clearer in both places.
Thanks in advance for any help / suggestions.
P.S. A side note - I added timing instructions around the walk
call (thus avoiding the startup cost of the JVM) and measured my Scala code - it runs at about 50% of OCaml's speed - which is, funnily enough, exactly the same speed I get out of Mono executing the F# equivalent code (see my original SO question to get the F# code, if you care about this sort of comparison). Since I currently work in enterprise environments, 50% speed is a price I'll gladly pay to write concise ML-like code and still get access to the vastness of the JVM/.NET ecosystems (databases, Excel-file generation, etc). Sorry OCaml, I did try you - but you can't fully "speak" Oracle :-)
EDIT 1: After the kind suggestions from @senia and @lmm, the code is significantly improved. Hoping for more advice from @lmm about how foldMap and Shapeless will additionally help :-)
EDIT 2: I cleared up the code further with flatMap from scalaz - gist is here. Unfortunately, the change also caused a massive 10x slowdown - guessing that the list concatenation done by foldMap is much slower than foldLeft's "add just one new node". Wondering how I can change the code to make the addition fast...
EDIT 3: After another suggestion from @lmm, I switched the scalaz-flatMap version from using List
to using immutable.Vector: This helped a lot, bringing the speed from 10x slower back to... only 2x slower (than the original code). So, clean code or 2x speed? Decisions, decisions... :-)