5

From the design of Scala's collections I understand that something like:

scala> BitSet(1,2,3) map (_ + "a")
res7: scala.collection.immutable.Set[String] = Set(1a, 2a, 3a)

doesn't build an intermediate datastructure: the new Set is built as the BitSet is iterated over using a Builder. In fact in that case it is obvious since a bitset of strings doesn't make sense.

What about maps from lists? I am pretty sure that the following builds an intermediate list:

scala> List(1,2,3) map (_ -> "foo") toMap
res8: scala.collection.immutable.Map[Int,java.lang.String] =
    Map(1 -> foo, 2 -> foo, 3 -> foo)

namely the list List((1,foo), (2,foo), (3,foo)). If not, then how? Now, what about the following?

scala> Map.empty ++ (List(1,2,3) map (_ -> "foo"))
res10: scala.collection.immutable.Map[Int,java.lang.String] =
    Map(1 -> foo, 2 -> foo, 3 -> foo)

This time, from what I seem to understand from the type of ++:

def ++ [B >: (A, B), That]
       (that: TraversableOnce[B])
       (implicit bf: CanBuildFrom[Map[A, B], B, That]): That

I think it might be the case that the map is built on the fly, and that no intermediate list is constructed.

Is it the case? If yes, is this the canonical way of ensuring deforestation or is there a more straightforward syntax?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Paul Brauner
  • 1,307
  • 1
  • 10
  • 17

2 Answers2

14

You can use breakOut to ensure that no intermediate collection is created. For example:

// creates intermediate list.
scala> List((3, 4), (9, 11)).map(_.swap).toMap 
res542: scala.collection.immutable.Map[Int,Int] = Map(4 -> 3, 11 -> 9)

scala> import collection.breakOut
import collection.breakOut

// doesn't create an intermediate list.
scala> List((3, 4), (9, 11)).map(_.swap)(breakOut) : Map[Int, Int]
res543: Map[Int,Int] = Map(4 -> 3, 11 -> 9)

You can read more about it here.

UPDATE:

If you read the definition of breakOut, you'll notice that it's basically a way of creating a CanBuildFrom object of expected type and passing it explicitly to the method. breakOut merely saves you from typing the following boilerplate.

// Observe the error message. This will tell you the type of argument expected.
scala> List((3, 4), (9, 11)).map(_.swap)('dummy)
<console>:16: error: type mismatch;
 found   : Symbol
 required: scala.collection.generic.CanBuildFrom[List[(Int, Int)],(Int, Int),?]
              List((3, 4), (9, 11)).map(_.swap)('dummy)
                                                ^

// Let's try passing the implicit with required type.
// (implicitly[T] simply picks up an implicit object of type T from scope.)
scala> List((3, 4), (9, 11)).map(_.swap)(implicitly[CanBuildFrom[List[(Int, Int)], (Int, Int), Map[Int, Int]]])
// Oops! It seems the implicit with required type doesn't exist.
<console>:16: error: Cannot construct a collection of type Map[Int,Int] with elements of type (Int, Int) based on a coll
ection of type List[(Int, Int)].
              List((3, 4), (9, 11)).map(_.swap)(implicitly[CanBuildFrom[List[(Int, Int)], (Int, Int), Map[Int, Int]]])

// Let's create an object of the required type ...
scala> object Bob extends CanBuildFrom[List[(Int, Int)], (Int, Int), Map[Int, Int]] {
     |   def apply(from: List[(Int, Int)]) = foo.apply
     |   def apply() = foo.apply
     |   private def foo = implicitly[CanBuildFrom[Nothing, (Int, Int), Map[Int, Int]]]
     | }
defined module Bob

// ... and pass it explicitly.
scala> List((3, 4), (9, 11)).map(_.swap)(Bob)
res12: Map[Int,Int] = Map(4 -> 3, 11 -> 9)

// Or let's just have breakOut do all the hard work for us.
scala> List((3, 4), (9, 11)).map(_.swap)(breakOut) : Map[Int, Int]
res13: Map[Int,Int] = Map(4 -> 3, 11 -> 9)
Community
  • 1
  • 1
missingfaktor
  • 90,905
  • 62
  • 285
  • 365
  • Thanks, this is exactly what I was looking for. What I'm not sure of is why scala complains about `List((3, 4), (9, 11)).map(_.swap) : Map[Int,Int]` instead of using the type constraint I have explicitly put to pick the right implicit (exactly as read picks the right typeclass instance in haskell depending on the context). Is it just that implicit don't work that way (I would totally understand it, I know subtyping makes everything hard) or did I overlook something in that special case? – Paul Brauner Nov 21 '11 at 09:05
  • 3
    @DuncanMcGregor: I haven't closed the REPL since past 5 days. I use it heavily in my everyday development. :-) – missingfaktor Nov 21 '11 at 11:58
  • 3
    @DuncanMcGregor: My Windows finally crashed. :-D – missingfaktor Nov 21 '11 at 12:29
  • 1
    @missingfaktor My fault I'm sure http://www.lotusmuseum.com/files/Mousepad_VIP_Dilbert.JPG – Duncan McGregor Nov 21 '11 at 13:24
  • @DuncanMcGregor: LOL. Good one. :-) – missingfaktor Nov 21 '11 at 14:40
  • So there's no `implicitly[CanBuildFrom[List[(Int, Int)], (Int, Int), Map[Int, Int]]]` in scope but there is a `implicitly[CanBuildFrom[Nothing, (Int, Int), Map[Int, Int]]]`. Is that right? – Paul Brauner Nov 21 '11 at 17:06
  • Ok, I see, Nothing is a marker to say "not from any collection". It could as well be Foo it wouldn't change anything, as long as it doesn't clash with a collection, and that the convention is respected by everyone so that breakOut can pick it. – Paul Brauner Nov 21 '11 at 17:39
  • @PaulBrauner: Yes, you can think about it that way. – missingfaktor Nov 21 '11 at 18:05
3

Example 1) Correct, there's no intermediate List

2) Yes, you get an itermediate List.

3) Again yes, you get an intermeditate List from what you have in the parentheses. There's no "magic" going on. If you have something in parentheses, it gets evaluated first.

I'm not sure what you mean by "deforestation" here: according to Wikipedia it means eliminating tree structures. If you mean eliminating intermediate lists, you should use a view. See for example here: summing a transformation of a list of numbers in scala

So without intermediate results, your examples would be

BitSet(1,2,3).view.map(_ + "a").toSet

(toSet is required because otherwise you have an IterableView[String,Iterable[_]])

List(1,2,3).view.map(_ -> "foo").toMap

Map.empty ++ (List(1,2,3).view.map(_ -> "foo"))

There is also a force method for executing the transformation operations, but this seems to have a nasty habit of giving you a more general type back (perhaps someone can comment with a reason):

scala> Set(1,2,3).view.map(_ + 1).force
res23: Iterable[Int] = Set(2, 3, 4)
Community
  • 1
  • 1
Luigi Plinge
  • 50,650
  • 20
  • 113
  • 180
  • Deforestation means eliminating intermediate tree structures. A list is just a degenerate tree where the each node `::` has an element leaf as the left child and either another `::` node or a `Nil` leaf as the right child, so the term can be applied to lists as well. – hammar Nov 21 '11 at 04:41
  • Thanks. About 3) I was not expecting magic to happen but I was hoping the typing context would lead ++ into picking the right implicit for the job (exactly like read picks the right typeclass instance in haskell). – Paul Brauner Nov 21 '11 at 08:50