4

In the sample code below, why does the Iterable[String] test1 produce a Set after mapping?

val foo = Map("a" -> 1, "b" -> 1)
val test1: Iterable[String] = foo.keys
val test2: Iterator[String] = foo.keys.toIterator

println(test1.map(foo).size) // 1
println(test2.map(foo).size) // 2

I was puzzled by this because its entirely counter-intuitive when reading the code. Even though foo.keys just returns an Iterable, it creates a Set when calling map, as the reflection code shows:

println(test1.map(foo).getClass.getName) // immutable.Set.Set1
println(test2.map(foo).getClass.getName) // Iterator$$anon$11

How does the standard library determine that it should create an immutable.Set here, even though the inferred type of the collection is just Iterable[String]?

Chris
  • 2,057
  • 2
  • 16
  • 20

3 Answers3

5

foo.keys returns a Set (despite its return type being more general) and calling map on a Set produces another Set. The inferred or compile time type is not always the most precise.

You can see that the keys method on a Set returns a Set even though the return type is Iterable[A]:

scala> Map(1 -> 2).keys
res0: Iterable[Int] = Set(1)
Kim Stebel
  • 41,826
  • 12
  • 125
  • 142
  • If this were the case, it wouldn't have puzzled me. But in fact: foo.keySet returns a Set. foo.keys returns an Iterable. – Chris Aug 20 '16 at 16:16
  • 2
    @Chris `Set` is a subclass of `Iterable`, so `Map.keys` just calls `keySet` and returns the `Set` as an `Iterable`: https://github.com/scala/scala/blob/v2.11.8/src/library/scala/collection/MapLike.scala#L192 – Kolmar Aug 20 '16 at 16:24
  • 1
    I still didn't know why `set.map` results in a set. Like the OP, I thought the mechanism was encoded in static types. – som-snytt Aug 21 '16 at 04:25
2

Excavating Kolmar's comment that, although an implicit argument is in play that determines how the result collection is built, in this case the source collection is simply queried for the builder to use.

Iterable.map:

def map[B, That](f: (A) ⇒ B)(implicit bf: CanBuildFrom[Iterable[A], B, That]): That

Implicit scope includes types related to the type args, including Iterable and Int.

Iterable defines a "generic" CanBuildFrom that invokes genericBuilder on the source collection. That's how the result type is tied to the source.

Conversely, the result collection is divorced from the source by taking a CanBuildFrom[From = Nothing, _, _]. This is how cc.to[Set] is expressed, where a Set is built without regard for the source collection cc. For operations such as map, the method collection.breakOut provides such a CanBuildFrom, where the result type can be usefully inferred.

You can inject an arbitrary CanBuildFrom for the desired behavior:

$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_92).
Type in expressions for evaluation. Or try :help.

scala> val m = Map("a" -> 1, "b" -> 1)
m: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 1)

scala> val k = m.keys
k: Iterable[String] = Set(a, b)

scala> import collection.{generic, mutable}, generic.{CanBuildFrom => CBF}, mutable.ListBuffer
import collection.{generic, mutable}
import generic.{CanBuildFrom=>CBF}
import mutable.ListBuffer

scala>   implicit def `as list`: CBF[Iterable[_], Int, List[Int]] =
     |     new CBF[Iterable[_], Int, List[Int]] {
     |       def apply() = new ListBuffer[Int]
     |       def apply(from: Iterable[_]) = apply()
     |     }
as$u0020list: scala.collection.generic.CanBuildFrom[Iterable[_],Int,List[Int]]

scala> k.map(m)
res0: List[Int] = List(1, 1)

Worth adding that completion can show types as of 2.11.8:

scala> k.map(m) //print<tab>

$line4.$read.$iw.$iw.k.map[Int, Iterable[Int]]($line3.$read.$iw.$iw.m)(scala.collection.Iterable.canBuildFrom[Int]) // : Iterable[Int]

Using breakOut:

scala> k.map(m)(collection.breakOut)
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 1)

scala> k.map(m)(collection.breakOut) //print

$line4.$read.$iw.$iw.k.map[Int, scala.collection.immutable.IndexedSeq[Int]]($line3.$read.$iw.$iw.m)(scala.collection.`package`.breakOut[Any, Int, scala.collection.immutable.IndexedSeq[Int]](scala.Predef.fallbackStringCanBuildFrom[Int])) // : scala.collection.immutable.IndexedSeq[Int]

As shown, it actually picks up the CanBuildFrom intended for operations such as:

scala> "abc".map(_ + 1)
res2: scala.collection.immutable.IndexedSeq[Int] = Vector(98, 99, 100)

scala> "abc".map(_ + 1) //print

scala.Predef.augmentString("abc").map[Int, scala.collection.immutable.IndexedSeq[Int]](((x$1: Char) => x$1.+(1)))(scala.Predef.fallbackStringCanBuildFrom[Int]) // : scala.collection.immutable.IndexedSeq[Int]

Compare:

scala> k.map(m)(collection.breakOut) : List[Int] //print

(($line6.$read.$iw.$iw.k.map[Int, List[Int]]($line5.$read.$iw.$iw.m)(scala.collection.`package`.breakOut[Iterable[String], Int, List[Int]](scala.collection.immutable.List.canBuildFrom[Int]))): scala.`package`.List[scala.Int]) // : List[Int]

The canonical Q&A on breakOut.

Community
  • 1
  • 1
som-snytt
  • 39,429
  • 2
  • 47
  • 129
  • Also worth adding that the common mechanism to override tying the type to the source is to use `scala.collection.breakOut`. So `foo.keys.map(foo)(collection.breakOut)` will result in `Vector(1, 1): scala.collection.immutable.IndexedSeq[Int]`. This also allows type inference from the result, so `val l: List[Int] = foo.keys.map(foo)(collection.breakOut)` will result in a runtime `List(1, 1)`. – Kolmar Aug 21 '16 at 07:36
  • There's also a community answer option, but I don't know the keystrokes offhand. Have to rush off at the moment. – som-snytt Aug 21 '16 at 23:49
0

This is tricky implicit magic. Simplified answer: there exists CanBuildFrom value, that is passed in implicit scope. When compiler is searching for most common type, it looks for implicits in arguments' scope.

In your example, compiler is able to figure out that most common type for foo.keys is Set. This sounds reasonable: Set can be viewed as Map with absent values (also java's HashMap/HashSet do that). When you convert to iterable, implicits are lost, and Set is gone (as side note, those CanBuildFrom hacks are not robust, and might get gone in future, because they really complicate extending of existing collections, you might also want to read this answer and comments).

Scala shares Java's concept of "de-jure and de-facto types". "de-jure" is one declared in method definition, but "de-facto" might be one of inheritors. That's why, for example, you see Map.keys type as Iterable, when de-facto it is Set (produced from Map.keySet, which has Set de-jure type).

Last, 1 in first println is because in underlying map foo all values are the same, and Set(1,1) becomes Set(1).

Community
  • 1
  • 1
dveim
  • 3,381
  • 2
  • 21
  • 31
  • 4
    This answer is a bit wrong. It's not implicit magic, more like the usual OOP polymorphism magic. The compiler can't figure that `foo.keys` is a `Set`. It uses `CanBuildFrom` from an `Iterable`: `Iterable.canBuildFrom`, but that `canBuildFrom` calls `genericBuilder` on the actual collection object, and since, yes, this is a `Set` at runtime, this results in `Set.newBuilder` being used. – Kolmar Aug 20 '16 at 17:26
  • 1
    Also, I don't think there are any plans to have CBFs gone. The answer you have linked to doesn't support that claim. It says that they should be hidden in the docs by default, and that has already been implemented for a long time. – Kolmar Aug 20 '16 at 17:26
  • @Kolmar my guess about "no CBF" is based on recent discussion about new collections' proposal. That's why "might get gone". – dveim Aug 20 '16 at 17:34
  • Is it this proposal https://github.com/lampepfl/dotty/issues/818 ? That's interesting, thanks for pointing this out. – Kolmar Aug 20 '16 at 17:37
  • Yes, this proposal. – dveim Aug 20 '16 at 18:23