3

Can some explain why the following happens? I mean if something has the type of String, then I expect to run head on it. But Set("ab").head works, whereas List("ab").toSet.head.head doesn't. Why?

$ scala
Welcome to Scala version 2.10.3 (Java HotSpot(TM) Server VM, Java 1.7.0_25).
Type in expressions to have them evaluated.
Type :help for more information.

scala> List("ab").toSet
res0: scala.collection.immutable.Set[String] = Set(ab)

scala> Set("ab")
res1: scala.collection.immutable.Set[String] = Set(ab)

scala> List("ab").toSet.head
res2: String = ab

scala> Set("ab").head
res3: String = ab

scala> List("ab").toSet.head.head
<console>:8: error: value head is not a member of type parameter B
              List("ab").toSet.head.head
                                    ^

scala> Set("ab").head.head
res5: Char = a
Emre Sevinç
  • 8,211
  • 14
  • 64
  • 105
  • 1
    I guess that type inference fails somewhere, because if you do `List("abc").toSet[String].head.head`, it works as expected... – Patryk Ćwiek Dec 05 '13 at 15:04
  • But where is the failure exactly? I see that both ```List("ab").toSet.head``` and ```Set("ab").head``` has the type ```String```. So I expect to run ```head``` on things that have the type ```String``` and get similar behavior. What am I missing? – Emre Sevinç Dec 05 '13 at 15:06
  • 1
    I don't know the details, you'll have to wait for someone else to fill in the blanks, but look at the signatures: `def toSet[B >: A]: Set[B]` and `def toList: List[A]` - my guess is that somehow, somewhere `B` is being wrongly inferred in the `toSet`. – Patryk Ćwiek Dec 05 '13 at 15:09
  • 1
    possible duplicate of [Type inference on Set failing?](http://stackoverflow.com/questions/5544536/type-inference-on-set-failing) – kiritsuku Dec 05 '13 at 15:11
  • I read possible duplicates, but I still can't wrap my head around this: both ```List("ab").toSet[String]``` and ```List("ab").toSet``` has return type of ```scala.collection.immutable.Set[String]```, and both ```List("ab").toSet.head``` and ```List("ab").toSet[String].head``` has return type ```String```. So why can't I appy ```head``` to something that is ```String```? – Emre Sevinç Dec 05 '13 at 15:24
  • 1
    even more perplexing is that `val s = List("ab").toSet;s.head.head` *works*. – Ian McLaird Dec 05 '13 at 15:45
  • You Sir, have added to my confusion and made my already shattered peace of mind even worse! :) Seriously. What's going on, how does that happen, and why? – Emre Sevinç Dec 05 '13 at 15:53
  • I'll go one step further. `(List("ab").toSet + "abc").head.head` *also works* – Ian McLaird Dec 05 '13 at 20:19

2 Answers2

2

DISCLAIMER: I'm not an expert on this, so at some level this is an educated guess, but I'm going to take a stab at this.

Let's take a look at what the compiler might (again, I don't actually know how it works internally) be thinking as it examines those lines. We'll start with something that works.

val s = List("ab").toSet

The compiler looks at that and says, OK, I've got a List[String] there, and the code says to call toSet on that. toSet is defined as toSet[B >: A], so I can think of that as toSet[B >: String]. All right, so s is a Set of some type B that is a supertype of String. OK, I've reached the end of the expression and I should make B the most specific type I can that encompasses everything. So B is String, and s is a Set[String].

Next up, let's look at let's look at what doesn't work.

List("ab").toSet.head.head

The compiler starts out the same way. I've got a List[String] and I'm calling toSet. That's defined as toSet[B >: A] so that means I'm working with a Set of some supertype of String. The expression then says to grab the head of the set. All right, I can do that. I don't know what type I can promise, though. I just know it's a supertype of String. What's he want next? Oh...he wants to call head on that...Bummer. I don't know if B has a head method.

Now, let's look at something else that works.

(List("ab").toSet + "abc").head.head

Compiler again starts the same way. I've got List[String] and I'm calling toSet[B >: String]. OK, so I'm working with a Set of Bs. So what's next. OK, he wants to call + on that, and he's calling that with something that I know is a String. Set is invariant, so if it's a set that can + a String, then my B I'm working with must be a String. Nailed it. I've got a Set[String]. Now he wants the head of that Set. Fantastic. Here's your String. Now he wants the head of that. Gotcha. Here's your Char.

Note that the following won't work.

(List("ab").toSet + "abc".asInstanceOf[Any]).head.head

The compiler follows the same path as the previous example, but discovers that you're trying to + an Any, so you end up with a Set[Any] instead, and the final head call fails.

Slight tangent. I'm not entirely sure why toSet is defined as toSet[B >: A], but I suspect that it has something to do with the fact that Sets are invariant. toList, toStream and toIterator are all defined with [A], and Lists, Streams, and Iterators are all covariant. toBuffer and toSet are both defined as [B >: A], and Buffer and Set are both invariant.

So great. How do we tell it what we really want? Our problems stem from the declaration of toSet[B >: A]. If it were toSet[A], we'd be home free. Fortunately, there's also a method called to[Col_]]: Col[A]. Which means that it returns a collection of the original type, but of a different collection type. That sounds funny, but basically it means that if you call

List("ab").to[Vector]

you get a Vector[String]. And, critically, if you call

List("ab").to[Set]

you get a Set[String] right away. So what you actually want to call in your example is

List("ab").to[Set].head.head
Ian McLaird
  • 5,507
  • 2
  • 22
  • 31
  • Your explanation is dead on. Type inference works on the expression as a whole, and at that point there is indeed a type error (calling `head` on an unbounded `B` that could be `AnyRef` or anything, really). In theory, it could then backtrack one step and retry (without the last `head`), and try to infer that system before going on with `head`, and one more step if that fails, and so on. This would probably be an implementation nightmare (not to mention slow). – gourlaysama Dec 06 '13 at 16:29
  • This makes me wonder: what would it be like in Haskell (whose type inference mechanism is different than Scala)? I mean, would it be possible to alleviate the problem and without leading to an implementation nigtmare and creating a very slow solution. – Emre Sevinç Dec 07 '13 at 16:30
0

This isn't a bulletproof answer, but it's too long for a comment, so I'll take a stab at it.

At first, I thought it might be something to do with implicit conversion of the String to StringLike (or whatever provides the head method), but the behavior is the same for length, getBytes, etc.

In addition to the workarounds mentioned already, there is this:

scala> List("abc").toSet.head.toString.head
res15: Char = a

In this case the compiler knows that B does have a toString method (everything does), and that this method must return a String, which it can use to call head. So storing an intermediate result is not, per se, what makes the call to head work in Ian's comment.

Then I looked in the List scaladoc and saw that toSet is declared with this signature:

def toSet[B >: A]: Set[B] 

So when you call toSet on a List of A, you get a Set of B, where B is a supertype of A. So B is not necessarily the same as A, and therefor does not necessarily have all the same methods. It appears that the compiler knows that List("ab").toSet.head must return a B, but does not yet know that B will be a String when it tries to compile the next call to head.

Beyond that, the two things that confuse me still are: (1) Why isn't the compiler able to resolve B to String as soon as it knows A is String? and (2) Why doesn't toSet just return a Set[A]? It's defined deep in the collection API in TraversableOnce as:

def toSet[B >: A]: immutable.Set[B] = to[immutable.Set].asInstanceOf[immutable.Set[B]]

And that "to" method is defined as:

  def to[Col[_]](implicit cbf: CanBuildFrom[Nothing, A, Col[A @uV]]): Col[A @uV] = {
    val b = cbf()
    b ++= seq
    b.result
  }

Which is where I start gasping for air. It's using an implicit convert from a TraversableOnce[A] to a Set[B] where B is a supertype of A. I still don't see why the B is necessary.

Matt Malone
  • 361
  • 4
  • 25
  • 1
    In answer to question 1. Because `B` *could* be `java.lang.Object`, and until something fixes the type lower in the hierarchy, that's all it knows. When all the constraints are collected, it'll use the most specific type it can muster; but that won't happen until the entire statement has been examined. The final call to `head` has to be resolved *first*. Or something. I'm still a bit confused. – Ian McLaird Dec 05 '13 at 20:28
  • Matt, I stil must be missing something crucial, because even though most of the sentences of your answer above makes sense, I still fail to grasp why ```val s = List("ab").toSet; s.head.head``` and ```(List("ab").toSet + "abc").head.head``` work. It might have to do with what Ian hints at by saying "... but that won't happen until the entire statement has been examined. The final call to head has to be resolved first. Or something." – Emre Sevinç Dec 05 '13 at 22:45