
On Spark 1.6.2 (Scala 2.10.5) the following code worked just fine in the shell:

import org.apache.spark.mllib.linalg.Vector
case class DataPoint(vid: String, label: Double, features: Vector)

The mllib Vector correctly shadowed Scala's built-in Vector.

However, on Spark 2.0 (Scala 2.11.8) the same code throws the following error in the shell:

<console>:11: error: type Vector takes type parameters
  case class DataPoint(vid: String, label: Double, features: Vector)

To make it work, I now have to refer to the type by its fully qualified name:

case class DataPoint(vid: String, label: Double,
  features: org.apache.spark.mllib.linalg.Vector)

Can someone please tell me what changed, and is Spark or Scala at fault here? Thanks!

Roman
    They changed the way spark shell does imports, and there are outstanding bugs for it. Are you talking about running from shell? – som-snytt Sep 16 '16 at 21:46
  • @som-snytt yes I'm running from shell - thanks - updated the question. Okay so it is most likely a bug then. – Roman Sep 16 '16 at 21:50

1 Answer


The simplest solution to this problem is to use `:paste`:

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0-SNAPSHOT
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_102)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.linalg.Vector

scala> case class DataPoint(vid: String, label: Double, features: Vector)
<console>:11: error: type Vector takes type parameters
       case class DataPoint(vid: String, label: Double, features: Vector)
                                                                  ^

scala> :paste
// Entering paste mode (ctrl-D to finish)

import org.apache.spark.mllib.linalg.Vector
case class DataPoint(vid: String, label: Double, features: Vector)

// Exiting paste mode, now interpreting.

import org.apache.spark.mllib.linalg.Vector
defined class DataPoint
zero323
  • thank you @zero323 - your solution does work! could you please also elaborate on what makes it work? – Roman Sep 16 '16 at 23:38
  • 2
    The difference compared to working line by line is that a whole block is compiled together. You could basically do the same thing by putting everything in the same block like `{import ....; case class DataPoint(...)}` (I know, not useful) or wrap with a single objects. But if you ask how to fix this upstream I have no idea. Spark tinkers with shell in serious ways and there quite a few ugly bugs there including [case class monster](http://stackoverflow.com/q/35301998/1560062). – zero323 Sep 17 '16 at 10:40
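The mechanism zero323 describes can be illustrated without Spark at all. In the sketch below, the `linalg` object and its `Vector` trait are hypothetical stand-ins for `org.apache.spark.mllib.linalg`; the point is only that when the import and the case class definition are compiled together as one unit, the imported `Vector` shadows `scala.collection.immutable.Vector`, so the unparameterized name resolves correctly:

```scala
// Stand-in for org.apache.spark.mllib.linalg (hypothetical, for illustration)
object linalg {
  trait Vector { def size: Int }
  case class DenseVector(values: Array[Double]) extends Vector {
    def size: Int = values.length
  }
}

// Import and definition compiled in the same scope: the imported Vector
// shadows scala.collection.immutable.Vector, so no type parameter is needed.
object Demo {
  import linalg.Vector
  case class DataPoint(vid: String, label: Double, features: Vector)
}

val p = Demo.DataPoint("v1", 1.0, linalg.DenseVector(Array(1.0, 2.0)))
println(p.features.size) // prints 2
```

This mirrors what `:paste` does in the shell: the whole pasted block is compiled as one unit, instead of each line being wrapped and compiled separately.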