I have a Spark DataFrame with two columns, of types String and org.apache.spark.ml.linalg.SparseVector, and this works fine:

data.map(r => r(1).asInstanceOf[Vector])

But the equivalent call using getAs

data.map(r => r.getAs[Vector](1))

fails with

error: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.

Can you please explain why?

barclar

1 Answer

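Dataset.map needs an implicit Encoder for the type the function returns, and spark.implicits._ does not provide one for Vector. RDD.map has no such requirement, so convert to an RDD first. Try this: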

data.rdd.map(r => r.getAs[Vector](1))

For more information about Encoder and Dataset, you can read this SO question.
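
If you would rather stay in the Dataset API, one option is to pass an encoder explicitly. Below is a minimal sketch of both routes, not a definitive fix. The sample data, the column names "id" and "features", the object name, and the local SparkSession are all made up for illustration. Note that a Kryo encoder stores the values as opaque binary, so the resulting Dataset loses its column structure.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.{Encoders, SparkSession}

object GetAsVectorSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("getAs-vector-sketch")
      .getOrCreate()

    // Hypothetical data shaped like the question's DataFrame:
    // one String column and one ml.linalg sparse vector column.
    val data = spark.createDataFrame(Seq(
      ("a", Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))),
      ("b", Vectors.sparse(3, Array(1), Array(2.0)))
    )).toDF("id", "features")

    // Route 1: drop to the RDD API, where map needs no Encoder.
    val viaRdd = data.rdd.map(r => r.getAs[Vector](1))
    viaRdd.collect().foreach(println)

    // Route 2: stay in the Dataset API and supply the Encoder explicitly.
    // Encoders.kryo serializes each Vector into a single binary column.
    val viaDataset = data.map(r => r.getAs[Vector](1))(Encoders.kryo[Vector])
    viaDataset.collect().foreach(println)

    spark.stop()
  }
}
```

Route 1 is the simplest if you only need the vectors on the RDD side. Route 2 keeps you in a Dataset, at the cost of the binary representation.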

Zachary