Questions tagged [apache-spark-encoders]
54 questions
165
votes
9 answers
How to store custom objects in Dataset?
According to Introducing Spark Datasets:
As we look forward to Spark 2.0, we plan some exciting improvements to Datasets, specifically:
...
Custom encoders – while we currently autogenerate encoders for a wide variety of types, we’d like to…

zero323
- 322,348
- 103
- 959
- 935
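Until custom encoders arrive, the usual workaround the answers to this question converge on is a binary (kryo) encoder. A minimal sketch, assuming a hypothetical `MyObj` class with no built-in encoder:

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

// Hypothetical custom class that Catalyst cannot encode on its own
class MyObj(val i: Int) extends Serializable

object KryoEncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("kryo").getOrCreate()
    import spark.implicits._

    // Fall back to a binary kryo encoder for unsupported types
    implicit val myObjEncoder: Encoder[MyObj] = Encoders.kryo[MyObj]

    // The Dataset stores each MyObj as a single opaque binary column
    val ds = spark.createDataset(Seq(new MyObj(1), new MyObj(2)))
    println(ds.map(_.i).collect().toSeq)

    spark.stop()
  }
}
```

The trade-off is that the kryo-encoded column is opaque to Catalyst, so column-level SQL operations on `MyObj`'s fields are not available; typed operations like `map` still work.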
66
votes
3 answers
Why is "Unable to find encoder for type stored in a Dataset" when creating a dataset of custom case class?
Spark 2.0 (final) with Scala 2.11.8. The following super simple code yields the compilation error Error:(17, 45) Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported…

clay
- 18,138
- 28
- 107
- 192
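The two usual causes are a case class defined inside the enclosing class or method, and a missing `import spark.implicits._`. A sketch of the fixed shape, with a hypothetical `Person` case class:

```scala
import org.apache.spark.sql.SparkSession

// The case class must live at the top level (not inside the method or
// class that builds the Dataset), or the implicit Encoder derivation fails
case class Person(name: String, age: Int)

object EncoderFix {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("enc").getOrCreate()
    import spark.implicits._  // brings case-class encoders into scope

    val ds = spark.createDataset(Seq(Person("a", 1), Person("b", 2)))
    ds.show()
    spark.stop()
  }
}
```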
43
votes
4 answers
Encoder error while trying to map dataframe row to updated row
When I'm trying to do the same thing in my code as mentioned below
dataframe.map(row => {
  val row1 = row.getAs[String](1)
  val make = if (row1.toLowerCase == "tesla") "S" else row1
  Row(row(0), make, row(2))
})
I have taken the above reference…

Advika
- 585
- 1
- 5
- 13
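Calling `map` on a DataFrame and returning `Row` needs an explicit `RowEncoder`, since no implicit encoder for `Row` is in scope. A sketch against Spark 2.x, with hypothetical column names:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

object RowMapSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("rowmap").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "Tesla", 2016), (2, "Ford", 2017)).toDF("id", "make", "year")

    // Build the encoder from the output schema (here the same as the input)
    implicit val encoder = RowEncoder(df.schema)

    val mapped = df.map { row =>
      val make = if (row.getAs[String](1).toLowerCase == "tesla") "S"
                 else row.getAs[String](1)
      Row(row(0), make, row(2))
    }
    mapped.show()
    spark.stop()
  }
}
```

Where possible, a typed `Dataset[SomeCaseClass]` avoids the explicit encoder entirely, since case-class encoders come from `spark.implicits._`.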
35
votes
2 answers
Encoder for Row Type Spark Datasets
I would like to write an encoder for a Row type in a Dataset, for a map operation that I am doing. Essentially, I do not understand how to write encoders.
Below is an example of a map operation:
In the example below, instead of returning…

tsar2512
- 2,826
- 3
- 33
- 61
28
votes
2 answers
How to convert a dataframe to dataset in Apache Spark in Scala?
I need to convert my dataframe to a dataset and I used the following code:
val final_df = Dataframe.withColumn(
  "features",
  toVec4(
    // casting into Timestamp to parse the string, and then into Int
    …
user8131063
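For the common case, the conversion is just `.as[T]` with a matching case class. A minimal sketch, with a hypothetical `Record` schema:

```scala
import org.apache.spark.sql.SparkSession

// Field names and types must line up with the DataFrame's columns
case class Record(id: Int, label: String)

object DfToDs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("as").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")
    val ds = df.as[Record]  // Dataset[Record]; a mismatch fails at analysis time
    ds.show()
    spark.stop()
  }
}
```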
23
votes
3 answers
How to create a custom Encoder in Spark 2.X Datasets?
Spark Datasets move away from Rows to Encoders for POJOs/primitives. The Catalyst engine uses an ExpressionEncoder to convert columns in a SQL expression. However, there do not appear to be other subclasses of Encoder available to use as a…

WestCoastProjects
- 58,982
- 91
- 316
- 560
18
votes
3 answers
Why is the error "Unable to find encoder for type stored in a Dataset" when encoding JSON using case classes?
I've written a Spark job:
object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new…

Milad Khajavi
- 2,769
- 9
- 41
- 66
14
votes
2 answers
Encode an ADT / sealed trait hierarchy into Spark DataSet column
If I want to store an Algebraic Data Type (ADT) (i.e. a Scala sealed trait hierarchy) within a Spark Dataset column, what is the best encoding strategy?
For example, if I have an ADT where the leaf types store different kinds of data:
sealed trait…

Ben Hutchison
- 2,433
- 2
- 21
- 25
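Catalyst cannot derive an encoder for a sealed trait hierarchy directly, so one common strategy is a kryo binary encoder for the trait type. A sketch with hypothetical leaf types:

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

sealed trait Occupation extends Serializable
case class SoftwareEngineer(favoriteLanguage: String) extends Occupation
case class Doctor(specialty: String) extends Occupation

object AdtEncoding {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("adt").getOrCreate()

    // Binary encoder: the whole ADT value becomes one opaque column,
    // so Catalyst cannot see or filter on the leaf types' fields
    implicit val occupationEncoder: Encoder[Occupation] = Encoders.kryo[Occupation]

    val ds = spark.createDataset(Seq[Occupation](SoftwareEngineer("Scala"), Doctor("GP")))
    println(ds.count())
    spark.stop()
  }
}
```

The alternative the answers usually weigh against this is flattening the ADT into a tagged product (a discriminator column plus nullable fields), which keeps the data visible to Catalyst at the cost of a looser schema.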
12
votes
1 answer
Apache Spark 2.0: java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
I am using Apache Spark 2.0 and creating a case class for the schema of my Dataset. When I try to define a custom encoder according to How to store custom objects in Dataset?, for java.time.LocalDate I get the following exception:…

Harmeet Singh Taara
- 6,483
- 20
- 73
- 126
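Spark 2.x has no built-in encoder for `java.time.LocalDate` (native support arrived later, in Spark 3.0), so a kryo encoder is one workaround. A sketch:

```scala
import java.time.LocalDate
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

object LocalDateEncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("ld").getOrCreate()

    // No built-in Encoder[LocalDate] in Spark 2.x; fall back to kryo
    implicit val localDateEncoder: Encoder[LocalDate] = Encoders.kryo[LocalDate]

    val ds = spark.createDataset(Seq(LocalDate.of(2016, 8, 1)))
    println(ds.count())
    spark.stop()
  }
}
```

The other common 2.x workaround is to model the field as `java.sql.Date` (which has a native encoder) and convert to and from `LocalDate` at the edges.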
10
votes
3 answers
Convert scala list to DataFrame or DataSet
I am new to Scala. I am trying to convert a Scala list (which holds the results of some calculated data on a source DataFrame) to a DataFrame or Dataset. I am not finding any direct method to do that.
However, I have tried the following process…

Leo
- 315
- 1
- 3
- 17
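With `spark.implicits._` in scope, a local collection of case-class instances converts directly. A sketch, with a hypothetical `Score` result type:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical element type holding the calculated results
case class Score(name: String, value: Double)

object ListToDataset {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("list").getOrCreate()
    import spark.implicits._  // enables toDF() / toDS() on local Seqs and Lists

    val results = List(Score("a", 0.1), Score("b", 0.2))

    val df = results.toDF()  // untyped DataFrame
    val ds = results.toDS()  // typed Dataset[Score]
    df.show()
    println(ds.count())
    spark.stop()
  }
}
```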
8
votes
1 answer
Spark Dataset : Example : Unable to generate an encoder issue
I'm new to the Spark world and trying a Dataset example written in Scala that I found online.
On running it through SBT, I keep getting the following error:
org.apache.spark.sql.AnalysisException: Unable to generate an encoder for inner class
Any idea…
user5131511
8
votes
1 answer
Spark Dataset and java.sql.Date
Let's say I have a Spark Dataset like this:
scala> import java.sql.Date
scala> case class Event(id: Int, date: Date, name: String)
scala> val ds = Seq(Event(1, Date.valueOf("2016-08-01"), "ev1"), Event(2, Date.valueOf("2018-08-02"), "ev2")).toDS
I…

Lukáš Lalinský
- 40,587
- 6
- 104
- 126
6
votes
2 answers
Rename columns in spark using @JsonProperty while creating Datasets
Is there a way to rename the column names in a Dataset using Jackson annotations while creating it?
My encoder class is as follows:
import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.*;
import scala.Serializable;
import…

Arjav96
- 79
- 3
6
votes
5 answers
How to map rows to protobuf-generated class?
I need to write a job that reads a DataSet[Row] and converts it to a DataSet[CustomClass]
where CustomClass is a protobuf class.
val protoEncoder = Encoders.bean(classOf[CustomClass])
val transformedRows = rows.map {
  case Row(f1: String, f2: Long…

Apurva
- 153
- 2
- 7
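`Encoders.bean` assumes JavaBean getters and setters, which protobuf-generated classes don't follow, so a kryo encoder is the usual fallback. A sketch, with a placeholder standing in for the generated class (only the name `CustomClass` comes from the question; the fields are hypothetical):

```scala
import org.apache.spark.sql.{Encoder, Encoders, Row, SparkSession}

// Placeholder for the protobuf-generated class
class CustomClass(val f1: String, val f2: Long) extends Serializable

object ProtoMapSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("proto").getOrCreate()
    import spark.implicits._

    // kryo instead of Encoders.bean: the generated class is not a JavaBean
    implicit val protoEncoder: Encoder[CustomClass] = Encoders.kryo[CustomClass]

    val rows = Seq(("a", 1L), ("b", 2L)).toDF("f1", "f2")
    val transformed = rows.map { case Row(f1: String, f2: Long) =>
      new CustomClass(f1, f2)
    }
    println(transformed.count())
    spark.stop()
  }
}
```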
6
votes
1 answer
scala generic encoder for spark case class
How can I get this method to compile? Strangely, Spark's implicits are already imported.
def loadDsFromHive[T <: Product](tableName: String, spark: SparkSession): Dataset[T] = {
  import spark.implicits._
  spark.sql(s"SELECT * FROM…

Georg Heiler
- 16,916
- 36
- 162
- 292
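Inside a generic method, `spark.implicits._` cannot materialize an encoder for an abstract `T`; the method compiles once the caller is required to supply the encoder. A sketch (the `$tableName` interpolation is an assumption about the elided query):

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

object GenericLoad {
  // The `: Encoder` context bound pushes encoder resolution to the
  // call site, where the concrete case class is known
  def loadDsFromHive[T <: Product : Encoder](tableName: String, spark: SparkSession): Dataset[T] =
    spark.sql(s"SELECT * FROM $tableName").as[T]
}
```

At the call site, import `spark.implicits._` first so the case-class encoder is in scope, then call e.g. `loadDsFromHive[Person]("db.people", spark)`.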