
For example, a Scala Array corresponds to ArrayType in Spark SQL, which can be used in schema definitions, and Map corresponds to MapType.

How about Set?
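To make the question concrete, here is a minimal sketch of the kind of schema I mean (field names `tags` and `attributes` are just illustrative):

```scala
import org.apache.spark.sql.types._

// ArrayType and MapType exist for Scala Array/Seq and Map fields...
val schema = StructType(Seq(
  StructField("tags", ArrayType(StringType, containsNull = false)),
  StructField("attributes", MapType(StringType, StringType))
))
// ...but there is no SetType to put here for a scala.collection.Set field.
```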

CyberPlayerOne

2 Answers


The set (no pun intended) of supported types is limited and not extensible. You'll find the full list in the Spark SQL, DataFrames and Datasets Guide; as you can check there, no type corresponds to Set.

The best you can do is to use ArrayType which maps to scala.collection.Seq and handle set specific operations yourself.
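A rough sketch of that workaround (assuming a local SparkSession; `array_distinct` is available from Spark 2.4, and the column name `items` is arbitrary):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.array_distinct

val spark = SparkSession.builder().master("local[*]").appName("set-as-array").getOrCreate()
import spark.implicits._

// Convert each Set to a Seq before creating the DataFrame;
// Spark maps Seq[String] to ArrayType(StringType).
val df = Seq(Set("a", "b"), Set("b", "c")).map(_.toSeq).toDF("items")

// Set semantics (uniqueness) must be enforced manually,
// e.g. by deduplicating with array_distinct after array operations.
val deduped = df.select(array_distinct($"items").as("items"))
```

Membership tests, unions, intersections, etc. have to be expressed with array functions (`array_contains`, `array_union`, `array_intersect`, ...) rather than Set methods.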

It is possible to use binary Encoders (How to store custom objects in Dataset?) but these are intended for strongly typed datasets, and have limited applications when used with DataFrames.
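For completeness, a sketch of the binary-encoder route (again assuming a local SparkSession): the whole Set is serialized with Kryo into a single binary column, so it round-trips in a typed Dataset but is opaque to DataFrame operations.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

val spark = SparkSession.builder().master("local[*]").appName("set-kryo").getOrCreate()

// A Kryo-based binary encoder for Set[String].
implicit val setEncoder: Encoder[Set[String]] = Encoders.kryo[Set[String]]

val ds = spark.createDataset(Seq(Set("a", "b")))
// The schema is a single "value: binary" column; you cannot query
// the set's elements with SQL expressions, only via typed operations.
```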


There is none. The exhaustive list is here: http://spark.apache.org/docs/latest/sql-programming-guide.html#data-types

Find full example code at "examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala" in the Spark repo.

Anurag Sharma