What are a strongly typed API and an untyped API with respect to Spark Datasets?

How are Datasets similar or dissimilar to DataFrames?

Yaron
Arvind Kumar
  • Can anyone please answer this question? – Arvind Kumar Nov 23 '16 at 01:45
  • This link explains the difference between Dataset and DataFrame: http://stackoverflow.com/questions/31508083/difference-between-dataframe-and-rdd-in-spark/39033308?noredirect=1#comment68807827_39033308 – Arvind Kumar Nov 25 '16 at 01:56

1 Answer

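DataFrame APIs are untyped: column names and types are checked only at runtime, when Spark analyzes the query. Dataset APIs are typed: the element type is known at compile time, so the compiler checks field names and types. The two are closely related: in Scala, since Spark 2.0, DataFrame is simply an alias for Dataset[Row].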

df.select("device").where("signal > 10")      // using untyped APIs   
ds.filter(_.signal > 10).map(_.device)         // using typed APIs
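
To make the contrast concrete, here is a minimal sketch of both styles side by side. It assumes Spark 2.x or later running in local mode; the `DeviceReading` case class, the `TypedVsUntyped` object, and the sample rows are hypothetical names and data chosen only to mirror the `device` and `signal` fields used above.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type; field names mirror the example above.
case class DeviceReading(device: String, signal: Long)

object TypedVsUntyped {
  // Untyped API: columns are referenced by string and resolved
  // only when Spark analyzes the query at runtime.
  def untyped(df: DataFrame): DataFrame =
    df.where("signal > 10").select("device")

  // Typed API: fields are accessed through the case class,
  // so the Scala compiler checks their names and types.
  def typed(spark: SparkSession, ds: Dataset[DeviceReading]): Dataset[String] = {
    import spark.implicits._ // provides the Encoder[String] needed by map
    ds.filter(_.signal > 10).map(_.device)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("typed-vs-untyped")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data for illustration.
    val ds: Dataset[DeviceReading] =
      Seq(DeviceReading("a", 5), DeviceReading("b", 20)).toDS()
    val df: DataFrame = ds.toDF()              // drop the static type
    val back: Dataset[DeviceReading] = df.as[DeviceReading] // restore it

    untyped(df).show()
    typed(spark, back).show()

    // A misspelled column compiles, then fails at runtime
    // with an AnalysisException:
    //   df.select("devcie")
    // A misspelled field is rejected by the compiler:
    //   ds.map(_.devcie)

    spark.stop()
  }
}
```

The `toDF()` and `as[DeviceReading]` calls show that the same data can move between the two views; what changes is whether mistakes in field names surface at compile time or only when the query is analyzed.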
Vignesh I
  • perfect example @vignesh-i – Sandeep Samal Jun 04 '19 at 15:27
  • Correct me if I'm wrong, but isn't this the difference between dynamic typing and static typing? Strong typing means that you can't perform just any operation on any type. For example, you can't cast a boolean to an int or add a boolean and a string. Here is a link explaining what I'm thinking: https://stackoverflow.com/questions/2690544/what-is-the-difference-between-a-strongly-typed-language-and-a-statically-typed – vi_ral Jun 04 '20 at 18:04
  • Also, to be fair, no one really has a standard definition of strongly typed vs. statically typed. It looks like it's been argued for a very long time :D – vi_ral Jun 04 '20 at 18:07