I am currently working through the book Spark in Action, and I came across the same columns being selected in three different ways:

val postsIdBody = postsDf.select('id, 'body)
val postsIdBody = postsDf.select($"id", $"body")
val postsIdBody = postsDf.select("id", "body")

All three give the same result. Is there any real difference between them? Can anyone explain clearly in which situations each form should be used?

Thanks in advance

moe
The $ prefix is Spark's own use of Scala string interpolation: $"id" turns the string into a column. See https://stackoverflow.com/questions/35885702/sqlcontext-implicits and https://bzhangusc.wordpress.com/2015/03/29/the-column-class/ – moe Nov 16 '18 at 14:27
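
To make the comment above concrete, here is a rough, Spark-free sketch of how a `$"..."` interpolator can be defined in plain Scala 2. The `ColumnName` case class and the object name are simplified stand-ins invented for this illustration, not Spark's actual classes; Spark's own implicit follows the same general pattern but returns its real `ColumnName` type.

```scala
object DollarInterpolatorSketch {
  // Hypothetical stand-in for Spark's column type, for illustration only.
  final case class ColumnName(name: String)

  // An implicit class on StringContext adds a method named `$`,
  // which is what makes the $"..." syntax compile.
  implicit class StringToColumn(val sc: StringContext) {
    def $(args: Any*): ColumnName = ColumnName(sc.s(args: _*))
  }

  def main(args: Array[String]): Unit = {
    // $"id" is rewritten by the compiler to StringContext("id").$()
    val idCol = $"id"
    println(idCol)
  }
}
```

So `$"id"` is ordinary Scala once the implicit is in scope, which is why the import of the implicits is required before that form compiles.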

1 Answer

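I'm sure the book covers this, but once you import the implicits in Scala (import spark.implicits._), the 'id and $"id" forms are converted to Column objects for you, so you don't have to type out new Column(name) yourself. The plain-string form, select("id", "body"), needs no import but accepts only column names.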

You would use Column objects rather than strings because operations such as ordering and aliasing (for example $"id".desc or $"id".as("postId")) are easier to express with them in the DataFrame API.
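
As a minimal sketch of the difference, assuming a local SparkSession and a small made-up DataFrame standing in for the book's postsDf (the sample rows, the score column, and the object name are invented for illustration):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColumnSelectSketch {
  def main(args: Array[String]): Unit = {
    // Local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("column-select-sketch")
      .getOrCreate()
    import spark.implicits._ // enables 'id and $"id" as Columns, plus toDF

    // Hypothetical stand-in for the book's postsDf.
    val postsDf: DataFrame =
      Seq((1, "first post", 10), (2, "second post", 3)).toDF("id", "body", "score")

    // All three forms select the same two columns.
    val bySymbol = postsDf.select('id, 'body)
    val byInterpolator = postsDf.select($"id", $"body")
    val byName = postsDf.select("id", "body")

    // col("...") also builds a Column and needs no implicits.
    val byCol = postsDf.select(col("id"), col("body"))

    // Only Column objects can carry expressions, aliases, and sort order.
    val transformed = postsDf
      .select($"id", ($"score" * 2).as("doubleScore"))
      .orderBy($"id".desc)
    transformed.show()

    spark.stop()
  }
}
```

In practice the string form is fine for a plain projection, and the Column forms become necessary as soon as you need an expression. Note that the 'id symbol literal is deprecated in Scala 2.13 and gone in Scala 3, so $"id" or col("id") is probably the safer habit.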

OneCricketeer