
I am new to Spark and Scala, and I want to do something that should be easy (I think..):

  • I have 3 Int values
  • I want to define a function that returns the result of an SQL query (as a DataFrame containing 3 columns)
  • I want to store the content of each of those 3 columns in my 3 initial variables.

So, my code looks like this:

var a: Int = 0
var b: Int = 0
var c: Int = 0

def myfunction(): (Int, Int, Int) = {
  val tmp = spark.sql("""
    select col1, col2, col3 from table
    LIMIT 1
  """)

  // collect() takes no arguments; fetch the single row once
  // instead of collecting the DataFrame three times
  val row = tmp.head()
  (row.getInt(0), row.getInt(1), row.getInt(2))
}

So, the idea is to call my function like this:

a, b, c = myfunction()

I tried a lot of configurations, but I got a different error each time, so I got confused.

salamanka44
    Also note that calling `collect` multiple times is very expensive, as everything has to be recomputed. I would just `spark.sql(...).as[(Int, Int, Int)].head` and remove the unnecessary `val temp` as well as the unsafe **return**. – Luis Miguel Mejía Suárez Nov 25 '19 at 00:43
  • Return a new Case class having an attribute for each wanted returning parameter. – Hartmut Pfarr Jan 15 '23 at 12:08
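The one-liner suggested in the comment above could be sketched like this (an assumption-laden sketch: it presumes a `SparkSession` in scope as `spark` and a registered table named `table` with three integer columns):

```scala
// `as[(Int, Int, Int)]` converts the DataFrame to a typed Dataset;
// the implicit tuple encoder comes from the session's implicits.
import spark.implicits._

val (a, b, c) = spark.sql("select col1, col2, col3 from table limit 1")
  .as[(Int, Int, Int)]
  .head()
```

This reads the single row exactly once and returns it as a tuple, which can be destructured directly as shown.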

1 Answer


You could just use a destructuring bind. Since your method returns a tuple, you can unpack it with pattern matching:

val (a, b, c) = myfunction()

`a`, `b`, and `c` will contain the consecutive elements of the tuple.
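The mechanics can be seen with a plain function, no Spark needed (a hypothetical `minMaxSum` used only for illustration):

```scala
// Any function returning a tuple can be unpacked the same way.
def minMaxSum(xs: Seq[Int]): (Int, Int, Int) =
  (xs.min, xs.max, xs.sum)

val (lo, hi, total) = minMaxSum(Seq(3, 1, 4))
// lo = 1, hi = 4, total = 8
```

Note that destructuring only works with `val` (or `var`) definitions; you cannot assign into three pre-declared variables with `a, b, c = ...` as in Python.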

Krzysztof Atłasik