I'm passing in a parameter fieldsToLoad: List[String]. I want to load ALL columns if this list is empty, and only the columns named in the list if it has one or more entries. This is what I have now, which reads the columns passed in the list:

    val parquetDf = sparkSession.read.parquet(inputPath:_*).select(fieldsToLoad.head, fieldsToLoad.tail:_*)

But how do I add a condition to load * (all columns) when the list is empty?

NoName

2 Answers

You could use an if expression first to replace the empty list with just *:

    // Both branches must be List[String] so that cols.head and cols.tail type-check
    val cols = if (fieldsToLoad.nonEmpty) fieldsToLoad else List("*")
    sparkSession.read.parquet(inputPath:_*).select(cols.head, cols.tail:_*)
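
A variation on the same idea, as a minimal sketch: rather than substituting "*", skip the select call entirely when the list is empty. The object name ColumnSelection and the helper selectOrAll are hypothetical names used only for illustration; DataFrame.select and functions.col are standard Spark SQL APIs.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object ColumnSelection {
  // Hypothetical helper (not part of Spark): returns df unchanged when no
  // fields are requested, otherwise projects only the requested columns.
  def selectOrAll(df: DataFrame, fieldsToLoad: Seq[String]): DataFrame =
    if (fieldsToLoad.isEmpty) df                 // no projection: keep every column
    else df.select(fieldsToLoad.map(col): _*)    // Column varargs, so no head/tail split
}
```

Assuming the sparkSession, inputPath and fieldsToLoad values from the question, it could be called as ColumnSelection.selectOrAll(sparkSession.read.parquet(inputPath:_*), fieldsToLoad).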
Andy Hayden

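@Andy Hayden's answer is correct, but I'd like to show how the selectExpr function can simplify the selection. It takes its expressions as String varargs, so there is no need to split the list into head and tail: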

scala> val df = Range(1, 4).toList.map(x => (x, x + 1, x + 2)).toDF("c1", "c2", "c3")
df: org.apache.spark.sql.DataFrame = [c1: int, c2: int ... 1 more field]

scala> df.show()
+---+---+---+
| c1| c2| c3|
+---+---+---+
|  1|  2|  3|
|  2|  3|  4|
|  3|  4|  5|
+---+---+---+


scala> val fieldsToLoad = List("c2", "c3")
fieldsToLoad: List[String] = List(c2, c3)

scala> df.selectExpr((if (fieldsToLoad.nonEmpty) fieldsToLoad else List("*")):_*).show()
+---+---+
| c2| c3|
+---+---+
|  2|  3|
|  3|  4|
|  4|  5|
+---+---+


scala> val fieldsToLoad = List()
fieldsToLoad: List[Nothing] = List()

scala> df.selectExpr((if (fieldsToLoad.nonEmpty) fieldsToLoad else List("*")):_*).show()
+---+---+---+
| c1| c2| c3|
+---+---+---+
|  1|  2|  3|
|  2|  3|  4|
|  3|  4|  5|
+---+---+---+
Minh Ha Pham
  • Thank you! This worked. Is there something similar to decide between coalesce and repartition based on a condition? – NoName Aug 18 '18 at 05:00
  • @NoName: please read this https://stackoverflow.com/questions/31610971/spark-repartition-vs-coalesce – Minh Ha Pham Aug 20 '18 at 02:12
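
On the follow-up question in the comments, the same if/else pattern can pick between coalesce and repartition. The following is only a sketch under assumptions: the object name Partitioning, the helper resize, and the rule "coalesce when shrinking, repartition when growing" are illustrative choices, not something stated in this thread. coalesce, repartition and rdd.getNumPartitions are standard Spark APIs.

```scala
import org.apache.spark.sql.DataFrame

object Partitioning {
  // Hypothetical helper (not part of Spark): chooses the method from the
  // current and the desired partition counts.
  def resize(df: DataFrame, targetPartitions: Int): DataFrame = {
    val current = df.rdd.getNumPartitions
    if (targetPartitions < current) df.coalesce(targetPartitions)         // merges partitions, no full shuffle
    else if (targetPartitions > current) df.repartition(targetPartitions) // full shuffle to spread rows out
    else df                                                               // already at the target count
  }
}
```

Whether this rule is the right one depends on the job; the linked question covers the trade-offs between the two methods.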