I am trying to teach myself Scala whilst at the same time trying to write code that is idiomatic of a functional language, i.e. write better, more elegant, functional code.
I have the following code that works OK:
import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SparkSession}
import java.time.LocalDate
object DataFrameExtensions_ {
implicit class DataFrameExtensions(df: DataFrame){
def featuresGroup1(groupBy: Seq[String], asAt: LocalDate): DataFrame = {df}
def featuresGroup2(groupBy: Seq[String], asAt: LocalDate): DataFrame = {df}
}
}
import DataFrameExtensions_._
val spark = SparkSession.builder().config(new SparkConf().setMaster("local[*]")).enableHiveSupport().getOrCreate()
import spark.implicits._
val df = Seq((8, "bat"),(64, "mouse"),(-27, "horse")).toDF("number", "word")
val groupBy = Seq("a","b")
val asAt = LocalDate.now()
val dataFrames = Seq(df.featuresGroup1(groupBy, asAt),df.featuresGroup2(groupBy, asAt))
The last line bothers me though. The two functions (featuresGroup1
, featuresGroup2
) both have the same signature:
scala> :type df.featuresGroup1(_,_)
(Seq[String], java.time.LocalDate) => org.apache.spark.sql.DataFrame
scala> :type df.featuresGroup2(_,_)
(Seq[String], java.time.LocalDate) => org.apache.spark.sql.DataFrame
and take the same val
s as parameters so I assume I can write that line in a more functional way (perhaps using .map
somehow) that means I can write the parameter list just once and pass it to both functions. I can't figure out the syntax though. I thought maybe I could construct a list of those functions but that doesn't work:
scala> Seq(featuresGroup1, featuresGroup2)
<console>:23: error: not found: value featuresGroup1
Seq(featuresGroup1, featuresGroup2)
^
<console>:23: error: not found: value featuresGroup2
Seq(featuresGroup1, featuresGroup2)
^
Can anyone help?