When calling a function from an external class many times, which gives better performance: a lazy val function or a def method?
So far, this is what I understand:
def method:
- Defined in and tied to a class; it needs to be declared inside an "object" in order to be called in Java static style.
- Call-by-name: evaluated only when accessed, and on every access.
lazy val lambda expression:
- Tied to an object of type Function1/2...22.
- Call-by-value: evaluated the first time it is accessed, and only once.
- Is actually a def apply method tied to a class.
So it seems that using a lazy val would avoid evaluating the function on every call. Should it be preferred?
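One way to check this assumption is a small counter experiment. This is only a sketch, and the names (EvalDemo, incFun, incMethod, and the counters) are hypothetical and not part of my Spark code. It counts how often the lazy val initializer, the lambda body, and the def body actually run.

```scala
// Hypothetical demo class: counts how often each piece of code is evaluated.
class EvalDemo {
  var initCount = 0 // runs of the lazy val initializer block
  var bodyCount = 0 // runs of the lambda body
  var defCount = 0  // runs of the def body

  // The initializer block builds the Function1 object on first access only.
  lazy val incFun: Int => Int = {
    initCount += 1
    (x: Int) => { bodyCount += 1; x + 1 }
  }

  // The method body runs on every call.
  def incMethod(x: Int): Int = { defCount += 1; x + 1 }
}

object EvalDemoApp {
  def main(args: Array[String]): Unit = {
    val demo = new EvalDemo
    (1 to 3).foreach(i => demo.incFun(i))
    (1 to 3).foreach(i => demo.incMethod(i))
    println(s"lazy val initializer runs: ${demo.initCount}")
    println(s"lambda body runs: ${demo.bodyCount}")
    println(s"def body runs: ${demo.defCount}")
  }
}
```

With three calls each, the initializer runs once, but the lambda body and the def body both run three times. So the lazy val only defers and caches the creation of the function object, not the result of calling it.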
I ran into this while writing UDFs for Spark code, and I'm trying to understand which approach is better.
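import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

object sql {
  // Maps null, blank, "[]" and "null" strings to None; otherwise returns the trimmed value.
  def emptyStringToNull(str: String): Option[String] = {
    Option(str).getOrElse("").trim match {
      case "" => None
      case "[]" => None
      case "null" => None
      case _ => Some(str.trim)
    }
  }

  def udfEmptyStringToNull: UserDefinedFunction = udf(emptyStringToNull _)

  // def method version
  def repairColumn_method(dataFrame: DataFrame, colName: String): DataFrame = {
    dataFrame.withColumn(colName, udfEmptyStringToNull(col(colName)))
  }

  // lazy val function version
  lazy val repairColumn_fun: (DataFrame, String) => DataFrame = { (df, colName) =>
    df.withColumn(colName, udfEmptyStringToNull(col(colName)))
  }
}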
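For reference, the null-handling logic itself can be checked without Spark. The sketch below copies emptyStringToNull into a hypothetical EmptyStringCheck object (that name and the sample inputs are assumptions made for illustration) and prints the result for a few representative inputs.

```scala
// Hypothetical standalone check; no Spark dependency needed.
object EmptyStringCheck {
  // Same logic as emptyStringToNull above, with the three None cases merged.
  def emptyStringToNull(str: String): Option[String] =
    Option(str).getOrElse("").trim match {
      case "" | "[]" | "null" => None
      case _ => Some(str.trim)
    }

  def main(args: Array[String]): Unit = {
    // Sample inputs: null, empty, blank, "[]", padded "null", and a padded real value.
    val inputs = Seq[String](null, "", "  ", "[]", " null ", " abc ")
    inputs.foreach { in =>
      println(s"'$in' -> ${emptyStringToNull(in)}")
    }
  }
}
```

Only the last input yields a value, Some("abc"); every other input maps to None.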