To simplify unit testing with Spark and Scala, I am using ScalaTest and mockito-scala (with MockitoSugar). This lets you write something like this:
val sparkSessionMock = mock[SparkSession]
Then you can usually do all the magic with "when" and "verify".
But if the implementation under test needs the import
import spark.implicits._
in its code, the simplicity of unit testing seems to be gone (or at least I haven't found a proper way to solve this yet).
I end up getting this error:
org.mockito.exceptions.verification.SmartNullPointerException:
You have a NullPointerException here:
-> at ...
because this method call was *not* stubbed correctly:
-> at scala.Option.orElse(Option.scala:289)
sparkSession.implicits();
Simply mocking the call on the "implicits" object inside SparkSession won't help due to typing issues:
val implicitsMock = mock[SQLImplicits]
when(sparkSessionMock.implicits).thenReturn(implicitsMock)
does not compile, because the compiler requires the singleton type of the object inside the mock:
required: sparkSessionMock.implicits.type
found: implicitsMock.type
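The mismatch can be reproduced without Spark or Mockito. Below is a minimal sketch, assuming that "implicits" is declared as an object nested inside the SparkSession class (which is what the "implicits.type" in the error suggests). The names Session, Implicits and PathDependentDemo are hypothetical stand-ins, not Spark's API:

```scala
// Hypothetical stand-ins for illustration; these are not Spark classes.
abstract class Implicits { def label: String }

class Session {
  // A nested object gets its own singleton type: `<instance>.implicits.type`.
  object implicits extends Implicits { def label: String = "nested object" }
}

object PathDependentDemo {
  val session = new Session
  val other: Implicits = new Implicits { def label: String = "replacement" }

  // Compiles: the nested object is the only value of its singleton type.
  val ok: session.implicits.type = session.implicits

  // Does not compile if uncommented (type mismatch), because `other`
  // is an Implicits but not the object `session.implicits` itself:
  //   val bad: session.implicits.type = other

  // Widening to the parent type is always allowed.
  val widened: Implicits = session.implicits
}
```

Since thenReturn expects a value of the stubbed member's type, a mock[SQLImplicits] appears to be in the same position as "other" in this sketch: it has the parent type but not the singleton type.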
And please don't tell me that I should rather use SparkSession.builder.getOrCreate(), since then this is no longer a unit test but a more heavyweight integration test.
(Edit): here is a complete reproducible example:
import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.{ FlatSpec, Matchers }
import org.scalatestplus.mockito.MockitoSugar
case class MyData(key: String, value: String)
class ClassToTest()(implicit spark: SparkSession) {
  import spark.implicits._
  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}
class SparkMock extends FlatSpec with Matchers with MockitoSugar {
  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]
    val implicitsMock = mock[SQLImplicits]
    // This is the line the compiler rejects with the type mismatch shown above.
    when(sparkMock.implicits).thenReturn(implicitsMock)
    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)
    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)
    val dataSetMock = mock[Dataset[MyData]]
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)
    // The path must match the stubbed "/some/path" exactly.
    new ClassToTest().read("/some/path") shouldBe dataSetMock
  }
}