Quick abstract: I am trying to display multiple histograms from Spark DataFrames with Vegas-viz in Scala. I created a trait for building different types of histograms and implemented classes extending it. When I create an instance of a child class, I get a NullPointerException, which makes me think there is a nested DataFrame somewhere.

Is there a workaround? Did I miss something and the error is something else?

Details: here is the trait:

trait Histogram {

  val rawdf: DataFrame
  val sparseDim: Seq[String]
  val name: String

  val xColumn: String
  val yColumn: String

  val group: DataFrame

  val plot: ExtendedUnitSpecBuilder = Vegas(name).
    withDataFrame(group).
    encodeX(
      field = xColumn,
      Quantitative,
      scale = Scale(ScaleType.Log),
      title = sparseDim.reduce((a, b) => a + ", " + b)
    ).
    encodeY(field = yColumn, Quantitative).
    mark(Bar)

  def show(): Unit = plot.show

}

And here is one of the classes extending it:

class HistogramCount(val rawdf: DataFrame,
                     val sparseDim: Seq[String],
                     val name: String = "Histogram Count") extends Histogram {

  val xColumn = "cube"
  val yColumn = "count"

  override val group: DataFrame = rawdf.
    select("VALUE", sparseDim: _*).
    groupBy(sparseDim.head, sparseDim.tail: _*).
    count().
    withColumnRenamed("count", "cube").
    groupBy("cube").
    count()

}

When I create an instance of the child class, the following error occurs:

Exception in thread "main" java.lang.NullPointerException
at <Pointing to .withDataFrame(group) in the trait>

My guess is that group is evaluated lazily and is only computed when .withDataFrame(group) is called while plot is being created.

I tried to force evaluation of the group DataFrame before calling plot by adding val evaluate: Long = group.rdd.count(), but that does not solve the issue.

Baptiste Merliot

1 Answer

Solved it by making the variable plot lazy. Still not sure if that is the best way, though.
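
A likely explanation for why lazy helps is Scala's initialization order rather than anything Spark-specific: the body of a trait runs before the body of the class that extends it, so a val assigned in the subclass body (such as group) is still null while the trait's eager vals (such as plot) are being computed. The sketch below reproduces that behaviour without Spark or Vegas. Base, Child, eagerPlot and lazyPlot are hypothetical stand-ins, not names from the code above.

```scala
// Stand-in for the Histogram trait: no Spark or Vegas, only the init-order behaviour.
trait Base {
  val group: String // abstract, supplied by the subclass

  // Eager: evaluated while Base's body runs, before Child has assigned `group`.
  val eagerPlot: String = "plot of " + group

  // Lazy: evaluated on first access, after the whole object has been constructed.
  lazy val lazyPlot: String = "plot of " + group
}

class Child extends Base {
  // Assigned in Child's body, which runs only after Base's body has finished.
  override val group: String = "grouped data"
}

object InitOrderDemo {
  def main(args: Array[String]): Unit = {
    val c = new Child
    println(c.eagerPlot) // prints "plot of null"
    println(c.lazyPlot)  // prints "plot of grouped data"
  }
}
```

If this is indeed the cause, forcing group.rdd.count() cannot help, because the problem is the null reference itself and not deferred Spark evaluation. Declaring plot as a def in the trait should behave similarly to lazy val, at the cost of rebuilding the plot on each call.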

Baptiste Merliot