
I'm using Jupyter (kernel - Apache Toree) for analytics with Apache Spark/Scala. For visualization, I'm trying to use Vegas (GitHub - https://github.com/vegas-viz/Vegas).

When I use the sample Vegas code without the Vegas Spark extension, it works fine (please see the attached screenshot).

However, with DataFrames it does not seem to be showing the graphs (i.e. the graph is rendered but shows no data).

Here is the code -

%AddDeps org.vegas-viz vegas_2.11 0.3.11 --transitive

%AddDeps org.vegas-viz vegas-spark_2.11 0.3.11

import vegas._
import vegas.render.WindowRenderer._
import vegas.data.External._
import vegas.sparkExt._

val seq = Seq(("a", 16), ("b", 77), ("c", 45), ("d",101),("e", 132),("f", 166),("g", 51))
val df = seq.toDF("id", "value")

df.show()

+---+-----+
| id|value|
+---+-----+
|  a|   16|
|  b|   77|
|  c|   45|
|  d|  101|
|  e|  132|
|  f|  166|
|  g|   51|
+---+-----+

val usingSparkdf = Vegas("UsingSpark")
  .withDataFrame(df1)
  .encodeX("id")
  .encodeY("value")
  .mark(Bar)

usingSparkdf.show

[Screenshot: Vegas-with-DF]

[Screenshot: Vegas-without-DF]

What am I doing wrong here?

Is this the correct way to include the Vegas Spark extension?

 %AddDeps org.vegas-viz vegas-spark_2.11 0.3.11
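For what it's worth, I added the plain Vegas dependency above with --transitive; a variant worth trying (just an assumption on my part, the extra flag may be redundant if everything is already on the classpath) is to pull the Spark extension transitively as well:

%AddDeps org.vegas-viz vegas-spark_2.11 0.3.11 --transitive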
Karan Alang
  • Saw you already found your problem, but on top of that it seems like you're plotting `df1` while, as written in your question, you only defined `df`. – dieHellste Feb 19 '19 at 13:36

2 Answers


I was able to fix this issue: `encodeX` and `encodeY` need the (statistical) data type specified, i.e. `Quant`, `Nom`, or `Ord`, along with the column name.

The code below works fine.

val usingSparkdf = Vegas("UsingSpark")
  .withDataFrame(df1)
  .encodeX("id", Nom)
  .encodeY("value", Quant)
  .mark(Bar)

usingSparkdf.show
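As the comment on the question points out, only `df` is defined in the question while the snippet plots `df1`. For completeness, a minimal sketch of the same fix applied to the question's `df` (assuming the same Toree session and imports; the val name is just for illustration):

val usingSparkdfFixed = Vegas("UsingSpark")  // val name chosen for illustration
  .withDataFrame(df)
  .encodeX("id", Nom)
  .encodeY("value", Quant)
  .mark(Bar)

usingSparkdfFixed.show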
Karan Alang
package al.da.vg

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SparkSession}

import vegas._
import vegas.render.WindowRenderer._
import vegas.sparkExt._

object vegas_spark extends App {

  val conf = new SparkConf().setAppName("Vegas_Spark").setMaster("local[*]")
  val sc = new SparkContext(conf)
  val spark = SparkSession.builder().config(conf).appName("Vegas_Spark").getOrCreate()
  val sqlContext = new SQLContext(sc)
  import sqlContext.implicits._

  spark.sparkContext.setLogLevel("WARN")

  // Use a Seq of tuples (not Maps) so toDF can find an implicit Encoder
  val seq1 = Seq(
    ("A", 28), ("B", 55), ("C", 43),
    ("D", 91), ("E", 81), ("F", 53),
    ("G", 19), ("H", 87), ("I", 52))

  val df1 = seq1.toDF("a", "b")

  df1.show()

  // Plot the DataFrame: column "a" as the ordinal x-axis, column "b" as the quantitative y-axis
  val usingSparkdf1 = Vegas("Vegas_Spark")
    .withDataFrame(df1)
    .encodeX("a", Ordinal)
    .encodeY("b", Quantitative)
    .mark(Bar)
    .show

}
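In case someone wants to run this as a standalone sbt project rather than in Toree, here is a rough sketch of the build.sbt dependencies; the Spark version below is a placeholder, and only the Vegas coordinates (0.3.11, Scala 2.11) are taken from the question:

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.2.0",  // placeholder version; use whatever Spark you run
  "org.apache.spark" %% "spark-sql"   % "2.2.0",  // placeholder version
  "org.vegas-viz"    %% "vegas"       % "0.3.11",
  "org.vegas-viz"    %% "vegas-spark" % "0.3.11"
)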
avariant