In this question I was told how to print a DataFrame using Zeppelin's z.show command. This works well, except that 'WrappedArray' appears in the lemma column.

I have tried this:

z.show(dfLemma.select(concat_ws(",", $"lemma")))

but that just gave me an unformatted list of words, and I also want the racist column in my output. Any help is much appreciated.

schoon

1 Answer

Here's a suggestion for formatting your array column:

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions._
import sqlContext.implicits._

val df = Seq(
  (1, Array("An", "Array")), (2, Array("Another", "Array"))
).toDF("first", "second")

// Render an array column as a single string, e.g. [An, Array]
def formatArrayColumn(arrayColumn: Column): Column = {
  concat(lit("["), concat_ws(", ", arrayColumn), lit("]")).as(s"format(${arrayColumn.expr})")
}

val result = df.withColumn("second", formatArrayColumn($"second"))

z.show(result)

Which results in:

(Screenshot of the Zeppelin table: columns first and second, with second shown as [An, Array] and [Another, Array].)
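For the case in the question, where the racist column should appear next to the formatted lemma column, the same idea could be applied inside a select. This is only a sketch under assumptions: `sample` is a hypothetical stand-in for dfLemma with made-up rows, lemma is assumed to be an array of strings, and the code is assumed to run in a Zeppelin Spark paragraph where `sqlContext` and `z` are already defined.

```scala
import org.apache.spark.sql.functions.{col, concat, concat_ws, lit}
import sqlContext.implicits._

// Hypothetical stand-in for dfLemma: only the two columns named in the question
val sample = Seq(
  (false, Seq("rt", "when", "you")),
  (true, Seq("another", "tweet"))
).toDF("racist", "lemma")

// Keep `racist` unchanged and render `lemma` as a bracketed, comma-separated string
val formatted = sample.select(
  col("racist"),
  concat(lit("["), concat_ws(", ", col("lemma")), lit("]")).as("lemma")
)

z.show(formatted)
```

As the comments below report, a second show statement in the same paragraph caused the output to appear as one long string, so it may help to keep z.show as the only display call in the paragraph.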

Daniel de Paula
  • Thanks Daniel, but it just outputs a big string (also when I select just two columns): import org.apache.spark.sql.functions.concat_ws formatArrayColumn: (arrayColumn: org.apache.spark.sql.Column)org.apache.spark.sql.Column result: org.apache.spark.sql.DataFrame = [racist: boolean, contributors: string ... 27 more fields] %table racist lemma false [rt, @dope_promo, :, when, you, and, you, crew, beat – schoon Jul 11 '17 at 13:26
  • @schoon So please be clear in your question about what you are expecting as output for this column. – Daniel de Paula Jul 11 '17 at 13:34
  • I was expecting what you have produced. But all I get is a long string of text. – schoon Jul 11 '17 at 14:37
  • @schoon can you update your question with what you tried, what was the result and what you expect? – Daniel de Paula Jul 11 '17 at 14:40
  • Never mind, it was because I had another show statement. When I got rid of it it worked! Thanks! – schoon Jul 11 '17 at 14:42