
I have a probably quick and easy question regarding DataFrames in Scala Spark.

I have an existing Spark DataFrame (Scala 2.10.5 and Spark 1.6.3) and would like to add a new column of ArrayType or MapType, but I don't know how to achieve that. I would rather not create multiple columns holding single values, but store them all in one column. That would shorten my code and make it easier to adapt to changes.

import org.apache.spark.sql.types.MapType

...

// DataFrame initial creation
val df = ...

// adding new columns
val df_new = df
   .withColumn("new_col1", lit("something_to_add")) // add a literal
   .withColumn("new_col2", ???) // how to add a Map("key1" -> "val1", "key2" -> "val2") literal here?
marc_s
Tomasz
    Possible duplicate of [How to add a constant column in a Spark DataFrame?](https://stackoverflow.com/questions/32788322/how-to-add-a-constant-column-in-a-spark-dataframe) – 10465355 Nov 20 '19 at 10:52

1 Answer

You could try something like:

import org.apache.spark.sql.functions.{lit, typedLit}

val df_new = df
   .withColumn("new_col1", lit("something_to_add")) // add a literal
   .withColumn("new_col2", typedLit[Map[String, String]](Map("key1" -> "val1", "key2" -> "val2")))
dumitru
  • Unfortunately, my Spark 1.6.3 doesn't contain `org.apache.spark.sql.functions.typedLit`. Or is `typedLit` stored in another package? – Tomasz Nov 20 '19 at 11:02
  • `typedLit` is available from 2.2.0. You could try the following approach: define a user-defined function (e.g. `toMap`) that returns a `Map[String, String]`, and use it like `withColumn("new_col2", toMap(lit("key1->val1")))`. – dumitru Nov 20 '19 at 11:17
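The UDF workaround suggested in the comment could be sketched like this for Spark 1.6.3 (`org.apache.spark.sql.functions.udf` has been available since 1.3, and `functions.array` since 1.4). The zero-argument `toMap` UDF and the hard-coded keys below are illustrative assumptions, not part of the original answer:

```scala
import org.apache.spark.sql.functions.{array, lit, udf}

// Hypothetical zero-arg UDF that returns the same Map for every row;
// this sidesteps the missing typedLit on Spark 1.6.x.
val toMap = udf { () => Map("key1" -> "val1", "key2" -> "val2") }

val df_new = df
  .withColumn("new_col1", lit("something_to_add"))   // plain literal column
  .withColumn("new_col2", toMap())                   // MapType column
  .withColumn("new_col3", array(lit("a"), lit("b"))) // ArrayType column via functions.array
```

Note that `array(lit(...), ...)` already covers the ArrayType case natively on 1.6, so the UDF is only needed for the MapType column.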