
I am working in Scala and want to set every value in an existing DataFrame column to null.

If that is not possible, I at least want to fill the column with empty strings.

What is an efficient way to do either of these?

Note: I don't want to add a new column; I want to modify an existing one.

Thanks

user10360768
  • Does this answer your question? [Add an empty column to Spark DataFrame](https://stackoverflow.com/questions/33038686/add-an-empty-column-to-spark-dataframe) – user10938362 Feb 08 '20 at 00:45

1 Answer


You can call .withColumn with the name of the existing column; Spark then replaces that column instead of adding a new one.

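import org.apache.spark.sql.functions._
import spark.implicits._ // needed for .toDF; assumes a SparkSession named spark, as in spark-shell
val df = Seq(("1","a"),("2","b")).toDF("id","name")
df.show()
//+---+----+
//|id |name|
//+---+----+
//|1  |a   |
//|2  |b   |
//+---+----+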

val df1 = df.withColumn("id", lit(null)) // replace id with null in every row
df1.show()
//+----+----+
//|id  |name|
//+----+----+
//|null|a   |
//|null|b   |
//+----+----+

val df2 = df.withColumn("id", lit("")) // replace id with an empty string in every row
df2.show()
//+---+----+
//|id |name|
//+---+----+
//|   |a   |
//|   |b   |
//+---+----+
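
One caveat with a bare lit(null): the replaced column takes Spark's NullType rather than the column's original type, which can cause trouble later (for example when writing to some file formats). The sketch below casts the null literal back to the column's existing type. The `nullify` helper and the `NullifyColumn` object are hypothetical names used only for illustration, and the local SparkSession setup is an assumption about how you run it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

object NullifyColumn {
  // Hypothetical helper: overwrite `colName` with nulls while keeping
  // the data type the column already has in the DataFrame's schema.
  def nullify(df: DataFrame, colName: String): DataFrame =
    df.withColumn(colName, lit(null).cast(df.schema(colName).dataType))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for demonstration purposes only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("nullify-column")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("1", "a"), ("2", "b")).toDF("id", "name")

    val nulled = nullify(df, "id")
    nulled.printSchema() // id should still be reported as a string column
    nulled.show()

    spark.stop()
  }
}
```

Since withColumn only adds a projection to the query plan, nothing is computed until an action such as show() runs, so this is about as cheap as overwriting a column gets.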
notNull