I am trying to add a new String column to a dataframe with a default value of null (a non-null value will be applied later).

Here is my code:

.withColumn("column-name", lit(null: String))

This creates a column with the Void type, which I do not want.

What is the easiest way to create a column of type String with null default value?

Note: the structure of the set of jobs is set in stone, and I am leaving this company very soon, so I am not interested in arguing that the code should be restructured. I just want to give them the code they have asked for with the least fuss.

Note also that we aren't using a code-defined schema anywhere; it is pure schema inference.

RF1991
sil

1 Answer

You can use lit with null, then cast it to your desired type.

Example (lit comes from org.apache.spark.sql.functions, StringType from org.apache.spark.sql.types)

df.withColumn("test", lit(null).cast(StringType))

Output

+---+----+
|id |test|
+---+----+
|1  |null|
|2  |null|
|3  |null|
+---+----+

Schema

root
 |-- id: integer (nullable = false)
 |-- test: string (nullable = true)
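For completeness, here is a minimal self-contained sketch of the same approach. The likely reason the original attempt did not work is that the type ascription in lit(null: String) only exists at compile time: lit accepts Any, so at runtime it sees a bare null and cannot infer a string type. The object name NullStringColumn, the helper withNullStringColumn, the local session, and the sample data are all hypothetical and only there for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.StringType

object NullStringColumn {
  // Hypothetical helper: adds an all-null column that is typed as a string.
  def withNullStringColumn(df: DataFrame, name: String): DataFrame =
    df.withColumn(name, lit(null).cast(StringType))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; an existing job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-string-column")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the real dataframe.
    val df = Seq(1, 2, 3).toDF("id")

    // Without the cast, the literal carries no string type (the unwanted case from the question).
    val untyped = df.withColumn("test", lit(null))
    println(untyped.schema("test").dataType)

    // With the cast, the column is a nullable string.
    val typed = withNullStringColumn(df, "test")
    typed.printSchema()

    spark.stop()
  }
}
```

If you would rather not import StringType, cast also accepts the type name as a string, so lit(null).cast("string") should behave the same way.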

Good luck!

vilalabinot