
I am trying to query a DataFrame and add a column with a set (literal) value, but I'm not sure how to get it to work. I know how to do it in SQL, but I could use help converting it to PySpark.

PL/SQL example: SELECT 1 AS column1, 2 AS column2 FROM dual;


My PySpark attempt: empDF.select("name", col("").alias("nullColumn")).display()



1 Answer


Please have a look at the withColumn() function; used in conjunction with lit(), it adds a new column with a constant value to an existing DataFrame: https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.DataFrame.withColumn.html

Sample example:

from pyspark.sql.functions import lit

# Add a column with a constant value
df.withColumn("Country", lit("USA")).show()

# Chain withColumn() calls to add several constant columns
df.withColumn("Country", lit("USA")) \
  .withColumn("anotherColumn", lit("anotherValue")) \
  .show()
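
The same idea also works inside select(), which maps more directly onto the SQL in the question. Here is a sketch, assuming a SparkSession named spark is available and using the questioner's empDF; spark.range(1) is just my stand-in for Oracle's one-row dual table:

from pyspark.sql.functions import lit

# Equivalent of: SELECT 1 AS column1, 2 AS column2 FROM dual
spark.range(1).select(
    lit(1).alias("column1"),
    lit(2).alias("column2"),
).show()

# For the null-column attempt in the question, lit(None) produces
# a column of nulls (col("") is not a valid column reference):
empDF.select("name", lit(None).alias("nullColumn")).show()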

Example source: https://azurelib.com/withcolumn-usage-in-databricks-with-examples/

Hope it helps...
