
I am trying to query a DataFrame and add a column with a set (literal) value, but I'm not sure how to get it to work. I know how to do it in SQL, but I could use help converting it to PySpark.

PL/SQL example: SELECT 1 AS column1, 2 AS column2 FROM dual;


My PySpark attempt: empDF.select("name", col("").alias("nullColumn")).display()



1 Answer


Please have a look at the withColumn() function; used in conjunction with lit(), it adds a new column with a constant value to an existing DataFrame: https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.DataFrame.withColumn.html

Sample example:

from pyspark.sql.functions import lit

# Add a column with a constant value
df.withColumn("Country", lit("USA")).show()

# Chain withColumn() calls to add several constant columns
df.withColumn("Country", lit("USA")) \
  .withColumn("anotherColumn", lit("anotherValue")) \
  .show()
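
The same idea also works inside select(), which maps more directly onto the SQL in the question. Here is a sketch, assuming a SparkSession named spark is available and using the questioner's empDF; spark.range(1) is just my stand-in for Oracle's one-row dual table:

from pyspark.sql.functions import lit

# Equivalent of: SELECT 1 AS column1, 2 AS column2 FROM dual
spark.range(1).select(
    lit(1).alias("column1"),
    lit(2).alias("column2"),
).show()

# For the null-column attempt in the question, lit(None) produces
# a column of nulls (col("") is not a valid column reference):
empDF.select("name", lit(None).alias("nullColumn")).show()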

Example source: https://azurelib.com/withcolumn-usage-in-databricks-with-examples/

Hope it helps...
