+-----------+---+
|       Name|Age|
+-----------+---+
|Emma Larter| 34|
| Mia Junior| 59|
|Sophia Depp| 32|
|James Smith| 40|
+-----------+---+

I have a spark dataframe as above. I want to append a column to the dataframe using below list:

Salary = [35000, 24000, 55000, 40000]

How to do it in simple way using spark?

I can do this with pandas, but not spark.

notNull
2 Answers


In PySpark, use the zipWithIndex function to generate an index column on both DataFrames, then join on it.

Example:

from pyspark.sql.types import IntegerType

df = spark.createDataFrame([('Emma Larter',34),('Mia Junior',59),('Sophia',32),('James',40)],['Name','Age'])

# zipWithIndex pairs each row with a sequential index: (row, index)
df_ind = spark.createDataFrame(df.rdd.zipWithIndex(),['val','ind'])

Salary = [35000, 24000, 55000, 40000]
# Build an indexed DataFrame from the list in the same way
df_salary = spark.createDataFrame(spark.createDataFrame(Salary, IntegerType()).rdd.zipWithIndex(),['val1','ind'])

# Join on the index, sort to keep the original row order, expand the
# struct columns, and rename the list's default 'value' column to 'Salary'
df_ind.join(df_salary,['ind']).orderBy('ind').select("val.*","val1.*").withColumnRenamed('value','Salary').show()

#+-----------+---+------+
#|       Name|Age|Salary|
#+-----------+---+------+
#|Emma Larter| 34| 35000|
#| Mia Junior| 59| 24000|
#|     Sophia| 32| 55000|
#|      James| 40| 40000|
#+-----------+---+------+
notNull

You could easily convert your PySpark DataFrame to pandas using the toPandas() method, append the new column there, and convert back if needed.

from pyspark.shell import spark

data = [("Alice", 25), ("Bob", 30), ("Charlie", 35)]
df = spark.createDataFrame(data, ["name", "age"])

# Collect the data to the driver as a pandas DataFrame and append the column
new_pandas_df = df.toPandas()
new_pandas_df['gender'] = ['M', 'F', 'M']

print(new_pandas_df)

# Convert back to a Spark DataFrame if you need one:
# df = spark.createDataFrame(new_pandas_df)

Output:

      name  age gender
0    Alice   25      M
1      Bob   30      F
2  Charlie   35      M

Please note I've used a test DataFrame in my answer; change it according to yours. Also, pandas has its own downside since it does all the processing in driver memory, so keep that in mind when working with larger datasets.
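Applying the same idea to the question's own data: once you have a pandas frame, the append is plain column assignment, which relies on the list having exactly one value per row in row order. A minimal sketch (assuming the Name/Age values and Salary list from the question):

```python
import pandas as pd

# The question's data, rebuilt as a pandas DataFrame for illustration
pdf = pd.DataFrame({'Name': ['Emma Larter', 'Mia Junior', 'Sophia Depp', 'James Smith'],
                    'Age': [34, 59, 32, 40]})
Salary = [35000, 24000, 55000, 40000]

# Column assignment appends the list positionally, one value per row
pdf['Salary'] = Salary

# Hand the result back to Spark if a Spark DataFrame is needed:
# df = spark.createDataFrame(pdf)
print(pdf)
```

This only works when the list length matches the row count; pandas raises a ValueError otherwise, which is a useful sanity check before converting back to Spark.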

Kulasangar