
I'm looking for a way to add a new column to a Spark DataFrame from a list. With the pandas approach it is very easy, but in Spark it seems to be relatively difficult. Please find an example below:

#pandas approach
list_example = [1,3,5,7,8]
df['new_column'] = list_example

#spark ?

Could you please help me solve this (ideally with the easiest possible solution)?


1 Answer


You could try something like:

import pyspark.sql.functions as F

list_example = [1, 3, 5, 7, 8]

# This adds an array column: every row gets the full list [1, 3, 5, 7, 8].
new_df = df.withColumn("new_column", F.array([F.lit(x) for x in list_example]))
new_df.show()
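
Note that this gives every row the same array containing the whole list, which is not quite the pandas behaviour of one value per row. If per-row alignment is what you need, one possible sketch (not part of the answer above; it assumes df is small enough that a global ordering is acceptable, and row_idx is just an illustrative helper column name) is to number the rows on both sides and join:

from pyspark.sql import SparkSession, Window
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

list_example = [1, 3, 5, 7, 8]

# Attach a sequential row number to the existing DataFrame.
# Ordering by monotonically_increasing_id() keeps the current row order,
# but it runs the window over a single partition, so this is only
# practical for reasonably small DataFrames.
w = Window.orderBy(F.monotonically_increasing_id())
df_indexed = df.withColumn("row_idx", F.row_number().over(w))

# Build a small DataFrame from the list with matching row numbers.
list_df = spark.createDataFrame(
    [(i + 1, v) for i, v in enumerate(list_example)],
    ["row_idx", "new_column"],
)

# Join on the row number and drop the helper column.
new_df = df_indexed.join(list_df, on="row_idx", how="inner").drop("row_idx")
new_df.show()

This assumes df has exactly len(list_example) rows; with an inner join, any mismatch in row counts will silently drop rows.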