Based on my previous questions: 1, 2. Suppose I have the following DataFrame:
df = spark.createDataFrame(
    [(1, "a", 23.0), (3, "B", -23.0)],
    ("x1", "x2", "x3"))
I want to add a new column x4, but the values for it live in a Python list rather than in the DataFrame, e.g. x4_ls = [35.0, 32.0]. What is the best way to add this list as a new column of the Spark DataFrame? (Note that I am using Spark 2.1.)
The output should look something like this:
## +---+---+-----+----+
## | x1| x2| x3| x4|
## +---+---+-----+----+
## | 1| a| 23.0|35.0|
## | 3| B|-23.0|32.0|
## +---+---+-----+----+
I can also transform my list into a DataFrame:
from pyspark.sql import Row
df_x4 = spark.createDataFrame([Row(**{'x4': x}) for x in x4_ls])
but I don't know how to concatenate the two DataFrames row-wise: they share no key column, and the rows only line up by position.
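For reference, here is a rough, untested sketch of the kind of positional join I have in mind, reusing spark, df and x4_ls from above. The zipWithIndex indexing and the idx/x4 column names are just my own assumptions, not something I have verified on Spark 2.1:

# Attach a positional index to every row of df through the RDD API.
df_idx = df.rdd.zipWithIndex().map(
    lambda pair: pair[0] + (pair[1],)).toDF(df.columns + ["idx"])

# Index the Python list the same way and turn it into a small DataFrame.
x4_df = spark.createDataFrame(
    [(i, v) for i, v in enumerate(x4_ls)], ("idx", "x4"))

# Join on the shared positional index and drop it again.
result = df_idx.join(x4_df, on="idx").drop("idx")
result.show()

Is something like this reasonable, or is there a more idiomatic way to do it in Spark 2.1?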