How to divide a PySpark list into different columns?

Question

I have a DataFrame with one column. Each row of this column holds a list of two numbers: the first is an integer and the second is a double. For example, row 1 is [12, 14.5] and row 2 is [21, 27.3]. How can I split this list into two columns, so that the first number is in one column and the second number is in another?

Possible duplicate of [How to extract an element from a array in pyspark](https://stackoverflow.com/questions/45254928/how-to-extract-an-element-from-a-array-in-pyspark) and [Querying Spark SQL DataFrame with complex types](https://stackoverflow.com/questions/28332494/querying-spark-sql-dataframe-with-complex-types) — pault, Jan 24 '20 at 21:43
The first one works only for integers. The problem is that the second number in my list is not an integer. If you run the code you will realize it. — OMID Davami, Jan 24 '20 at 23:52

score 0 · Answer 1 · edited Jan 25 '20 at 05:47

Add or update the following line in your code (`col` is imported from `pyspark.sql.functions`):

df = df.select(col('vals.val1').alias("val1"), col('vals.val2').alias("val2"))

edited Jan 25 '20 at 05:47

Anish B.

answered Jan 24 '20 at 20:51

OMID Davami

This implies that `vals` is a `StructType`, which is not what you said in your question statement. – pault Jan 24 '20 at 21:45
While this code snippet may solve the problem, [including an explanation](https://meta.stackoverflow.com/q/392712/2648551) really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. Please also try not to crowd your code with explanatory comments, as this reduces the readability of both the code and the explanations! – colidyre Jan 25 '20 at 01:21
