
I have a Spark DataFrame whose contents are as follows:

+----------------------------------------+
|probability                             |
+----------------------------------------+
|[0.42789998388333284,0.5721000161166672]|
|[0.42979424193820465,0.5702057580617953]|
|[0.4288468523208701,0.57115314767913]   |
+----------------------------------------+

The type of the "probability" column is:

org.apache.spark.sql.DataFrame = [probability: vector]

How can I split "probability" into two columns?

Thanks.

xuguozheng

1 Answer


You can do it like this using the Dataset API:

import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.ml.linalg.Vector

import spark.implicits._ // provides the tuple encoder needed by map

df
  .as[Vector](ExpressionEncoder(): Encoder[Vector]) // read the single vector column as a typed Dataset
  .map(v => (v(0), v(1)))                           // extract the two vector elements
  .toDF("prob1", "prob2")
  .show()
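If you are on Spark 3.0 or later, `vector_to_array` offers a purely column-based alternative that avoids the typed map and explicit encoder. A minimal sketch, assuming Spark 3.0+ and using hypothetical sample data shaped like the question's "probability" column:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.ml.functions.vector_to_array
import org.apache.spark.ml.linalg.Vectors

val spark = SparkSession.builder().master("local[1]").appName("split-prob").getOrCreate()
import spark.implicits._

// Hypothetical sample data mirroring the question's vector column
val df = Seq(
  Tuple1(Vectors.dense(0.42789998388333284, 0.5721000161166672)),
  Tuple1(Vectors.dense(0.42979424193820465, 0.5702057580617953))
).toDF("probability")

val result = df
  .withColumn("arr", vector_to_array(col("probability"))) // vector -> array<double>
  .select(
    col("arr").getItem(0).as("prob1"), // first class probability
    col("arr").getItem(1).as("prob2")  // second class probability
  )

result.show()
```

Since everything stays in the DataFrame API, Catalyst can optimize the whole query, which a typed `map` over a Dataset prevents.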
Raphael Roth