How to get elements from a probability column prediction in a PySpark model

Question

How can I get the first element from the probability column of a model's results in a PySpark DataFrame?

+------+--------------------+
|labelh|         probability|
+------+--------------------+
|     1|[0.72498853530094...|
|     1|[0.99989771872286...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|
|     1|[0.72498853530094...|    
+------+--------------------+

The probability column holds two elements and has type "vector". I need to extract the first element of this column and add it to the DataFrame as a new column.

Thanks for the help.

It works, thank you! Question: why do you need to define a udf for this operation? I've seen this function, but I don't know how it works. — Juan David, Jun 01 '18 at 12:51
`Vector` is implemented as a `UserDefinedType`, and there is simply no built-in function that can operate on it. There was a proposal for a [VectorDisasembler](https://stackoverflow.com/a/41639236/8371915), but it has been rejected for now. — Alper t. Turker, Jun 01 '18 at 13:09
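
For anyone landing here, a minimal sketch of the udf approach discussed in the comments above. It is not necessarily the exact code that was originally posted. The names `first_element`, `add_first_probability` and the output column `prob_0` are made up for illustration, and the sketch assumes the `probability` column holds `pyspark.ml.linalg` vectors, which support integer indexing.

```python
def first_element(vector):
    """Return the first entry of a vector-like value as a plain Python float."""
    # pyspark.ml.linalg vectors can be indexed with an integer; float()
    # turns the resulting NumPy scalar into a value that fits a DoubleType column.
    return float(vector[0])


def add_first_probability(df, input_col="probability", output_col="prob_0"):
    """Return df with a new column holding element 0 of the vector column."""
    # Hypothetical helper: the column names are assumptions, adjust as needed.
    # Spark is imported here so first_element stays usable without it.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import DoubleType

    # Wrap the plain Python function so Spark can apply it row by row.
    first_element_udf = udf(first_element, DoubleType())
    return df.withColumn(output_col, first_element_udf(col(input_col)))
```

Usage would look like `df = add_first_probability(df)`, after which `df.select("labelh", "prob_0")` shows the extracted value next to the label. On Spark 3.0 and later, `pyspark.ml.functions.vector_to_array` converts the vector column to an array column, so the first element can be selected with `[0]` without a udf.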
