I'm using PySpark 3.1.2. The columns 'Vector1' and 'Vector2' are both of type VectorUDT:
+----------+---------+
| Vector1 | Vector2 |
+----------+---------+
|[10.0,8.0]|[7.0,6.0]|
| [3.0,5.0]|[9.0,2.0]|
| [1.0,3.0]|[4.0,7.0]|
| [1.0,5.0]|[9.0,3.0]|
| [2.0,8.0]|[2.0,0.0]|
| [8.0,7.0]|[3.0,6.0]|
+----------+---------+
How can I calculate the angle between Vector1 and Vector2 for each row?
I tried this:
from pyspark.sql import functions as F
from pyspark.sql.types import FloatType
from pyspark.ml.linalg import Vectors
angle_udf = F.udf(lambda x, y: x.dot(y) / (Vectors.norm(x, p=2) * Vectors.norm(y, p=2)), FloatType())
Vector = Vector.withColumn("Angle", angle_udf("Vector1", "Vector2"))
But I didn't get the results I wanted.
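One observation about the attempt above: the expression dot(x, y) / (|x| * |y|) is the cosine of the angle, not the angle itself, so an arccos step appears to be missing. A likely second issue is that x.dot(y) yields a NumPy scalar, which Spark may fail to convert for a Python UDF unless it is turned into a plain Python float. The following is a minimal sketch under those assumptions, not a definitive fix. The helper names angle_between and with_angle are hypothetical, the vectors are assumed to be dense, and DoubleType is swapped in for FloatType to match Python's float.

```python
import math


def angle_between(x, y):
    """Angle in radians between two equal-length vectors (None for a zero vector)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return None  # the angle is undefined when either vector has zero length
    # Clamp to [-1, 1] so rounding error cannot push acos outside its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)  # math.acos returns a plain Python float


def with_angle(df, col1="Vector1", col2="Vector2", out="Angle"):
    """Hypothetical helper: add the angle column to a DataFrame of dense vectors."""
    # Imported here so angle_between can be tried out without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    angle_udf = F.udf(angle_between, DoubleType())
    return df.withColumn(out, angle_udf(col1, col2))
```

With the DataFrame from the question this would be called as Vector = with_angle(Vector). The result is in radians; wrapping the return value in math.degrees would give degrees instead.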