I'm using pyspark 3.1.2.

The columns 'Vector1' and 'Vector2' are both of type VectorUDT:

  +----------+---------+
  |  Vector1 | Vector2 |
  +----------+---------+
  |[10.0,8.0]|[7.0,6.0]|
  | [3.0,5.0]|[9.0,2.0]|
  | [1.0,3.0]|[4.0,7.0]|
  | [1.0,5.0]|[9.0,3.0]|
  | [2.0,8.0]|[2.0,0.0]|
  | [8.0,7.0]|[3.0,6.0]|
  +----------+---------+

How can I calculate the angle between Vector1 and Vector2 for each row?

I tried to:

from pyspark.ml.linalg import Vectors
from pyspark.sql import functions as F
from pyspark.sql.types import FloatType
angle_udf = F.udf(lambda x, y: x.dot(y) / (Vectors.norm(x, p=2) * Vectors.norm(y, p=2)), FloatType())
Vector = Vector.withColumn("Angle", angle_udf("Vector1", "Vector2"))

But I didn't get the results I wanted.

Drizzle
  • Is [this](https://stackoverflow.com/questions/14066933/direct-way-of-computing-clockwise-angle-between-2-vectors) what you need? – Gresta Jan 05 '22 at 01:27
  • `angle = acos( dot(v_1,v_2)/(norm(v_1)*norm(v_2)) )` – John Alexiou Jan 05 '22 at 01:37
  • How can I calculate it in a DataFrame? – Drizzle Jan 05 '22 at 01:39
  • Looking at the official docs, vectors are meant for ML, and there is no "angle" method on this kind of object. But if you consider your vector as simply an array of coordinates, there are some basic SQL functions that you can use to compute angles (with acos or norm functions, for example). – Steven Jan 05 '22 at 09:16

1 Answer

To get the (signed) angle between two 2-D vectors, I would suggest using the built-in atan2 function with the following formula:

angle = atan2(v2.y, v2.x) - atan2(v1.y, v1.x)
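
As a quick sanity check of this formula outside Spark, here is a small plain-Python sketch. The function names `signed_angle` and `unsigned_angle` are made up for illustration. One detail worth noting is that the raw difference of two `atan2` values can fall anywhere in [-2π, 2π], so the sketch folds it back into [-π, π]; the `acos` variant from the comments is included for comparison and returns the unsigned angle.

```python
import math


def signed_angle(v1, v2):
    """Signed angle in radians from 2-D vector v1 to v2, folded into [-pi, pi]."""
    raw = math.atan2(v2[1], v2[0]) - math.atan2(v1[1], v1[0])
    # The raw difference can lie anywhere in [-2*pi, 2*pi]; wrap it.
    return math.atan2(math.sin(raw), math.cos(raw))


def unsigned_angle(v1, v2):
    """Unsigned angle in radians via acos(dot / (|v1| * |v2|)); any dimension."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm1 * norm2))))


# First row of the question's table: both formulas agree up to sign.
print(signed_angle((10.0, 8.0), (7.0, 6.0)))
print(unsigned_angle((10.0, 8.0), (7.0, 6.0)))
```

The sign tells you the direction of rotation from v1 to v2 (positive is counter-clockwise); take the absolute value if only the magnitude matters.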

The PySpark implementation would then look like this, assuming that the vector components are available as columns in your DataFrame named x1, y1, x2, y2:

from pyspark.sql import functions as f
df = df.withColumn('angle', f.atan2(f.col('y2'), f.col('x2')) - f.atan2(f.col('y1'), f.col('x1')))
elyptikus