Hello folks, suppose I have a pandas dataframe and I want to calculate the cosine similarity between the first row of the dataframe and each of the remaining rows. Can anyone please help?
Viewed 602 times
Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Apr 22 '18 at 15:03
1 Answer
0
Assuming your dataframe contains only numeric values, the code below computes the cosine similarity between the first row ('u') and each remaining row ('v'):
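import pandas as pd
import numpy as np

# df is assumed to be an existing DataFrame with only numeric columns
u = df.iloc[0]  # first row
cos_sim_list = []
norm_u = np.linalg.norm(u)
for i in range(1, df.shape[0]):
    v = df.iloc[i]
    dot = np.dot(u, v)
    norm_v = np.linalg.norm(v)
    # divide by the product of both norms (the parentheses are required)
    cos_sim = dot / (norm_u * norm_v)
    cos_sim_list.append(cos_sim)
cos_sim_list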
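
For larger dataframes, the same computation can probably be done without a Python loop. The following is only a sketch under assumptions: the sample `df` is made-up data for illustration, and the names `X` and `rest` are not from the answer above. It computes dot(u, v) / (‖u‖ · ‖v‖) for every remaining row at once.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; any all-numeric DataFrame should work the same way.
df = pd.DataFrame({"a": [1.0, 2.0, 0.0, -1.0], "b": [0.0, 0.0, 3.0, 0.0]})

X = df.to_numpy(dtype=float)
u, rest = X[0], X[1:]  # first row, and all remaining rows

# Dot product of u with each remaining row, divided by the product of the norms.
cos_sim = (rest @ u) / (np.linalg.norm(rest, axis=1) * np.linalg.norm(u))
cos_sim_list = cos_sim.tolist()
```

Note that a row consisting entirely of zeros has norm 0, so its similarity comes out as `nan` in both the loop and this version. If scikit-learn is available, `sklearn.metrics.pairwise.cosine_similarity` is another option.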

Muhammad Umar Amanat