-1

Hello folks. Suppose I have a pandas DataFrame and I want to calculate the cosine similarity between the first row and each of the remaining rows. Can anyone please help?

  • Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Apr 22 '18 at 15:03

1 Answer

0

Assuming your DataFrame contains only numeric values, you can loop over the remaining rows and compare each one with the first. Here 'u' refers to the first row of the DataFrame:

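import numpy as np
import pandas as pd

# df is assumed to be an existing DataFrame with only numeric columns
u = df.iloc[0]  # first row
norm_u = np.linalg.norm(u)
cos_sim_list = []
for i in range(1, df.shape[0]):
    v = df.iloc[i]
    dot = np.dot(u, v)
    norm_v = np.linalg.norm(v)
    # cosine similarity = dot(u, v) / (||u|| * ||v||); the parentheses matter
    cos_sim = dot / (norm_u * norm_v)
    cos_sim_list.append(cos_sim)

cos_sim_list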
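
If the DataFrame is large, the same calculation can probably be done without a Python loop. Below is a minimal sketch of a vectorized version, not a definitive implementation. The helper name `cosine_to_first_row` and the sample data are made up for illustration, and it assumes every column is numeric and no row is all zeros (a zero-norm row would produce NaN).

```python
import numpy as np
import pandas as pd


def cosine_to_first_row(df: pd.DataFrame) -> pd.Series:
    """Cosine similarity between row 0 and every other row (hypothetical helper)."""
    values = df.to_numpy(dtype=float)
    u, rest = values[0], values[1:]
    # Dot product of each remaining row with the first row.
    dots = rest @ u
    # Product of the two vector norms for each (u, v) pair.
    norms = np.linalg.norm(rest, axis=1) * np.linalg.norm(u)
    return pd.Series(dots / norms, index=df.index[1:], name="cos_sim")


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1.0, 2.0, 0.0], "b": [0.0, 0.0, 3.0]})
result = cosine_to_first_row(df)
print(result)
```

The result is a Series indexed by the original row labels, so each similarity can be traced back to its row. If scikit-learn is available, `sklearn.metrics.pairwise.cosine_similarity` should give the same numbers.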