I am not good at Python. I have a CSV file, and I want to calculate cosine similarity or Euclidean distance from its data. However, I don't know how to work with the CSV file, so please point me to documentation or materials I can use.
There are plenty of tutorials on the web about working with CSV files, like [this](https://realpython.com/python-csv/). For cosine similarity, you can check [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html) – Farzad Vertigo Sep 19 '19 at 03:39
1 Answer
You can find the Euclidean distance using `scipy.spatial.distance.euclidean` or `numpy.linalg.norm`. Your problem statement can be broken down as follows:
Steps
- Load data from a CSV
- Calculate euclidean distance
Solution
You would need these libraries.
import numpy as np
import pandas as pd
from scipy.spatial.distance import euclidean
Step-1
Load the CSV. We assume here that your CSV file has two or more columns and that the target columns are named `x` and `y`.
df = pd.read_csv("filename.csv")
x = df['x']
y = df['y']
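If you want to try this step without a file on disk, here is a minimal sketch. It assumes numeric columns named `x` and `y`; the inline sample data (including the extra `label` column) is made up for illustration and stands in for `filename.csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "filename.csv"
csv_text = "x,y,label\n0,10,a\n1,11,b\n2,12,c\n"

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Select the two target columns and convert them to plain NumPy arrays
x = df["x"].to_numpy(dtype=float)
y = df["y"].to_numpy(dtype=float)

print(df.columns.tolist())
print(x, y)
```

Converting with `to_numpy` is optional, since the distance functions also accept pandas Series, but it makes it explicit that only the numeric values are being passed on.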
Step-2
Given two arrays of data, `x` and `y`, here is how you could calculate the Euclidean distance between them.
Make Data (if necessary)
# If you do not have data, make it
x = np.arange(10)
y = np.arange(10,20)
Calculate Euclidean Distance
# using scipy
scipy_d = euclidean(x,y)
# using numpy
numpy_d = np.linalg.norm(x-y)
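The question also asks about cosine similarity, so here is a rough sketch along the same lines, reusing the made-up `x` and `y` arrays. Note that `scipy.spatial.distance.cosine` returns the cosine distance, so the similarity is one minus that value.

```python
import numpy as np
from scipy.spatial.distance import cosine

# Same illustrative data as in the Euclidean example
x = np.arange(10)
y = np.arange(10, 20)

# using numpy: cos(x, y) = dot(x, y) / (||x|| * ||y||)
numpy_cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# using scipy: cosine() is a distance, so similarity = 1 - distance
scipy_cos = 1.0 - cosine(x, y)

print(numpy_cos, scipy_cos)
```

Both approaches should agree up to floating-point error. Cosine similarity is undefined when either vector is all zeros, so you may want to check for that case with real data.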
CypherX
@park I am glad the solution helped you out. Thank you for choosing it as the `accepted` answer. Could you please also **`vote-up`** this answer? – CypherX Sep 19 '19 at 03:02