
I am not good at Python. I have a CSV file, and I want to calculate cosine similarity or Euclidean distance from its data. But I don't know how to work with the CSV file, so please point me to documentation or materials I can use.

park
  • There are plenty of tutorials on the web about working with CSV files, like [this](https://realpython.com/python-csv/). For cosine similarity, you can check [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html) – Farzad Vertigo Sep 19 '19 at 03:39

1 Answer


You can compute the Euclidean distance with `scipy.spatial.distance.euclidean` or `numpy.linalg.norm`. Your problem can be broken down as follows:

Steps

  1. Load data from a CSV
  2. Calculate Euclidean distance

Solution

You will need these libraries.

import numpy as np
import pandas as pd
from scipy.spatial.distance import euclidean

Step-1

Load the CSV. We assume here that your CSV file has two or more columns and that the target columns are named x and y.

df = pd.read_csv("filename.csv")
x = df['x']
y = df['y']
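If you want to try the loading step without a file on disk, here is a minimal sketch. The CSV text, the extra `label` column, and the values are all made up for illustration; `io.StringIO` just stands in for "filename.csv". Converting the columns with `to_numpy` is optional, but it gives you plain NumPy arrays for the distance functions.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "filename.csv" (values are made up)
csv_text = "x,y,label\n1.0,4.0,a\n2.0,6.0,b\n3.0,8.0,c\n"

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Inspect the column names first, so you know what to select
print(df.columns.tolist())

# Pick the two target columns and convert them to float NumPy arrays
x = df["x"].to_numpy(dtype=float)
y = df["y"].to_numpy(dtype=float)

print(x)
print(y)
```

With a real file you would replace `io.StringIO(csv_text)` with the file path. If your columns have other names, use those instead of `x` and `y`.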

Step-2

Given two arrays of data, x and y, here is how you could calculate the Euclidean distance between them.

Make Data (if necessary)

# If you do not have data, make it
x = np.arange(10)
y = np.arange(10,20)

Calculate Euclidean Distance

# using scipy
scipy_d = euclidean(x,y)

# using numpy
numpy_d = np.linalg.norm(x-y)
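The question also asked about cosine similarity, which the snippet above does not cover. The sketch below is one way to get it, assuming the same made-up `x` and `y` arrays. It uses `scipy.spatial.distance.cosine`, which returns the cosine *distance* (1 minus the similarity), and cross-checks it against the plain formula dot(x, y) / (|x| |y|). It also confirms that the two Euclidean methods agree with the written-out formula.

```python
import numpy as np
from scipy.spatial.distance import cosine, euclidean

# Same example data as above
x = np.arange(10)
y = np.arange(10, 20)

# Euclidean distance: three equivalent ways
scipy_d = euclidean(x, y)
numpy_d = np.linalg.norm(x - y)
manual_d = np.sqrt(np.sum((x - y) ** 2))  # formula written out

# Cosine similarity from the definition: dot(x, y) / (|x| * |y|)
cos_sim_numpy = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# scipy returns cosine *distance*, so subtract it from 1
cos_sim_scipy = 1.0 - cosine(x, y)

print(scipy_d, numpy_d, manual_d)
print(cos_sim_numpy, cos_sim_scipy)
```

Note that cosine similarity is undefined if either vector is all zeros, so check for that case on real data.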

References

I would encourage you to see the following.

  1. How can the Euclidean distance be calculated with NumPy?
  2. scipy.spatial.distance.euclidean
  3. pandas.read_csv
CypherX
  • @park I am glad the solution helped you out. Thank you for choosing it as the `accepted` answer. Could you please also **`vote-up`** this answer? – CypherX Sep 19 '19 at 03:02