Questions tagged [earth-movers-distance]

The Earth Mover's Distance (EMD), also known as the Wasserstein metric, measures the distance between two probability distributions over a region.

18 questions
13
votes
1 answer

Python Earth Mover Distance of 2D arrays

I would like to compute the Earth Mover Distance between two 2D arrays (these are not images). Right now I go through two libraries: scipy (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html) and pyemd…
Luca
  • 324
  • 2
  • 15
4
votes
2 answers

Reference for wasserstein distance function in python

We are trying to calculate the distance between two discrete 1-d distributions. Our purpose is to compute a distance function that follows the intuition of optimal transport: our distributions are masses at "points", i.e., vectors, with importance to…
3
votes
1 answer

Calculating EMD for 3D point-clouds is very SLOW

I want to calculate the distance between two 3D point clouds, each with at least 2000 points, using Earth Mover's Distance with the following code. However, it is too slow and does not work properly. Is there any way to calculate it for approximate…
3
votes
0 answers

t-SNE using earth mover distance metric

I am trying to use t-SNE with Wasserstein distance instead of Euclidean. Here is part of my code: from sklearn.manifold import TSNE from scipy.stats import wasserstein_distance tsne = TSNE(n_components=2,perplexity=40, n_iter=1000,…
2
votes
0 answers

How to customize an XGBoost objective function for an ordinal classification problem?

I am training a model to classify an ordinal response variable with 10 levels. I have studied a paper called "Squared Earth Mover’s Distance-based Loss for Training Deep Neural Networks" (https://arxiv.org/pdf/1611.05916.pdf) and I…
2
votes
0 answers

wasserstein distance for multiple histograms

I'm trying to calculate the distance matrix between histograms. I can only find code for calculating the distance between 2 histograms, and my data have more than 10. My data is a CSV file, and each histogram is a column that adds up to 100. Which…
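
One possible way to approach the pairwise case is sketched below. It assumes the histograms share the same bins, that bin positions are simply 0, 1, 2, … (an assumption, not something stated in the question), and that each column is one histogram. The helper name `pairwise_w1` is made up for illustration. `scipy.stats.wasserstein_distance` normalizes the weights, so columns summing to 100 should not need rescaling.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def pairwise_w1(hists, positions=None):
    """Return a symmetric matrix of 1-D Wasserstein distances between columns.

    hists: 2-D array-like, one histogram per column (hypothetical layout).
    positions: bin locations; defaults to 0..n_bins-1 (an assumption).
    """
    hists = np.asarray(hists, dtype=float)
    n_bins, n_hists = hists.shape
    if positions is None:
        positions = np.arange(n_bins, dtype=float)

    dist = np.zeros((n_hists, n_hists))
    for i in range(n_hists):
        for j in range(i + 1, n_hists):
            # Same support for both histograms; the bin counts act as weights.
            d = wasserstein_distance(
                positions, positions,
                u_weights=hists[:, i], v_weights=hists[:, j],
            )
            dist[i, j] = dist[j, i] = d
    return dist


# Made-up data: three histograms over three bins, each column sums to 100.
example = np.array([
    [100.0, 0.0, 50.0],
    [0.0, 0.0, 50.0],
    [0.0, 100.0, 0.0],
])
print(pairwise_w1(example))
```

With real data, the columns of the CSV could be loaded into the `hists` array, and actual bin centers passed as `positions` if the bins are not evenly spaced.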
1
vote
0 answers

Can scipy.stats.wasserstein_distance be used with empirical distributions of different (unequal) sizes?

For the evaluation of a system, I have measured a metric of interest across three distinct configurations (settings). I thus have three arrays of observations, observations_setting_1, observations_setting_2, and observations_setting_3, for example,…
simon
  • 11
  • 3
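
As a small illustration of the unequal-size case: `scipy.stats.wasserstein_distance` treats each array as an empirical distribution, so the two inputs need not have the same length. The arrays below are made-up values, not the asker's observations.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two samples of different sizes (made-up data for illustration).
sample_small = np.array([0.0])
sample_large = np.array([1.0, 2.0, 3.0])

# Each array is treated as an empirical distribution with equal mass per point.
distance = wasserstein_distance(sample_small, sample_large)
print(distance)
```

Here all the mass at 0 has to be moved to 1, 2 and 3 in equal thirds, so the result is the average of those three displacements.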
1
vote
0 answers

Wasserstein distance between two distributions python

I have distributions of some data before and after an event. I want to find the distance between these two distributions. To put it differently, how much would I need to scale the pre-event distribution to come close to the post-event…
1
vote
1 answer

scipy.stats.wasserstein_distance implementation

I am trying to understand the implementation used in scipy.stats.wasserstein_distance. For p=1 and no weights, with u_values and v_values the two 1-D distributions, the code comes down to u_sorter = np.argsort(u_values) (1) v_sorter =…
kam
  • 11
  • 1
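
For readers following the p=1, unweighted case, here is a minimal sketch of the general idea: in one dimension the distance equals the area between the two empirical CDFs. This is a re-derivation written for illustration, not a copy of SciPy's source, and the function name `w1_via_cdfs` is made up.

```python
import numpy as np


def w1_via_cdfs(u_values, v_values):
    """1-D Wasserstein-1 distance as the area between two empirical CDFs."""
    u_sorted = np.sort(np.asarray(u_values, dtype=float))
    v_sorted = np.sort(np.asarray(v_values, dtype=float))

    # Pool all sample points; both CDFs are constant between neighbours.
    all_values = np.sort(np.concatenate((u_sorted, v_sorted)))
    deltas = np.diff(all_values)

    # Empirical CDF value of each sample on every interval's left endpoint.
    u_cdf = np.searchsorted(u_sorted, all_values[:-1], side="right") / u_sorted.size
    v_cdf = np.searchsorted(v_sorted, all_values[:-1], side="right") / v_sorted.size

    # Sum |F_u - F_v| times the interval width.
    return float(np.sum(np.abs(u_cdf - v_cdf) * deltas))


print(w1_via_cdfs([0, 1, 3], [5, 6, 8]))
```

The result can be compared against `scipy.stats.wasserstein_distance` on the same inputs as a sanity check.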
0
votes
1 answer

transport unbalanced does not work when output="all"

I want to compute the cost of transporting one distribution of mass to another (in the fashion of earth mover distance). I want to use an unbalanced transport. I use the transport library and it works when I want only the distance; however…
Noskario
  • 378
  • 1
  • 9
0
votes
1 answer

What's the optimal solution for 1D optimal transport

Assume I want to move n goods to n warehouses. I have an n x n cost matrix M, where Mij denotes the cost of transporting the jth good to the ith warehouse. How do I find the transport plan that minimizes the total cost? I know there are many general…
Alex Fu
  • 113
  • 1
  • 8
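
For the general n x n cost-matrix formulation, one commonly used tool is `scipy.optimize.linear_sum_assignment`, which solves the assignment problem. The sketch below uses a made-up 3 x 3 cost matrix and does not exploit any special 1D structure the question may have in mind.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Made-up cost matrix: cost[i, j] is the cost of pairing row i with column j.
cost = np.array([
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
])

# Returns one column index per row such that the summed cost is minimal.
row_ind, col_ind = linear_sum_assignment(cost)
total_cost = cost[row_ind, col_ind].sum()

print(list(zip(row_ind.tolist(), col_ind.tolist())), total_cost)
```

Whether rows stand for warehouses or goods only changes how the returned index pairs are read, not the minimal total cost.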
0
votes
0 answers

Using Earth Mover's Distance for multi-dimensional vectors with unequal length

I am working on a project which involves calculating sentence similarity. Context vectors for each token in a sentence are generated using Hugging Face's BERT. The code below returns all the token vectors in a sentence. sentence= "Hello this is a…
0
votes
1 answer

Calculate similarity in spatial utilization between .tif rasters using Earth Mover's Distance (EMD)

I am analyzing animal tracking data within an acoustic receiver array using dynamic Brownian bridge movement models. The dataset contains an animal identifier, and every detection on a receiver has a timestamp and lat/lon coordinates (in decimal…
vheim
  • 33
  • 6
0
votes
1 answer

Exact Earth Mover's Distance (NOT Mallows Distance) Python Code

Is there any Python library for computing EMD between two signatures? There are multiple options to compute EMD between two distributions (e.g. pyemd). But I didn't find any implementation for the exact EMD value. For example, consider Signature_1 =…
Unknown
  • 53
  • 1
  • 5
0
votes
0 answers

Spaces within Python function input arguments because it's a Cython function?

I know nothing about Cython, only Python. The function emd_c below (from the pot optimal transport package) has a header whose argument formats I've never seen before, or which I don't think would work under stand-alone Python/NumPy, but perhaps…
develarist
  • 1,224
  • 1
  • 13
  • 34