17

I am looking for an Earth Mover's Distance (or Fast EMD) implementation in Python. Any clues on where to find one? I have already searched the web extensively. I want to use it in an image retrieval project that I am working on. Thanks.

EDIT: I found a very nice solution using the PuLP library. This page also has the instructions required to set it up.

Frédéric Hamidi
vishalv2050
  • Perhaps if you included a link to a definition of this term, you would save your potential answerers a trip to Google. – PaulMcG Feb 24 '11 at 06:08
  • If it exists in C somewhere you can use it from Python. – KyleWpppd Feb 24 '11 at 06:34
  • Do you think you could try to elaborate on the solution you found? I'm having the same problem :) – Will Feb 27 '14 at 19:48
  • @Will Were you successful in installing the PuLP library from the link that I mentioned in the EDIT? – vishalv2050 Feb 28 '14 at 11:26
  • Earth mover's distance (**EMD**) is also known as the **Wasserstein metric**. You can get a Python implementation of it from `scipy.stats`: [https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html) – dvitsios Jun 04 '19 at 10:50

2 Answers

28

There is an excellent implementation in OpenCV for Python. The function is called CalcEMD2, and simple code to compare the histograms of two images would look like this:

# Import the OpenCV library (the legacy `cv` module used below is only available in OpenCV 2.x)
from cv2 import *

### HISTOGRAM FUNCTION #########################################################
def calcHistogram(src):
    # Convert to HSV
    hsv = cv.CreateImage(cv.GetSize(src), 8, 3)
    cv.CvtColor(src, hsv, cv.CV_BGR2HSV)

    # Extract the H and S planes
    size = cv.GetSize(src)
    h_plane = cv.CreateMat(size[1], size[0], cv.CV_8UC1)
    s_plane = cv.CreateMat(size[1], size[0], cv.CV_8UC1)
    cv.Split(hsv, h_plane, s_plane, None, None)
    planes = [h_plane, s_plane]

    # Define number of bins
    h_bins = 30
    s_bins = 32

    #Define histogram size
    hist_size = [h_bins, s_bins]

    # hue varies from 0 (~0 deg red) to 180 (~360 deg red again)
    h_ranges = [0, 180]

    # saturation varies from 0 (black-gray-white) to 255 (pure spectrum color)
    s_ranges = [0, 255]

    ranges = [h_ranges, s_ranges]

    #Create histogram
    hist = cv.CreateHist([h_bins, s_bins], cv.CV_HIST_ARRAY, ranges, 1)

    #Calc histogram
    cv.CalcHist([cv.GetImage(i) for i in planes], hist)

    cv.NormalizeHist(hist, 1.0)

    #Return histogram
    return hist

### EARTH MOVERS ############################################################
def calcEM(hist1,hist2,h_bins,s_bins):

    #Define number of rows
    numRows = h_bins*s_bins

    sig1 = cv.CreateMat(numRows, 3, cv.CV_32FC1)
    sig2 = cv.CreateMat(numRows, 3, cv.CV_32FC1)    

    for h in range(h_bins):
        for s in range(s_bins): 
            bin_val = cv.QueryHistValue_2D(hist1, h, s)
            cv.Set2D(sig1, h*s_bins+s, 0, cv.Scalar(bin_val))
            cv.Set2D(sig1, h*s_bins+s, 1, cv.Scalar(h))
            cv.Set2D(sig1, h*s_bins+s, 2, cv.Scalar(s))

            bin_val = cv.QueryHistValue_2D(hist2, h, s)
            cv.Set2D(sig2, h*s_bins+s, 0, cv.Scalar(bin_val))
            cv.Set2D(sig2, h*s_bins+s, 1, cv.Scalar(h))
            cv.Set2D(sig2, h*s_bins+s, 2, cv.Scalar(s))

    # This is the important line where the OpenCV EMD algorithm is called
    return cv.CalcEMD2(sig1,sig2,cv.CV_DIST_L2)

### MAIN ########################################################################
if __name__=="__main__":
    #Load image 1
    src1 = cv.LoadImage("image1.jpg")

    # Load image 2
    src2 = cv.LoadImage("image2.jpg")

    # Get histograms
    histSrc1= calcHistogram(src1)
    histSrc2= calcHistogram(src2)

    # Compare histograms using earth mover's
    histComp = calcEM(histSrc1,histSrc2,30,32)

    #Print solution
    print(histComp)

I tested code very similar to the above with Python 2.7 and Python(x,y). If you want to learn more about the Earth Mover's Distance and see an implementation using OpenCV and C++, you can read "Chapter 7: Histograms and Matching" of the book "Learning OpenCV" by Gary Bradski & Adrian Kaehler.

Andrew Draganov
Jaime Ivan Cervantes
2

Here is Python code for calculating the Earth Mover's Distance between two 1D distributions of equal length:

def emd(a, b):
    """Earth Mover's Distance between two equal-length 1D distributions."""
    s = len(a)
    earth = 0  # running surplus of mass carried over to the next bin
    su = []    # absolute amount of mass carried out of each bin
    for i in range(s):
        earth += a[i] - b[i]
        su.append(abs(earth))
    # Dividing by (s - 1) rescales the bin positions to the interval [0, 1]
    emd_output = sum(su) / (s - 1)
    print(emd_output)
    return emd_output
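As a sanity check on the cumulative-difference approach, the sketch below compares it with `scipy.stats.wasserstein_distance`, the function mentioned in the comments on the question. It assumes both inputs carry the same total mass (here each sums to 1) and places the bins at evenly spaced positions in [0, 1], which is what the division by `s - 1` amounts to. The name `emd_1d` and the sample histograms are illustrative only.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def emd_1d(a, b):
    """Cumulative-difference EMD for two equal-length 1D histograms."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    carried = np.cumsum(diff)  # mass carried out of each bin
    return np.abs(carried).sum() / (len(a) - 1)


# Illustrative histograms with equal total mass (both sum to 1).
a = [0.5, 0.5, 0.0, 0.0]
b = [0.0, 0.0, 0.5, 0.5]

# Bin positions spread evenly over [0, 1], matching the (s - 1) normalization.
positions = np.linspace(0.0, 1.0, len(a))

manual = emd_1d(a, b)
reference = wasserstein_distance(positions, positions, u_weights=a, v_weights=b)
print(manual, reference)
```

If the two inputs do not have the same total mass, the two values will generally differ, because `wasserstein_distance` normalizes its weights while the cumulative-difference loop does not.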
Vikas