
I want to calculate Dynamic Time Warping (DTW) distances in a dataframe. The result must be a new dataframe (a distance matrix) containing the pairwise DTW distances between all rows.

For Euclidean Distance I use the following code:

from scipy.spatial.distance import pdist, squareform
euclidean_dist = squareform(pdist(sample_dataframe,'euclidean'))

I need similar code for DTW.

Thanks in advance.

Gonçalo Peres
venom
  • This question is not really suited for Stack Overflow. Maybe you should try to implement your own algorithm (maybe following [this](http://nipunbatra.github.io/2014/07/dtw/) blogpost) and post it for feedback on [Code Review](http://codereview.stackexchange.com/). – AlexV Dec 28 '15 at 22:57
  • http://stackoverflow.com/q/5695388/1461210 – ali_m Dec 29 '15 at 22:10

1 Answer


There are several ways to do this. Two options are given below.

In case one wants to know the difference between the euclidean distance and DTW, this is a good resource.


Option 1

Using fastdtw.

Install it with

pip install fastdtw

Then use it as follows:

import numpy as np
from scipy.spatial.distance import euclidean
from fastdtw import fastdtw

x = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
y = np.array([[2, 2], [3, 3], [4, 4]])
distance, path = fastdtw(x, y, dist=euclidean)
print(distance)

Option 2 (Source)

import numpy as np

def dtw(s, t):
    # s and t are 1-D sequences of numbers
    n, m = len(s), len(t)
    # start from infinity everywhere except the origin
    dtw_matrix = np.full((n + 1, m + 1), np.inf)
    dtw_matrix[0, 0] = 0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # take last min from a square box
            last_min = np.min([dtw_matrix[i - 1, j], dtw_matrix[i, j - 1], dtw_matrix[i - 1, j - 1]])
            dtw_matrix[i, j] = cost + last_min
    # the DTW distance itself is the bottom-right entry, dtw_matrix[n, m]
    return dtw_matrix

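To get the distance matrix the question asks for, the same recurrence can be applied to every pair of rows. The following is only a minimal sketch: the names dtw_distance, pairwise_dtw and sample are made up for illustration, and it assumes every row is a 1-D numeric series of the same length with no missing values.

```python
import numpy as np


def dtw_distance(s, t):
    """DTW distance between two 1-D numeric sequences (absolute-difference cost)."""
    n, m = len(s), len(t)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated-cost matrix
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


def pairwise_dtw(rows):
    """Symmetric matrix of DTW distances between all rows of a 2-D array-like."""
    data = np.asarray(rows, dtype=float)
    k = len(data)
    dist = np.zeros((k, k))
    for a in range(k):
        # the matrix is symmetric, so only the upper triangle is computed
        for b in range(a + 1, k):
            dist[a, b] = dist[b, a] = dtw_distance(data[a], data[b])
    return dist


# hypothetical data standing in for sample_dataframe.values
sample = np.array([[1, 2, 3, 4],
                   [1, 1, 2, 3],
                   [4, 3, 2, 1]])
dtw_dist = pairwise_dtw(sample)
print(dtw_dist)
```

With a dataframe, passing sample_dataframe.values to pairwise_dtw and wrapping the result in a new DataFrame indexed by sample_dataframe.index should give the requested layout. Since pdist also accepts a callable metric, squareform(pdist(sample_dataframe.values, dtw_distance)) is likely an equivalent one-liner in the style of the Euclidean example. Both variants run in pure Python loops, so they will be slow on large inputs.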

Gonçalo Peres