I'm using the Python package dtaidistance for fast DTW computations. As explained in the documentation, one can use the following code:

from dtaidistance import dtw
import numpy as np
# One row per time series (np.array instead of the discouraged np.matrix)
series = np.array([
    [0.0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0.0, 1, 2, 0, 0, 0, 0, 0, 0],
    [0.0, 0, 1, 2, 1, 0, 0, 0, 0]])
ds = dtw.distance_matrix_fast(series)

to compute the pairwise DTW distances for a set of series. Each of my time series has a length of 3000, and each of my datasets contains roughly 3500 of them.
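To put these sizes into perspective, here is a back-of-the-envelope sketch of the workload. It assumes that only the upper triangle of the distance matrix is computed and that each DTW computation fills a full length-by-length cost matrix without a window constraint. The helper name dtw_workload is made up for illustration and is not part of dtaidistance.

```python
def dtw_workload(n_series, length):
    """Return (number of pairs, cost-matrix cells per pair, total cells)."""
    # Assumption: distances are symmetric, so only the upper triangle is needed
    pairs = n_series * (n_series - 1) // 2
    # Assumption: full cost matrix per pair, no window constraint
    cells_per_pair = length * length
    return pairs, cells_per_pair, pairs * cells_per_pair


pairs, cells, total = dtw_workload(3500, 3000)
print(f"{pairs:,} pairs x {cells:,} cells = {total:.2e} cell updates")
# prints: 6,123,250 pairs x 9,000,000 cells = 5.51e+13 cell updates
```

With tens of trillions of cell updates per dataset, the speed of the inner loop (interpreted Python versus compiled C) dominates the run time.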

Unfortunately, this function does not return any results in a reasonable amount of time. On my machine (128 GB RAM, 32 CPU cores, 4 Nvidia GPUs), I had to abort the computation after a day. Surprisingly, I didn't see any output at all, even though I set the show_progress parameter to True.

What am I doing wrong here? Thank you very much for your help.

Hagbard

1 Answer

It turned out that I had not built the package from source and therefore could not access the faster C-based implementation.
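A quick way to check for this situation before starting a long run might look like the sketch below. It assumes a recent dtaidistance release in which dtw.try_import_c() returns True when the compiled library can be loaded; older releases may only log a message and return None, in which case the result is reported as unconfirmed. The helper c_extension_status is a made-up name for illustration.

```python
import importlib.util


def c_extension_status():
    """Describe whether dtaidistance's compiled C code appears to be usable."""
    if importlib.util.find_spec("dtaidistance") is None:
        return "dtaidistance is not installed"
    try:
        from dtaidistance import dtw
        # Assumption: returns True when the C library loads (recent releases);
        # older releases may return None and only log a message.
        available = dtw.try_import_c()
    except Exception as exc:  # import or loading problems
        return f"could not check the C extension: {exc}"
    if available:
        return "C extension available"
    return "C extension not confirmed; the *_fast functions may be unavailable"


print(c_extension_status())
```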

The steps mentioned here solved the problem for me:

The library can also be compiled and/or installed directly from source.

  • Download the source from https://github.com/wannesm/dtaidistance
  • Compile the C extensions: python3 setup.py build_ext --inplace
  • Install into your site-package directory: python3 setup.py install
Hagbard

  • How much faster was the C-based implementation? – kevinbuchanjr Oct 29 '18 at 17:55
  • It was roughly 98 % faster. A thorough discussion of this can be found [here](https://stackoverflow.com/questions/52479046/different-results-and-performances-with-different-libraries). – Hagbard Oct 30 '18 at 08:31