dtaidistance.dtw

Dynamic Time Warping (DTW)

author:

Wannes Meert

copyright:

Copyright 2017-2024 KU Leuven, DTAI Research Group.

license:

Apache License, Version 2.0, see LICENSE for details.

dtaidistance.dtw.best_path(paths, row=None, col=None, use_max=False)

Compute the optimal path from the nxm warping paths matrix.

Parameters:
  • paths – Warping paths matrix

  • row – If given, start from this row (instead of lower-right corner)

  • col – If given, start from this column (instead of lower-right corner)

  • use_max – Find maximal path instead of minimal path

Returns:

Array of (row, col) representing the best path

dtaidistance.dtw.best_path2(paths)

Compute the optimal path from the nxm warping paths matrix.

dtaidistance.dtw.distance(s1, s2, only_ub=False, **kwargs)

Dynamic Time Warping.

This function keeps a compact matrix, not the full warping paths matrix.

Uses dynamic programming to compute:

wps[i, j] = (s1[i]-s2[j])**2 + min(
                wps[i-1, j  ] + penalty,  // vertical   / insertion / expansion
                wps[i  , j-1] + penalty,  // horizontal / deletion  / compression
                wps[i-1, j-1])            // diagonal   / match
dtw = sqrt(wps[-1, -1])
Parameters:
  • s1 – First sequence

  • s2 – Second sequence

  • only_ub – Only compute the upper bound (Euclidean).

  • kwargsDTWSettings arguments

Returns: DTW distance

dtaidistance.dtw.distance_fast(s1, s2, only_ub=False, **kwargs)

Same as distance() but with different defaults to choose the fast C-based version of the implementation (use_c = True).

Note: the series are expected to be arrays of the type double. Thus numpy.array([1,2,3], dtype=numpy.double) or array.array('d', [1,2,3])

dtaidistance.dtw.distance_matrix(s, block=None, compact=False, parallel=False, use_mp=False, show_progress=False, only_triu=False, **kwargs)

Distance matrix for all sequences in s.

Parameters:
  • s – Iterable of series

  • block – Only compute block in matrix. Expects tuple with begin and end, e.g. ((0,10),(20,25)) will only compare rows 0:10 with rows 20:25.

  • compact – Return the distance matrix as an array representing the upper triangular matrix.

  • parallel – Use parallel operations

  • use_mp – Force use Multiprocessing for parallel operations (not OpenMP)

  • show_progress – Show progress using the tqdm library. This is only supported for the pure Python version (thus not the C-based implementations).

  • only_triu – Only compute upper traingular matrix of warping paths. This is useful if s1 and s2 are the same series and the matrix would be mirrored around the diagonal.

  • kwargs – See arguments for DTWSettings

Returns:

The distance matrix or the condensed distance matrix if the compact argument is true

dtaidistance.dtw.distance_matrix_fast(s, max_dist=None, use_pruning=False, max_length_diff=None, window=None, max_step=None, penalty=None, psi=None, block=None, compact=False, parallel=True, use_mp=False, only_triu=False, inner_dist='squared euclidean')

Same as distance_matrix() but with different defaults to choose the fast parallized C version (use_c = True and parallel = True).

This method uses the C-compiled version of the DTW algorithm and uses parallelization. By default, this is the OMP C parallelization. If the OMP functionality is not available the parallelization is changed to use Python’s multiprocessing library.

dtaidistance.dtw.distances_array_to_matrix(dists, nb_series, block=None, only_triu=False)

Transform a condensed distances array to a full matrix representation.

The upper triangular matrix will contain all the distances.

dtaidistance.dtw.lb_keogh(s1, s2, **kwargs)

Lowerbound LB_KEOGH

dtaidistance.dtw.ub_euclidean(s1, s2, inner_dist='squared euclidean')

See dtaidistance.ed.euclidean_distance()

dtaidistance.dtw.warp(from_s, to_s, path=None, **kwargs)

Warp a function to optimally match a second function.

Parameters:
  • from_s – First sequence

  • to_s – Second sequence

  • path – (Optional) Path to use wrap the ‘from_s’ sequence to the ‘to_s’ sequence If provided, this function will use it. If not provided, this function will calculate it using the warping_path function

  • kwargs – Same options as warping_paths().

dtaidistance.dtw.warping_amount(path)

Returns the number of compressions and expansions performed to obtain the best path. Can be used as a metric for the amount of warping.

Parameters:

path – path to be tested

:returns number of compressions or expansions

dtaidistance.dtw.warping_path(from_s, to_s, include_distance=False, use_ndim=False, **kwargs)

Compute warping path between two sequences.

dtaidistance.dtw.warping_path_fast(from_s, to_s, include_distance=False, **kwargs)

Compute warping path between two sequences.

dtaidistance.dtw.warping_path_penalty(s1, s2, penalty_post=0, **kwargs)

Dynamic Time Warping with an alternative penalty.

This function supports two different penalties. The traditional DTW penalty penalty is used in the matrix during calculation of the warping path (see distance()).

The second penalty penalty_post measures the amount of warping. This penalty doesn’t affect the warping path and is added to the DTW distance after the warping for every compression or expansion.

Same options as warping_paths()

Parameters:
  • s1 – First sequence

  • s2 – Second sequence

  • penalty_post – Penalty to be added after path calculation, for compression/extension

:returns DTW distance, the best path, DTW distance between 2 path elements, DTW matrix

dtaidistance.dtw.warping_path_prob(from_s, to_s, avg, include_distance=False, use_c=True, **kwargs)

Compute warping path between two sequences.

dtaidistance.dtw.warping_paths(s1, s2, psi_neg=True, **kwargs)

Dynamic Time Warping.

The full matrix of all warping paths (or accumulated cost matrix) is built.

Parameters:
  • s1 – First sequence

  • s2 – Second sequence

  • psi_neg – Replace values that should be skipped because of psi-relaxation with -1.

  • kwargs – See arguments for DTWSettings

Returns:

(DTW distance, DTW matrix)

dtaidistance.dtw.warping_paths_affinity(s1, s2, window=None, only_triu=False, penalty=None, psi=None, psi_neg=True, gamma=1, tau=0, delta=0, delta_factor=1, use_c=False)

Dynamic Time Warping paths using an affinity/similarity matrix instead of a distance matrix.

The full matrix of all warping paths (or accumulated cost matrix) is built.

Parameters:
  • s1 – First sequence

  • s2 – Second sequence

  • window – see distance()

  • only_triu – Only compute upper traingular matrix of warping paths. This is useful if s1 and s2 are the same series and the matrix would be mirrored around the diagonal.

  • penalty – see distance()

  • psi – see distance()

  • psi_neg – Replace values that should be skipped because of psi-relaxation with -1.

  • gamma

  • tau

  • delta

  • delta_factor

  • use_c

Returns:

(DTW distance, DTW matrix)

dtaidistance.dtw.warping_paths_affinity_fast(s1, s2, window=None, only_triu=False, penalty=None, psi=None, psi_neg=True, gamma=1, tau=0, delta=0, delta_factor=1, compact=False, use_ndim=False)

Fast C version of warping_paths().

Parameters:
  • s1

  • s2

  • window

  • only_triu

  • penalty

  • psi

  • psi_neg

  • gamma

  • tau

  • delta

  • delta_factor

  • compact – Return a compact warping paths matrix. Size is ((l1 + 1), min(l2 + 1, abs(l1 - l2) + 2*window + 1)). This option is meant for internal use. For more details, see the C code.

  • use_ndim

dtaidistance.dtw.warping_paths_fast(s1, s2, psi_neg=True, compact=False, **kwargs)

Fast C version of warping_paths().

Parameters:
  • s1 – See warping_paths()

  • s2 – See warping_paths()

  • psi_neg – See warping_paths()

  • compact – Return a compact warping paths matrix. Size is ((l1 + 1), min(l2 + 1, abs(l1 - l2) + 2*window + 1)). This option is meant for internal use. For more details, see the C code.

  • kwargs – See warping_paths()

class dtaidistance.dtw.DTWSettings(window=None, use_pruning=False, max_dist=None, max_step=None, max_length_diff=None, penalty=None, psi=None, inner_dist='squared euclidean', use_ndim=False, use_c=False)

Settings for Dynamic Time Warping distance methods.

Parameters:
  • window – Only allow for maximal shifts from the two diagonals smaller than this number. The maximally allowed warping, thus difference between indices i in series 1 and j in series 2, is thus |i-j| < 2*window + |len(s1) - len(s2)|. It includes the diagonal, meaning that Euclidean distance is obtained by setting window=1. If the two series are of equal length, this means that the band you see appearing on the cumulative cost matrix is of width 2*window-1. In other definitions of DTW this is referred to as the window.

  • max_dist – Stop if the returned values will be larger than this value

  • max_step – Do not allow steps larger than this value. If the difference between two values in the two series is larger than this, thus if |s1[i]-s2[j]| > max_step, replace that value with infinity.

  • max_length_diff – Return infinity if length of two series is larger

  • penalty – Penalty to add if compression or expansion is applied

  • psi – Psi relaxation parameter (ignore start and end of matching). If psi is a single integer, it is used for both start and end relaxations of both series. If psi is a 4-tuple, it is used as the psi-relaxation for (begin series1, end series1, begin series2, end series2). Useful for cyclical series.

  • use_pruning – Prune values based on Euclidean distance. This is the same as passing ub_euclidean() to max_dist

  • inner_dist – Distance between two points in the time series. One of ‘squared euclidean’ (default), ‘euclidean’. When using the pure Python implementation (thus use_c=False) then the argument can also be an object that has as callable arguments ‘inner_dist’, ‘result’, and ‘inner_val’. The ‘inner_dist’ function computes the distance between two points (e.g., squared euclidean) and ‘result’ is the function to apply to the final distance (e.g., sqrt when using squared euclidean). You can also inherit from the ‘innerdistance.CustomInnerDist’ class.

  • use_ndim – Use n-dimensional (aka multivariate) series instead of 1-dimensional series.

  • use_c – Use the C variant if available.