dtaidistance.dtw

Dynamic Time Warping (DTW)

author:: Wannes Meert
copyright:: Copyright 2017-2024 KU Leuven, DTAI Research Group.
license:: Apache License, Version 2.0, see LICENSE for details.

dtaidistance.dtw.best_path(paths, row=None, col=None, use_max=False)

Compute the optimal path from the nxm warping paths matrix.

Parameters:

paths – Warping paths matrix
row – If given, start from this row (instead of lower-right corner)
col – If given, start from this column (instead of lower-right corner)
use_max – Find maximal path instead of minimal path

Returns:

Array of (row, col) representing the best path

dtaidistance.dtw.best_path2(paths): Compute the optimal path from the nxm warping paths matrix.

dtaidistance.dtw.distance(s1, s2, only_ub=False, **kwargs)

Dynamic Time Warping.

This function keeps a compact matrix, not the full warping paths matrix.

Uses dynamic programming to compute:

wps[i, j] = (s1[i]-s2[j])**2 + min(
                wps[i-1, j  ] + penalty,  // vertical   / insertion / expansion
                wps[i  , j-1] + penalty,  // horizontal / deletion  / compression
                wps[i-1, j-1])            // diagonal   / match
dtw = sqrt(wps[-1, -1])

Parameters:

s1 – First sequence
s2 – Second sequence
only_ub – Only compute the upper bound (Euclidean).
kwargs – DTWSettings arguments

Returns: DTW distance

dtaidistance.dtw.distance_fast(s1, s2, only_ub=False, **kwargs)

Same as distance() but with different defaults to choose the fast C-based version of the implementation (use_c = True).

Note: the series are expected to be arrays of the type double. Thus numpy.array([1,2,3], dtype=numpy.double) or array.array('d', [1,2,3])

dtaidistance.dtw.distance_matrix(s, block=None, compact=False, parallel=False, use_mp=False, show_progress=False, only_triu=False, **kwargs)

Distance matrix for all sequences in s.

Parameters:

s – Iterable of series
block – Only compute block in matrix. Expects tuple with begin and end, e.g. ((0,10),(20,25)) will only compare rows 0:10 with rows 20:25.
compact – Return the distance matrix as an array representing the upper triangular matrix.
parallel – Use parallel operations
use_mp – Force use Multiprocessing for parallel operations (not OpenMP)
show_progress – Show progress using the tqdm library. This is only supported for the pure Python version (thus not the C-based implementations).
only_triu – Only compute upper traingular matrix of warping paths. This is useful if s1 and s2 are the same series and the matrix would be mirrored around the diagonal.
kwargs – See arguments for DTWSettings

Returns:

The distance matrix or the condensed distance matrix if the compact argument is true

dtaidistance.dtw.distance_matrix_fast(s, max_dist=None, use_pruning=False, max_length_diff=None, window=None, max_step=None, penalty=None, psi=None, block=None, compact=False, parallel=True, use_mp=False, only_triu=False, inner_dist='squared euclidean')

Same as distance_matrix() but with different defaults to choose the fast parallized C version (use_c = True and parallel = True).

This method uses the C-compiled version of the DTW algorithm and uses parallelization. By default, this is the OMP C parallelization. If the OMP functionality is not available the parallelization is changed to use Python’s multiprocessing library.

dtaidistance.dtw.distances_array_to_matrix(dists, nb_series, block=None, only_triu=False)

Transform a condensed distances array to a full matrix representation.

The upper triangular matrix will contain all the distances.

dtaidistance.dtw.lb_keogh(s1, s2, **kwargs): Lowerbound LB_KEOGH

dtaidistance.dtw.ub_euclidean(s1, s2, inner_dist='squared euclidean'): See dtaidistance.ed.euclidean_distance()

dtaidistance.dtw.warp(from_s, to_s, path=None, **kwargs)

Warp a function to optimally match a second function.

Parameters:

from_s – First sequence
to_s – Second sequence
path – (Optional) Path to use wrap the ‘from_s’ sequence to the ‘to_s’ sequence If provided, this function will use it. If not provided, this function will calculate it using the warping_path function
kwargs – Same options as warping_paths().

dtaidistance.dtw.warping_amount(path)

Returns the number of compressions and expansions performed to obtain the best path. Can be used as a metric for the amount of warping.

Parameters:: path – path to be tested

:returns number of compressions or expansions

dtaidistance.dtw.warping_path(from_s, to_s, include_distance=False, use_ndim=False, **kwargs): Compute warping path between two sequences.

dtaidistance.dtw.warping_path_fast(from_s, to_s, include_distance=False, **kwargs): Compute warping path between two sequences.

dtaidistance.dtw.warping_path_penalty(s1, s2, penalty_post=0, **kwargs)

Dynamic Time Warping with an alternative penalty.

This function supports two different penalties. The traditional DTW penalty penalty is used in the matrix during calculation of the warping path (see distance()).

The second penalty penalty_post measures the amount of warping. This penalty doesn’t affect the warping path and is added to the DTW distance after the warping for every compression or expansion.

Same options as warping_paths()

Parameters:

s1 – First sequence
s2 – Second sequence
penalty_post – Penalty to be added after path calculation, for compression/extension

:returns DTW distance, the best path, DTW distance between 2 path elements, DTW matrix

dtaidistance.dtw.warping_path_prob(from_s, to_s, avg, include_distance=False, use_c=True, **kwargs): Compute warping path between two sequences.

dtaidistance.dtw.warping_paths(s1, s2, psi_neg=True, **kwargs)

Dynamic Time Warping.

The full matrix of all warping paths (or accumulated cost matrix) is built.

Parameters:

s1 – First sequence
s2 – Second sequence
psi_neg – Replace values that should be skipped because of psi-relaxation with -1.
kwargs – See arguments for DTWSettings

Returns:

(DTW distance, DTW matrix)

dtaidistance.dtw.warping_paths_affinity(s1, s2, window=None, only_triu=False, penalty=None, psi=None, psi_neg=True, gamma=1, tau=0, delta=0, delta_factor=1, use_c=False)

Dynamic Time Warping paths using an affinity/similarity matrix instead of a distance matrix.

The full matrix of all warping paths (or accumulated cost matrix) is built.

Parameters:

s1 – First sequence
s2 – Second sequence
window – see distance()
only_triu – Only compute upper traingular matrix of warping paths. This is useful if s1 and s2 are the same series and the matrix would be mirrored around the diagonal.
penalty – see distance()
psi – see distance()
psi_neg – Replace values that should be skipped because of psi-relaxation with -1.
gamma
tau
delta
delta_factor
use_c

Returns:

(DTW distance, DTW matrix)

dtaidistance.dtw.warping_paths_affinity_fast(s1, s2, window=None, only_triu=False, penalty=None, psi=None, psi_neg=True, gamma=1, tau=0, delta=0, delta_factor=1, compact=False, use_ndim=False)

Fast C version of warping_paths().

Parameters:

s1
s2
window
only_triu
penalty
psi
psi_neg
gamma
tau
delta
delta_factor
compact – Return a compact warping paths matrix. Size is ((l1 + 1), min(l2 + 1, abs(l1 - l2) + 2*window + 1)). This option is meant for internal use. For more details, see the C code.
use_ndim

dtaidistance.dtw.warping_paths_fast(s1, s2, psi_neg=True, compact=False, **kwargs)

Fast C version of warping_paths().

Parameters:

s1 – See warping_paths()
s2 – See warping_paths()
psi_neg – See warping_paths()
compact – Return a compact warping paths matrix. Size is ((l1 + 1), min(l2 + 1, abs(l1 - l2) + 2*window + 1)). This option is meant for internal use. For more details, see the C code.
kwargs – See warping_paths()

class dtaidistance.dtw.DTWSettings(window=None, use_pruning=False, max_dist=None, max_step=None, max_length_diff=None, penalty=None, psi=None, inner_dist='squared euclidean', use_ndim=False, use_c=False)

Settings for Dynamic Time Warping distance methods.

Parameters:

window – Only allow for maximal shifts from the two diagonals smaller than this number. The maximally allowed warping, thus difference between indices i in series 1 and j in series 2, is thus |i-j| < 2*window + |len(s1) - len(s2)|. It includes the diagonal, meaning that Euclidean distance is obtained by setting window=1. If the two series are of equal length, this means that the band you see appearing on the cumulative cost matrix is of width 2*window-1. In other definitions of DTW this is referred to as the window.
max_dist – Stop if the returned values will be larger than this value
max_step – Do not allow steps larger than this value. If the difference between two values in the two series is larger than this, thus if |s1[i]-s2[j]| > max_step, replace that value with infinity.
max_length_diff – Return infinity if length of two series is larger
penalty – Penalty to add if compression or expansion is applied
psi – Psi relaxation parameter (ignore start and end of matching). If psi is a single integer, it is used for both start and end relaxations of both series. If psi is a 4-tuple, it is used as the psi-relaxation for (begin series1, end series1, begin series2, end series2). Useful for cyclical series.
use_pruning – Prune values based on Euclidean distance. This is the same as passing ub_euclidean() to max_dist
inner_dist – Distance between two points in the time series. One of ‘squared euclidean’ (default), ‘euclidean’. When using the pure Python implementation (thus use_c=False) then the argument can also be an object that has as callable arguments ‘inner_dist’, ‘result’, and ‘inner_val’. The ‘inner_dist’ function computes the distance between two points (e.g., squared euclidean) and ‘result’ is the function to apply to the final distance (e.g., sqrt when using squared euclidean). You can also inherit from the ‘innerdistance.CustomInnerDist’ class.
use_ndim – Use n-dimensional (aka multivariate) series instead of 1-dimensional series.
use_c – Use the C variant if available.