dtaidistance.subsequence.dtw

(requires version 2.3.0 or higher)

DTW-based subsequence matching.

author:Wannes Meert
copyright:Copyright 2021 KU Leuven, DTAI Research Group.
license:Apache License, Version 2.0, see LICENSE for details.
class dtaidistance.subsequence.dtw.LCMatch(lc, row=None, col=None)

LocalConcurrences match

path
class dtaidistance.subsequence.dtw.LocalConcurrences(series1, series2=None, gamma=1, tau=0, delta=0, delta_factor=1, only_triu=False, penalty=None)

Version identification based on local concurrences.

Find recurring patterns across two time series. Used to identify whether one time series is a version of another. If the two time series are the some one, it can be used to find typical or frequent patterns in a time series.

Based on 7.3.2 Identification Procedure in Fundamentals of Music Processing, Meinard Müller, Springer, 2015.

Different from the original formulation, D_tau is introduced based on the given delta factor. This makes the penalty less sensitive to the cumulative effect of the paths in the self-similarity matrix S:

S_tau(n,m) = S(n,m) if S(n,m) >= tau (with tau >= 0)
delta if S(n,m) < tau (with delta <= 0)

And for the accumulated score matrix D:

D_tau(n,m) = max(0,
df * D_tau(n−1,m−1) + S_tau(n,m), df * D_tau(n−1,m) + S_tau(n,m), df * D_tau(n,m−1) + S_tau(n,m))

where df = 1 if S(n,m) >= tau and df=delta_factor (<=1) otherwise,

Parameters:
  • series1 – First time series.
  • series2 – Second time series. If empty, series1 is used and compared with itself.
  • gamma – Affinity transformation exp(-gamma*(s1[i] - s2[j])**2), should be >0
  • tau – threshold parameter, should be >= 0
  • delta – penalty parameter, should be <= 0
  • delta_factor – penalty factor parameter, should be <= 1
  • only_triu – Only consider upper triangular matrix in warping paths.
align()
Returns:
best_match()
best_path(row, col)
estimate_settings_from_std(series, tau_std=0.33)
Parameters:
  • series
  • tau_std – Set tau to differences larger than tau_std time standard deviation of the given series (default is 0.33, or reject differences that are larger than the deviation wrt to the mean of 75% of the values in the series, assuming a normal distribution).
Returns:

kbest_matches(k=1, minlen=2, buffer=0)

Yields the next best match. Stops at k matches (use None for all matches).

Parameters:
  • k – Number of matches to yield. None is all matches.
  • minlen – Consider only matches of length longer than minlen
  • buffer – Matches cannot be closer than buffer to each other.
Returns:

Yield an LCMatch object

reset()
wp
class dtaidistance.subsequence.dtw.SAMatch(idx, alignment)

SubsequenceAlignment match

path

Matched path in series

segment

Matched segment in series.

value
class dtaidistance.subsequence.dtw.SSMatch(idx, ss)
class dtaidistance.subsequence.dtw.SubsequenceAlignment(query, series, penalty=0.1, use_c=False)

Subsequence alignment using DTW. Find where the query occurs in the series.

Based on Fundamentals of Music Processing, Meinard Müller, Springer, 2015.

Example: query = np.array([1., 2, 0]) series = np.array([1., 0, 1, 2, 1, 0, 2, 0, 3, 0, 0]) sa = subsequence_search(query, series) mf = sa.matching_function() sa.kbest_matches(k=2)

Parameters:
  • query – Subsequence to search for
  • series – Long sequence in which to search
  • penalty – Penalty for non-diagonal matching
  • use_c – Use the C-based DTW function if available
align()
align_fast()
best_match()
get_match(idx)
kbest_matches(k=1, overlap=0)

Yields the next best match. Stops at k matches (use None for all matches).

Parameters:
  • k – Number of matches to yield. None is all matches.
  • overlap – Matches cannot overlap unless overlap > 0.
Returns:

Yield an SAMatch object

matching_function()

The matching score for each end-point of a possible match.

matching_function_bestpath(idx)

Indices in series for best path for match in matching function at idx.

Parameters:idx – Index in matching function
Returns:List of (row, col)
matching_function_endpoint(idx)

Index in series for end of match in matching function at idx.

Parameters:idx – Index in matching function
Returns:Index in series
matching_function_segment(idx)

Matched segment in series.

matching_function_startpoint(idx)

Index in series for start of match in matching function at idx.

Parameters:idx – Index in matching function
Returns:Index in series
reset()
warping_paths()

Get matrix with all warping paths.

If the aligmnent was computed using a compact, the paths are first copied into a full warping paths matrix.

Returns:Numpy matrix of size (len(query)+1) * (len(series)+1)
class dtaidistance.subsequence.dtw.SubsequenceSearch(query, s, dists_options=None)

Search the best matching (subsequence) time series compared to a given time series.

Parameters:
  • query – Time series to search for
  • s – Iterator over time series to perform search on. This can be for example windows over a long time series.
  • dists_options – Options for DTW
align(k=None)
best_match()
kbest_matches(k=1)
reset()
dtaidistance.subsequence.dtw.local_concurrences(series1, series2=None, gamma=1, tau=0, delta=0, delta_factor=1, estimate_settings=None, only_triu=False, penalty=None)

Local concurrences, see LocalConcurrences.

Parameters:
  • series1
  • series2
  • gamma – Affinity transformation exp(-gamma*(s1[i] - s2[j])**2)
  • tau – threshold parameter
  • delta – penalty parameter Should be negative. Added instead of the affinity score (if score below tau threshold parameter).
  • delta_factor – multiply cumulative score (e.g. by 0.5). This is useful to have the same impact at different locations in the warping paths matrix, which is cumulative (and thus typically large in one corner and small in the opposite corner).
  • estimate_settings – Estimate tau, delta, delta_factor from given series. Will be passed as tau_std to estimate_settings_from_std.
  • only_triu – Only compute the upper traingle matrix values. Useful to avoid redundant computations when series1 is equal to series2 (or equivalently if series2 is None).
  • penalty – Penalty that is added when dynamic programming is using moving vertically or horizontally through the matrix instead of diagonally. Used to prefer diagonal paths.
Returns:

dtaidistance.subsequence.dtw.subsequence_alignment(query, series, use_c=False)

See SubsequenceAligment.

Parameters:
  • query
  • series
Returns:

See SubsequenceSearch.

Parameters:
  • query
  • series
Returns: