dtaidistance.subsequence.subsequencealignment

(requires version 2.3.0 or higher)

DTW-based subsequence matching.

author:

Wannes Meert

copyright:

Copyright 2021-2023 KU Leuven, DTAI Research Group.

license:

Apache License, Version 2.0, see LICENSE for details.

class dtaidistance.subsequence.subsequencealignment.SAMatch(idx, alignment)

SubsequenceAlignment match

property distance

DTW distance of match.

This value is dependent on the length of the query. Use the value property when comparing queries of different lengths.

property path

Matched path in series.

property segment

Matched segment in series.

property value

Normalized DTW distance of match.

Normalization is the DTW distance divided by the query length.
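
The sketch below illustrates why value is the better quantity to compare across queries of different lengths. It applies the normalization described above to made-up distances. The function name and all numbers are hypothetical and not part of dtaidistance.

```python
def normalized_value(distance, query):
    """DTW distance divided by the query length (the normalization described above)."""
    return distance / len(query)


# Made-up numbers, for illustration only.
short_query = [1.0, 2.0, 0.0]                            # length 3
long_query = [1.0, 2.0, 0.0, 1.0, 2.0, 0.0, 1.0, 2.0]    # length 8

short_value = normalized_value(3.0, short_query)   # 3.0 / 3
long_value = normalized_value(4.0, long_query)     # 4.0 / 8
```

Here the longer query has the larger raw distance but the smaller normalized value, so comparing raw distances alone would rank the two matches the other way around.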

class dtaidistance.subsequence.subsequencealignment.SubsequenceAlignment(query, series, penalty=0.1, use_c=False)

Subsequence alignment using DTW. Find where the query occurs in the series.

Based on Fundamentals of Music Processing, Meinard Müller, Springer, 2015.

Example:

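import numpy as np
from dtaidistance.subsequence.subsequencealignment import subsequence_alignment

query = np.array([1., 2, 0])
series = np.array([1., 0, 1, 2, 1, 0, 2, 0, 3, 0, 0])
sa = subsequence_alignment(query, series)
mf = sa.matching_function()  # matching score for each possible end-point in series
for match in sa.kbest_matches(k=2):  # generator: iterate to obtain SAMatch objects
    print(match.segment, match.value)

Parameters: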
  • query – Subsequence to search for

  • series – Long sequence in which to search

  • penalty – Penalty for non-diagonal matching

  • use_c – Use the C-based DTW function if available

align()
align_fast()
best_match()
best_match_fast()
best_matches(max_rangefactor=2, overlap=0, minlength=2, maxlength=None)

Yields the next best match. Stops when the value of the current match is larger than max_rangefactor times the value of the first match.

Parameters:
  • max_rangefactor – The range between the first (best) match and the last match can be at most a factor of max_rangefactor. For example, if the first match has value v_f, then the last match has a value v_l < v_f * max_rangefactor.

  • overlap – Matches cannot overlap unless overlap > 0.

  • minlength – Minimal length of the matched sequence.

  • maxlength – Maximal length of the matched sequence.
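
The stopping rule controlled by max_rangefactor can be sketched in a few lines. This is a simplified illustration that works on a plain list of match values, not the library's implementation. The function name take_within_range is hypothetical.

```python
def take_within_range(values, max_rangefactor=2):
    """Yield values in increasing order until one exceeds
    max_rangefactor times the first (best) value."""
    ordered = iter(sorted(values))
    first = next(ordered, None)
    if first is None:
        return  # no candidates at all
    yield first
    for value in ordered:
        if value > first * max_rangefactor:
            break  # outside the allowed range: stop yielding
        yield value


kept = list(take_within_range([0.5, 0.8, 1.2, 0.9], max_rangefactor=2))
```

With a best value of 0.5 and a factor of 2, only candidates up to 1.0 are kept, so 1.2 is dropped.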

Returns:

best_matches_fast(*args, **kwargs)

See best_matches().

best_matches_knee(alpha=0.3, overlap=0, minlength=2, maxlength=None)

Yields the next best match. Stops when a knee is detected in the curve of match values, which is tracked with an exponentially moving average (see alpha).

Parameters:
  • alpha – The factor for the exponentially moving average that keeps track of the curve to detect the knee. The higher, the more sensitive to recent values (and differences).

  • overlap – Matches cannot overlap unless overlap > 0.

  • minlength – Minimal length of the matched sequence.

  • maxlength – Maximal length of the matched sequence.
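
To give an intuition for knee-based stopping, the sketch below tracks an exponentially moving average of the steps between successive match values and stops when a step jumps well above that average. This is one plausible scheme under stated assumptions, not the rule dtaidistance uses. The function name take_until_knee and the factor threshold are hypothetical.

```python
def take_until_knee(values, alpha=0.3, factor=3.0):
    """Yield values in increasing order until the step to the next value
    is more than `factor` times the moving average of earlier steps.

    `factor` is an assumed threshold for this illustration only.
    """
    ema = None    # exponentially moving average of the steps seen so far
    prev = None
    for value in sorted(values):
        if prev is not None:
            step = value - prev
            if ema is not None and step > factor * ema:
                break  # assumed knee criterion: a sudden large jump
            # Higher alpha gives more weight to the most recent step.
            ema = step if ema is None else alpha * step + (1 - alpha) * ema
        yield value
        prev = value


kept = list(take_until_knee([10, 11, 12, 13, 30, 31]))
```

The first four values rise in steps of 1, so the jump of 17 to the next value is treated as the knee and iteration stops there.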

Returns:

best_matches_knee_fast(*args, **kwargs)

See best_matches_knee().

get_match(idx)
kbest_matches(k=1, overlap=0, minlength=2, maxlength=None)

Yields the next best match. Stops at k matches (use None for all matches).

Parameters:
  • k – Number of matches to yield. None is all matches.

  • overlap – Matches cannot overlap unless overlap > 0.

  • minlength – Minimal length of the matched sequence. If k is set to None, matches with one value can occur if minlength is set to 1.

  • maxlength – Maximal length of the matched sequence.

Returns:

Yields SAMatch objects.
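
The interaction between k and overlap can be sketched as a greedy selection over candidate segments. This is a simplified illustration, not the library's implementation. The function names and the (value, start, end) tuples are hypothetical, and treating overlap as a number of shared samples is an assumption.

```python
def shared_samples(a_start, a_end, b_start, b_end):
    """Number of indices two inclusive segments have in common (<= 0 if none)."""
    return min(a_end, b_end) - max(a_start, b_start) + 1


def kbest_nonoverlapping(candidates, k=1, overlap=0):
    """Greedily pick the k lowest-valued segments that share at most
    `overlap` samples with any segment already chosen.

    candidates: iterable of (value, start, end) with inclusive end.
    k=None means no limit on the number of matches.
    """
    chosen = []
    for value, start, end in sorted(candidates):
        if k is not None and len(chosen) >= k:
            break
        if all(shared_samples(start, end, s, e) <= overlap for _, s, e in chosen):
            chosen.append((value, start, end))
    return chosen


candidates = [(0.5, 2, 4), (0.6, 3, 5), (0.7, 6, 7), (0.9, 0, 1)]
top2 = kbest_nonoverlapping(candidates, k=2)
top2_overlapping = kbest_nonoverlapping(candidates, k=2, overlap=2)
everything = kbest_nonoverlapping(candidates, k=None)
```

With overlap=0 the second-best candidate is skipped because it shares two samples with the best one. Allowing two shared samples lets it through.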

kbest_matches_fast(*args, **kwargs)

See kbest_matches().

matching_function()

The matching score for each end-point of a possible match.
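
For intuition, a matching function in the textbook subsequence DTW formulation is the last row of the accumulated cost matrix, where a match may start at any position in the series. The sketch below is a minimal pure-Python version of that idea. It uses the absolute difference as local cost and omits the penalty, which are simplifying assumptions, so its values are not expected to equal those returned by matching_function().

```python
def matching_function_sketch(query, series):
    """Accumulated cost of the best alignment of the whole query
    ending at each position of the series (textbook subsequence DTW)."""
    n, m = len(query), len(series)
    D = [[0] * m for _ in range(n)]
    for j in range(m):
        # Free start: the first query item may align with any series item.
        D[0][j] = abs(query[0] - series[j])
    for i in range(1, n):
        # First column: only vertical moves are possible.
        D[i][0] = D[i - 1][0] + abs(query[i] - series[0])
        for j in range(1, m):
            D[i][j] = abs(query[i] - series[j]) + min(
                D[i - 1][j - 1],  # diagonal move
                D[i - 1][j],      # advance the query only
                D[i][j - 1],      # advance the series only
            )
    return D[-1]


query = [1, 2, 0]
series = [1, 0, 1, 2, 1, 0, 2, 0, 3, 0, 0]
mf = matching_function_sketch(query, series)
```

Low values in mf mark series positions where a good match of the full query ends.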

matching_function_bestpath(idx)

Indices in series for best path for match in matching function at idx.

Parameters:

idx – Index in matching function

Returns:

List of (row, col)
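
The relation between an end-point, its best path, and its start-point can be sketched by backtracking through an accumulated cost matrix. This is a simplified illustration with absolute-difference cost and no penalty, not the library's implementation. The function names are hypothetical, and the (row, col) indices here are zero-based positions in query and series, which may differ from the library's convention.

```python
def accumulated_cost(query, series):
    """Accumulated cost matrix for subsequence DTW with a free start."""
    n, m = len(query), len(series)
    D = [[0] * m for _ in range(n)]
    for j in range(m):
        D[0][j] = abs(query[0] - series[j])
    for i in range(1, n):
        D[i][0] = D[i - 1][0] + abs(query[i] - series[0])
        for j in range(1, m):
            D[i][j] = abs(query[i] - series[j]) + min(
                D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D


def best_path(D, end):
    """Backtrack from the last row at column `end` up to the first row."""
    i, j = len(D) - 1, end
    path = [(i, j)]
    while i > 0:
        if j == 0:
            i -= 1  # only a vertical move is possible in the first column
        else:
            # Pick the cheapest predecessor; ties prefer the diagonal move.
            _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                          (D[i - 1][j], i - 1, j),
                          (D[i][j - 1], i, j - 1))
        path.append((i, j))
    return path[::-1]


D = accumulated_cost([1, 2, 0], [3, 1, 2, 0, 3])
end = min(range(len(D[-1])), key=D[-1].__getitem__)  # best end-point
path = best_path(D, end)
start = path[0][1]  # series index where the match begins
```

In this example the query occurs exactly at series positions 1 to 3, so the path is a pure diagonal from the start-point to the end-point.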

matching_function_endpoint(idx)

Index in series for end of match in matching function at idx.

Parameters:

idx – Index in matching function

Returns:

Index in series

matching_function_segment(idx)

Matched segment in series.

matching_function_startpoint(idx)

Index in series for start of match in matching function at idx.

Parameters:

idx – Index in matching function

Returns:

Index in series

reset()
warping_paths()

Get matrix with all warping paths.

If the alignment was computed using a compact representation, the paths are first copied into a full warping paths matrix.

Returns:

NumPy matrix of shape (len(query)+1, len(series)+1)

dtaidistance.subsequence.subsequencealignment.subsequence_alignment(query, series, penalty=0.1, use_c=False)

See SubsequenceAlignment.

Parameters:
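  • query – Subsequence to search for

  • series – Long sequence in which to search

  • penalty – Penalty for non-diagonal matching

  • use_c – Use the C-based DTW function if available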

Returns: