PermutationTestDistanceBased#

class frouros.callbacks.batch.PermutationTestDistanceBased(num_permutations: int, total_num_permutations: int | None = None, num_jobs: int = -1, method: str = 'auto', random_state: int | None = None, verbose: bool = False, name: str | None = None)#

Permutation test callback class that can be applied to data_drift.batch.distance_based detectors.

Parameters:

num_permutations (int) – number of permutations to obtain the p-value
total_num_permutations (Optional[int]) – total number of possible permutations used to obtain the p-value, defaults to None. If None, it is set to the maximum number of permutations, i.e. the minimum between the number of all possible permutations and the global maximum number of permutations
num_jobs (int) – number of jobs, defaults to -1
method (str) – method to compute the p-value, defaults to “auto”. Available methods:
    “auto”: if the number of permutations is greater than the maximum number of permutations, the method is set to “approximate”. Otherwise, it is set to “exact”.
    “conservative”: p-value is computed as (number of permutation statistics greater than or equal to the observed statistic + 1) / (number of permutations + 1).
    “exact”: p-value is computed as the mean of the binomial cumulative distribution function, as stated in [phipson2010permutation].
    “approximate”: p-value is computed using the integral of the binomial cumulative distribution function, as stated in [phipson2010permutation].
    “estimate”: p-value is computed as the mean of the extreme statistic. With this method the p-value can be zero.
random_state (Optional[int]) – random state, defaults to None
verbose (bool) – verbose flag, defaults to False
name (Optional[str]) – name value, defaults to None. If None, the name will be set to PermutationTestDistanceBased.
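
The “conservative” and “estimate” methods described above can be sketched in a few lines of plain Python. This is only an illustration of the two formulas, not the library's implementation: the function names are hypothetical, and “mean of the extreme statistic” is assumed to mean the fraction of permutation statistics that are greater than or equal to the observed one.

```python
def conservative_p_value(observed, permuted):
    """(b + 1) / (m + 1), where b counts permuted statistics >= observed."""
    b = sum(stat >= observed for stat in permuted)
    return (b + 1) / (len(permuted) + 1)


def estimate_p_value(observed, permuted):
    """b / m: fraction of permuted statistics >= observed (can be zero)."""
    b = sum(stat >= observed for stat in permuted)
    return b / len(permuted)


# Hypothetical permutation statistics, all smaller than the observed value.
permuted = [0.10, 0.20, 0.30, 0.40]
print(conservative_p_value(0.50, permuted))  # 0.2
print(estimate_p_value(0.50, permuted))      # 0.0
```

The “estimate” result of zero here is the situation that the “conservative”, “exact” and “approximate” methods are meant to avoid.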

Note:

Callback logs are updated with the following variables:

observed_statistic: observed statistic obtained from the distance-based detector. This is the same distance value returned by the compare method
permutation_statistic: list of statistics obtained from the permutations
p_value: p-value obtained from the permutation test
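
As a rough picture of where these three values come from, the sketch below runs a generic two-sample permutation test in plain Python and returns a dictionary with the same keys. It is an assumption-laden illustration rather than the callback's actual code: `mean_gap` is a hypothetical stand-in for the detector's distance, `permutation_test` is a made-up helper, and the p-value uses the “conservative” formula.

```python
import random


def mean_gap(x, y):
    """Hypothetical stand-in distance: absolute difference of sample means."""
    return abs(sum(x) / len(x) - sum(y) / len(y))


def permutation_test(x_ref, x_test, num_permutations, seed=None):
    """Return a log-like dict for a two-sample permutation test."""
    rng = random.Random(seed)
    observed = mean_gap(x_ref, x_test)
    pooled = list(x_ref) + list(x_test)
    permuted = []
    for _ in range(num_permutations):
        rng.shuffle(pooled)  # randomly relabel the pooled samples
        permuted.append(mean_gap(pooled[: len(x_ref)], pooled[len(x_ref):]))
    # Count permutation statistics at least as extreme as the observed one.
    extreme = sum(stat >= observed for stat in permuted)
    return {
        "observed_statistic": observed,
        "permutation_statistic": permuted,
        "p_value": (extreme + 1) / (num_permutations + 1),
    }


log = permutation_test(
    [0.1, 0.3, 0.2, 0.4], [1.1, 1.3, 1.2, 1.4], num_permutations=200, seed=31
)
print(sorted(log))
```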

References:

[phipson2010permutation]

Phipson, Belinda, and Gordon K. Smyth. “Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn.” Statistical applications in genetics and molecular biology 9.1 (2010).

Example:

>>> from frouros.callbacks import PermutationTestDistanceBased
>>> from frouros.detectors.data_drift import MMD
>>> import numpy as np
>>> np.random.seed(seed=31)
>>> X = np.random.multivariate_normal(mean=[1, 1], cov=[[2, 0], [0, 2]], size=100)
>>> Y = np.random.multivariate_normal(mean=[0, 0], cov=[[2, 1], [1, 2]], size=100)
>>> detector = MMD(callbacks=PermutationTestDistanceBased(num_permutations=1000, random_state=31))
>>> _ = detector.fit(X=X)
>>> distance, callbacks_log = detector.compare(X=Y)
>>> distance
DistanceResult(distance=0.05643613752975596)
>>> callbacks_log["PermutationTestDistanceBased"]["p_value"]
0.0009985010823343311
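
A common next step is to compare the p-value against a significance level to decide whether to flag drift. The snippet below is a minimal sketch of that decision; `alpha` is a hypothetical threshold chosen for illustration and is not a parameter of the callback.

```python
alpha = 0.01  # hypothetical significance level, not part of the library
p_value = 0.0009985010823343311  # value reported in the example above
drift_detected = p_value <= alpha
print(drift_detected)  # True
```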

property num_permutations: int#

Number of permutations property.

Returns: number of permutations used to obtain the p-value
Return type: int

property total_num_permutations: int | None#

Total number of permutations property.

Returns: total number of permutations
Return type: Optional[int]

property num_jobs: int#

Number of jobs property.

Returns: number of jobs to use
Return type: int

property method: str#

Method to compute the p-value property.

Returns: method to compute the p-value
Return type: str

property name: str#

Name property.

Returns: name value
Return type: str

on_compare_start(X_ref: ndarray, X_test: ndarray) → None#

On compare start method.

Parameters:

X_ref (numpy.ndarray) – reference data
X_test (numpy.ndarray) – test data

on_fit_end(X: ndarray) → None#

On fit end method.

Parameters:

X (numpy.ndarray) – reference data

on_fit_start(X: ndarray) → None#

On fit start method.

Parameters:

X (numpy.ndarray) – reference data

set_detector(detector) → None#

Set detector method.

property verbose: bool#

Verbose flag property.

Returns: verbose flag
Return type: bool

on_compare_end(result: Any, X_ref: ndarray, X_test: ndarray) → None#

On compare end method.

Parameters:

result (Any) – result obtained from the compare method
X_ref (numpy.ndarray) – reference data
X_test (numpy.ndarray) – test data

reset() → None#

Reset method.
