suod.utils package#

Submodules#

suod.utils.utility module#

suod.utils.utility.build_codes(base_estimators, clf_list, ng_clf_list, flag_global)#

Core function for building the codes that decide whether to enable random projection and supervised approximation.

Parameters#

base_estimators: list, length must be greater than 1

A list of base estimators. Certain methods must be present, e.g., fit and predict.

clf_list: list

The list of outlier detection models for which a certain function should be used. The detector names should be consistent with PyOD.

ng_clf_list: list

The list of outlier detection models for which a certain function should NOT be used. The detector names should be consistent with PyOD.

flag_global: bool

The global flag that overrides the codes built from clf_list and ng_clf_list.
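The selection logic described by these parameters can be pictured with a small stand-alone sketch. It is not SUOD's implementation: `build_codes_sketch`, the stand-in classes, the 0/1 list it returns, the warning for unlisted detectors, and the reading of `flag_global=False` as "disable the function for every detector" are all assumptions made for illustration.

```python
import warnings


def build_codes_sketch(base_estimators, clf_list, ng_clf_list, flag_global):
    """Hypothetical illustration only; not the SUOD implementation.

    Returns one code per estimator: 1 = use the function, 0 = do not.
    """
    codes = []
    for est in base_estimators:
        # Detectors are matched by class name, as PyOD names them.
        name = type(est).__name__
        if name in clf_list:
            codes.append(1)
        elif name in ng_clf_list:
            codes.append(0)
        else:
            # Assumption: unlisted detectors are left disabled, with a warning.
            warnings.warn(f"{name} is in neither list; leaving it disabled.")
            codes.append(0)

    # Assumption: a False global flag switches the function off everywhere.
    if not flag_global:
        codes = [0] * len(codes)
    return codes


# Stand-in classes that only borrow the names of two PyOD detectors.
class LOF:
    pass


class HBOS:
    pass


estimators = [LOF(), HBOS()]
codes_on = build_codes_sketch(estimators, ["LOF"], ["HBOS"], flag_global=True)
codes_off = build_codes_sketch(estimators, ["LOF"], ["HBOS"], flag_global=False)
```

With the global flag on, only the detector named in the first list receives a code of 1. With it off, every code is 0 regardless of the lists.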

Returns#

suod.utils.utility.get_estimators(contamination=0.1)#

Internal method to create a list of 600 base outlier detectors.

Parameters#

contamination: float in (0., 0.5), optional (default=0.1)

The amount of contamination of the data set, i.e. the proportion of outliers in the data set. Used when fitting to define the threshold on the decision function.

Returns#

base_detectors: list

A list of initialized random base outlier detectors.
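To make the role of `contamination` concrete, the sketch below shows one common way a detector can turn it into a threshold on training scores: take the score at the `100 * (1 - contamination)` percentile and flag everything above it. This mirrors the percentile convention used by PyOD-style detectors as far as I am aware, but it is an assumption about the base detectors returned here, and `threshold_from_contamination` is a hypothetical helper, not part of SUOD.

```python
def threshold_from_contamination(decision_scores, contamination=0.1):
    """Hypothetical helper: percentile-based threshold on training scores."""
    if not 0.0 < contamination < 0.5:
        raise ValueError("contamination must lie in (0., 0.5)")
    if not decision_scores:
        raise ValueError("decision_scores must not be empty")

    ordered = sorted(decision_scores)
    # Position of the (1 - contamination) quantile, linearly interpolated.
    pos = (len(ordered) - 1) * (1.0 - contamination)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Ten made-up training scores with one obvious outlier.
scores = [0.1, 0.2, 0.3, 0.2, 0.1, 0.4, 0.3, 0.2, 0.1, 5.0]
threshold = threshold_from_contamination(scores, contamination=0.1)

# Scores strictly above the threshold are labeled as outliers.
flagged = [s for s in scores if s > threshold]
```

A larger `contamination` lowers the threshold, so more training points end up labeled as outliers.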

suod.utils.utility.get_estimators_small(contamination=0.1)#

Internal method to create a list of 600 base outlier detectors.

Parameters#

contamination: float in (0., 0.5), optional (default=0.1)

The amount of contamination of the data set, i.e. the proportion of outliers in the data set. Used when fitting to define the threshold on the decision function.

Returns#

base_detectors: list

A list of initialized random base outlier detectors.

suod.utils.utility.raw_score_to_proba(decision_scores, test_scores, method='linear')#

Utility function to convert raw outlier scores to probabilities. The transformation is either linear or the unifying method introduced in [BKKSZ11].

Parameters#

decision_scores: numpy array of shape (n_samples,)

The outlier scores of the training data. The higher, the more abnormal. Outliers tend to have higher scores. This value is available once the detector is fitted.

test_scores: numpy array of shape (n_samples,)

The outlier scores of the test data to be converted. The higher, the more abnormal. Outliers tend to have higher scores. This value is available once the detector is fitted.

method: str, optional (default='linear')

The transformation method, either 'linear' or 'unify'.

Returns#

outlier_probability: numpy array of shape (n_samples,)

For each observation, the probability that it is an outlier according to the fitted model, ranging in [0, 1].
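The two transformations can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not SUOD's code: `raw_score_to_proba_sketch` is a hypothetical name, 'linear' is taken to mean min-max scaling against the training scores, and 'unify' is taken to mean a Gaussian error-function scaling using the training mean and standard deviation, with results clipped to [0, 1] in both cases.

```python
import math
from statistics import fmean, pstdev


def raw_score_to_proba_sketch(decision_scores, test_scores, method="linear"):
    """Hypothetical stand-in; assumes min-max ('linear') or erf ('unify')."""
    if method == "linear":
        low, high = min(decision_scores), max(decision_scores)
        span = high - low
        if span == 0:
            raise ValueError("training scores are constant")
        # Scale by the training range, then clip to [0, 1].
        return [min(1.0, max(0.0, (s - low) / span)) for s in test_scores]

    if method == "unify":
        mu = fmean(decision_scores)
        sigma = pstdev(decision_scores)  # population standard deviation
        if sigma == 0:
            raise ValueError("training scores are constant")
        # erf lies in (-1, 1); scores below the training mean clip to 0.
        return [
            min(1.0, max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2)))))
            for s in test_scores
        ]

    raise ValueError("method must be 'linear' or 'unify'")


# Made-up scores: higher means more abnormal.
train_scores = [0.0, 1.0, 2.0, 4.0]
linear_proba = raw_score_to_proba_sketch(train_scores, [-1.0, 2.0, 5.0])
unify_proba = raw_score_to_proba_sketch(train_scores, [0.0, 1.75, 4.0], "unify")
```

Under these assumptions, test scores outside the training range saturate at 0 or 1 in the linear case, while in the unify case any score at or below the training mean maps to 0.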

Module contents#