interpret_community.permutation.permutation_importance module

Defines the PFIExplainer for computing global explanations on black box models or functions.

The PFIExplainer uses permutation feature importance (PFI) to score each column of a model's input by measuring how the output metric changes when that column is randomly permuted. Although PFI is very fast for computing global explanations, it does not support local explanations and can be inaccurate when there are feature interactions.
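The idea described above can be illustrated with a minimal NumPy sketch. This is not the PFIExplainer implementation; the helper permutation_importance, the toy predict function, and the accuracy metric are all hypothetical names introduced here for illustration.

```python
import numpy as np


def permutation_importance(predict, X, y, metric, seed=0):
    """Return base_score - permuted_score for every column of X."""
    rng = np.random.default_rng(seed)
    base_score = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        X_perm = X.copy()
        # Shuffle one column so it no longer carries information about y.
        X_perm[:, col] = rng.permutation(X_perm[:, col])
        importances[col] = base_score - metric(y, predict(X_perm))
    return importances


def accuracy(y_true, y_pred):
    """Score metric: a higher value is better."""
    return float(np.mean(y_true == y_pred))


def predict(X):
    # Hypothetical "model" that thresholds the first feature only.
    return (X[:, 0] > 0).astype(int)


# Toy data: the label depends only on column 0; column 1 is noise.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 2))
y = (X[:, 0] > 0).astype(int)

scores = permutation_importance(predict, X, y, accuracy)
print(scores)  # column 0 gets a large score; column 1 is ignored by the model
```

Because the toy model never reads column 1, permuting it leaves every prediction unchanged and its importance is exactly zero, while permuting column 0 drops accuracy to roughly chance level.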

class interpret_community.permutation.permutation_importance.PFIExplainer(model, is_function=False, metric=None, metric_args=None, is_error_metric=False, explain_subset=None, features=None, classes=None, transformations=None, allow_all_transformations=False, seed=0, for_classifier_use_predict_proba=False, show_progress=True, model_task=<ModelTask.Unknown: 'unknown'>, **kwargs)

Bases: interpret_community.common.base_explainer.GlobalExplainer, interpret_community.common.blackbox_explainer.BlackBoxMixin

available_explanations = ['global']
explain_global(evaluation_examples, true_labels)

Globally explains the blackbox model using permutation feature importance.

Note that the explanation does not include per-class feature importances or local feature importances.

Parameters:
  • evaluation_examples (numpy.array or pandas.DataFrame or scipy.sparse.csr_matrix) – A matrix of feature vector examples (# examples x # features) on which to explain the model’s output through permutation feature importance.
  • true_labels (numpy.array or pandas.DataFrame) – An array of true labels used as the reference when computing the evaluation metric for the base case and after each permutation.
Returns:

A model explanation object. It is guaranteed to be a GlobalExplanation. If the model is a classifier (has predict_proba), it will have the properties of ClassesMixin.

Return type:

DynamicGlobalExplanation
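A hedged usage sketch for explain_global follows. The wrapper function explain_with_pfi is a hypothetical helper, the model is assumed to be an already fitted estimator, and get_feature_importance_dict() is assumed to be available on the returned GlobalExplanation; that method is not documented in this section.

```python
def explain_with_pfi(model, X_eval, y_eval, feature_names):
    """Run PFI on a fitted model and return a feature-to-importance mapping."""
    # Imported inside the function so the package is only needed when called.
    from interpret_community.permutation.permutation_importance import PFIExplainer

    explainer = PFIExplainer(model, features=feature_names, show_progress=False)
    # true_labels is required: the metric is computed against them for the
    # base case and again after each column is permuted.
    global_explanation = explainer.explain_global(X_eval, true_labels=y_eval)
    # Assumption: GlobalExplanation exposes get_feature_importance_dict().
    return global_explanation.get_feature_importance_dict()
```

Calling explain_with_pfi(fitted_model, X_test, y_test, list_of_column_names) would then return the global importances keyed by feature name, under the assumptions stated above.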

explainer_type = 'blackbox'

Defines the Permutation Feature Importance Explainer for explaining black box models or functions.

Parameters:
  • model (object) – The black box model or function (if is_function is True) to be explained. Also known as the teacher model. Either a model that implements a scikit-learn style predict or predict_proba method, or a function that accepts a 2D ndarray.
  • is_function (bool) – Default is False. Set to True if passing a function instead of a model.
  • metric (str or function that accepts two arrays, y_true and y_pred.) – The metric name or function to evaluate the permutation. Note that if a metric function is provided, a higher value must be better. Otherwise, take the negative of the function or set is_error_metric to True. By default, if no metric is provided, F1 Score is used for binary classification, F1 Score with micro average is used for multiclass classification and mean absolute error is used for regression.
  • metric_args (dict) – Optional arguments for metric function.
  • is_error_metric (bool) – If a custom metric function is provided, set to True if a lower value of the metric is better, that is, if the metric measures error.
  • explain_subset (list[int]) – List of feature indexes. If specified, only this subset of the features in the evaluation dataset is explained: permutation feature importance shuffles, scores and evaluates only the specified indexes. This argument is not supported when transformations are set.
  • features (list[str]) – A list of feature names.
  • classes (list[str]) – Class names as a list of strings. The order of the class names should match that of the model output. Only required if explaining a classifier.
  • transformations (sklearn.compose.ColumnTransformer or list[tuple]) –

    sklearn.compose.ColumnTransformer or a list of tuples describing the column name and transformer. When transformations are provided, explanations are of the features before the transformation. The format for a list of transformations is the same as the one described here: https://github.com/scikit-learn-contrib/sklearn-pandas.

    If you are using a transformation that is not in the list of sklearn.preprocessing transformations that are supported by the interpret-community package, then this parameter cannot take a list of more than one column as input for the transformation. You can use the following sklearn.preprocessing transformations with a list of columns since these are already one to many or one to one: Binarizer, KBinsDiscretizer, KernelCenterer, LabelEncoder, MaxAbsScaler, MinMaxScaler, Normalizer, OneHotEncoder, OrdinalEncoder, PowerTransformer, QuantileTransformer, RobustScaler, StandardScaler.

    Examples for transformations that work:

    [
        (["col1", "col2"], sklearn_one_hot_encoder),
        (["col3"], None) #col3 passes as is
    ]
    [
        (["col1"], my_own_transformer),
        (["col2"], my_own_transformer),
    ]
    

    An example of a transformation that would raise an error since it cannot be interpreted as one to many:

    [
        (["col1", "col2"], my_own_transformer)
    ]
    

    The last example would not work since the interpret-community package can’t determine whether my_own_transformer gives a many to many or one to many mapping when taking a sequence of columns.

  • allow_all_transformations (bool) – Allow many to many and many to one transformations.
  • seed (int) – Random number seed for shuffling.
  • for_classifier_use_predict_proba (bool) – If specifying a model instead of a function, and the model is a classifier, set to True instead of the default False to use predict_proba instead of predict when calculating the metric.
  • show_progress (bool) – Defaults to True. Determines whether to display the explanation status bar when using PFIExplainer.
  • model_task (str) – Optional parameter to specify whether the model is a classification or regression model. In most cases, the type of the model can be inferred based on the shape of the output, where a classifier has a predict_proba method and outputs a 2 dimensional array, while a regressor has a predict method and outputs a 1 dimensional array.
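The sign convention for a custom metric can be made concrete with a small sketch. The functions mean_squared_error and negated below are hypothetical helpers written for this example, not part of the package; only the metric and is_error_metric parameter names come from the signature above.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Error metric: a lower value is better."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))


def negated(metric):
    """Wrap an error metric so that a higher value is better."""
    def score(y_true, y_pred):
        return -metric(y_true, y_pred)
    return score


y_true = np.array([1.0, 2.0, 3.0, 4.0])
good_pred = np.array([1.0, 2.0, 3.0, 4.0])
bad_pred = np.array([4.0, 3.0, 2.0, 1.0])

neg_mse = negated(mean_squared_error)
# After negation, the better prediction receives the higher score.
assert neg_mse(y_true, good_pred) > neg_mse(y_true, bad_pred)

# Two ways to hand an error metric to the explainer, per the parameter docs:
#   PFIExplainer(model, metric=neg_mse)
#   PFIExplainer(model, metric=mean_squared_error, is_error_metric=True)
```

Either option tells the explainer which direction counts as an improvement, so that a feature whose permutation degrades the model is scored as important.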
interpret_community.permutation.permutation_importance.labels_decorator(explain_func)

Decorates the PFI explainer so that it raises a clearer error message if true_labels is not passed.

Parameters:
  • explain_func (explanation function) – PFI explanation function.
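A decorator of this kind might look like the sketch below. It is not the library's implementation; require_true_labels and ToyExplainer are hypothetical names, and the error type and message are assumptions chosen for illustration.

```python
import functools


def require_true_labels(explain_func):
    """Raise a descriptive error when true_labels is not supplied."""
    @functools.wraps(explain_func)
    def wrapper(self, evaluation_examples, *args, **kwargs):
        # true_labels may arrive positionally or as a keyword argument.
        if not args and "true_labels" not in kwargs:
            raise TypeError(
                "PFI explanations require true_labels; call "
                "explain_global(evaluation_examples, true_labels)."
            )
        return explain_func(self, evaluation_examples, *args, **kwargs)
    return wrapper


class ToyExplainer:
    """Hypothetical stand-in for an explainer class."""

    @require_true_labels
    def explain_global(self, evaluation_examples, true_labels):
        return len(evaluation_examples), len(true_labels)
```

Without the wrapper, omitting true_labels would surface as a generic missing-argument TypeError; the wrapper replaces that with a message that states what PFI needs.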