interpret_community.mimic.mimic_explainer module

Defines the Mimic Explainer for computing explanations on black box models or functions.

The mimic explainer trains an explainable model to reproduce the output of the given black box model. The explainable model is called a surrogate model and the black box model is called a teacher model. Once trained to reproduce the output of the teacher model, the surrogate model’s explanation can be used to explain the teacher model.
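A minimal end-to-end sketch of this workflow is shown below. The teacher model, dataset, and class names used here are illustrative assumptions rather than part of this module's API; they only show how the surrogate (student) model class is supplied to the explainer.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    from interpret_community.mimic.mimic_explainer import MimicExplainer
    from interpret_community.mimic.models import LGBMExplainableModel

    data = load_breast_cancer()
    x_train, x_test, y_train, y_test = train_test_split(
        data.data, data.target, random_state=0)

    # The black box (teacher) model to be explained -- an arbitrary choice here.
    teacher = RandomForestClassifier(n_estimators=100).fit(x_train, y_train)

    # Train a surrogate (student) model that mimics the teacher's output.
    explainer = MimicExplainer(teacher,
                               x_train,
                               LGBMExplainableModel,
                               features=list(data.feature_names),
                               classes=list(data.target_names))

    global_explanation = explainer.explain_global(x_test)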

class interpret_community.mimic.mimic_explainer.MimicExplainer(model, initialization_examples, explainable_model, explainable_model_args=None, is_function=False, augment_data=True, max_num_of_augmentations=10, explain_subset=None, features=None, classes=None, transformations=None, allow_all_transformations=False, shap_values_output=ShapValuesOutput.DEFAULT, categorical_features=None, model_task=ModelTask.Unknown, reset_index=ResetIndex.Ignore, **kwargs)

Bases: interpret_community.common.blackbox_explainer.BlackBoxExplainer

available_explanations = ['global', 'local']

explain_global(evaluation_examples=None, include_local=True, batch_size=100)

Globally explains the blackbox model using the surrogate model.

If evaluation_examples are unspecified, retrieves the global feature importance directly from the explainable surrogate model; note this will not include per-class feature importance. If evaluation_examples are specified, aggregates local explanations on the given evaluation_examples into a global explanation, which computes both global and per-class feature importance.

Parameters
  • evaluation_examples (numpy.ndarray or pandas.DataFrame or scipy.sparse.csr_matrix) – A matrix of feature vector examples (# examples x # features) on which to explain the model’s output. If specified, computes feature importance through aggregation.

  • include_local (bool) – Include the local explanations in the returned global explanation. If evaluation examples are specified and include_local is False, will stream the local explanations to aggregate to global.

  • batch_size (int) – If include_local is False, specifies the batch size for aggregating local explanations to global.

Returns

A model explanation object. It is guaranteed to be a GlobalExplanation. If evaluation_examples are passed in, it will also have the properties of a LocalExplanation. If the model is a classifier (has predict_proba), it will have the properties of ClassesMixin, and if evaluation_examples were passed in it will also have the properties of PerClassMixin.

Return type

DynamicGlobalExplanation
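As a brief sketch of both modes, assuming explainer and x_test were created as in the earlier example (both are placeholders, not part of this method's signature):

    # Global importance from the surrogate model alone (no per-class values).
    global_explanation = explainer.explain_global()
    print(global_explanation.get_feature_importance_dict())

    # Global importance aggregated from local explanations on evaluation data;
    # include_local=False streams the local explanations in batches.
    global_explanation = explainer.explain_global(x_test,
                                                  include_local=False,
                                                  batch_size=50)
    print(global_explanation.get_ranked_global_names()[:5])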

explain_local(evaluation_examples)

Locally explains the blackbox model using the surrogate model.

Parameters

evaluation_examples (numpy.ndarray or pandas.DataFrame or scipy.sparse.csr_matrix) – A matrix of feature vector examples (# examples x # features) on which to explain the model’s output.

Returns

A model explanation object. It is guaranteed to be a LocalExplanation. If the model is a classifier, it will have the properties of the ClassesMixin.

Return type

DynamicLocalExplanation
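A short sketch, again assuming the explainer and x_test placeholders from the earlier example:

    # Explain a handful of rows; for classifiers the local importance values
    # are nested as (# classes) x (# examples) x (# features).
    local_explanation = explainer.explain_local(x_test[:5])
    print(local_explanation.get_ranked_local_names()[0][0][:3])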

explainer_type = 'blackbox'

The Mimic Explainer for explaining black box models or functions.

Parameters
  • model (object) – The black box model or function (if is_function is True) to be explained. Also known as the teacher model. A model that implements sklearn.predict or sklearn.predict_proba, or a function that accepts a 2d ndarray.

  • initialization_examples (numpy.ndarray or pandas.DataFrame or scipy.sparse.csr_matrix) – A matrix of feature vector examples (# examples x # features) for initializing the explainer.

  • explainable_model (interpret_community.mimic.models.BaseExplainableModel) – The uninitialized surrogate model used to explain the black box model. Also known as the student model.

  • explainable_model_args (dict) – An optional map of arguments to pass to the explainable model for initialization.

  • is_function (bool) – Default is False. Set to True if passing function instead of model.

  • augment_data (bool) – If True, oversamples the initialization examples to improve the surrogate model's accuracy in fitting the teacher model. Useful for high-dimensional data where the number of rows is less than the number of columns.

  • max_num_of_augmentations (int) – Maximum number of times we can increase the input data size.

  • explain_subset (list[int]) – List of feature indices. If specified, only selects a subset of the features in the evaluation dataset for explanation. Note that for the mimic explainer this will not affect the execution time of computing the global explanation. This argument is not supported when transformations are set.

  • features (list[str]) – A list of feature names.

  • classes (list[str]) – Class names as a list of strings. The order of the class names should match that of the model output. Only required if explaining a classifier.

  • transformations (sklearn.compose.ColumnTransformer or list[tuple]) –

    sklearn.compose.ColumnTransformer or a list of tuples describing the column name and transformer. When transformations are provided, explanations are in terms of the features before the transformation (the raw features). The format for a list of transformations is the same as the one used by sklearn-pandas: https://github.com/scikit-learn-contrib/sklearn-pandas. A usage sketch is included after this parameter list.

    If you are using a transformation that is not in the list of sklearn.preprocessing transformations supported by the interpret-community package, then this parameter cannot take a list of more than one column as input for the transformation. You can use the following sklearn.preprocessing transformations with a list of columns, since these are already one-to-many or one-to-one: Binarizer, KBinsDiscretizer, KernelCenterer, LabelEncoder, MaxAbsScaler, MinMaxScaler, Normalizer, OneHotEncoder, OrdinalEncoder, PowerTransformer, QuantileTransformer, RobustScaler, StandardScaler.

    Examples for transformations that work:

    [
        (["col1", "col2"], sklearn_one_hot_encoder),
        (["col3"], None) #col3 passes as is
    ]
    [
        (["col1"], my_own_transformer),
        (["col2"], my_own_transformer),
    ]
    

    An example of a transformation that would raise an error, since it cannot be interpreted as one-to-many:

    [
        (["col1", "col2"], my_own_transformer)
    ]
    

    The last example would not work, since the interpret-community package can’t determine whether my_own_transformer gives a many-to-many or one-to-many mapping when taking a sequence of columns.

  • shap_values_output (interpret_community.common.constants.ShapValuesOutput) – The shap values output from the explainer. Only applies to tree-based models that are in terms of raw feature values instead of probabilities. Can be default, probability or teacher_probability. If probability or teacher_probability is specified, we approximate the feature importance values as probabilities instead of using the default values. If teacher_probability is specified, we use the probabilities from the teacher model as opposed to the surrogate model.

  • categorical_features (Union[list[str], list[int]]) – Categorical feature names or indexes. If names are passed, they will be converted into indexes first. Note that if a pandas index is categorical, you can pass either the name of the index or its index position as if the pandas index had been inserted at the end of the input dataframe.

  • allow_all_transformations (bool) – Whether to allow many-to-many and many-to-one transformations.

  • model_task (str) – Optional parameter to specify whether the model is a classification or regression model. In most cases, the type of the model can be inferred from the shape of the output, where a classifier has a predict_proba method and outputs a 2-dimensional array, while a regressor has a predict method and outputs a 1-dimensional array.

  • reset_index (str) – Controls whether the pandas DataFrame index column is used as part of the features when training the surrogate model.
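The sketch below combines several of these parameters, including transformations on raw columns. The column names, preprocessing pipeline, and teacher model are illustrative assumptions; the point is that the explainer receives the raw data, the final estimator, and the fitted transformations so that explanations are reported in terms of the raw features.

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    from interpret_community.common.constants import ModelTask
    from interpret_community.mimic.mimic_explainer import MimicExplainer
    from interpret_community.mimic.models import LinearExplainableModel

    # Raw (pre-transformation) data with one categorical column (made-up values).
    x_raw = pd.DataFrame({'age': [23, 45, 31, 52, 38, 29],
                          'income': [40, 85, 52, 91, 60, 48],
                          'city': ['NY', 'SF', 'NY', 'LA', 'SF', 'LA']})
    y = [1.2, 3.4, 2.1, 3.9, 2.8, 1.9]

    preprocessor = ColumnTransformer(
        [('num', StandardScaler(), ['age', 'income']),
         ('cat', OneHotEncoder(handle_unknown='ignore'), ['city'])])
    pipeline = Pipeline([('prep', preprocessor),
                         ('model', RandomForestRegressor(n_estimators=50))])
    pipeline.fit(x_raw, y)

    # Explain the final estimator on the raw columns by passing the fitted
    # preprocessor as the transformations argument.
    explainer = MimicExplainer(pipeline.named_steps['model'],
                               x_raw,
                               LinearExplainableModel,
                               transformations=preprocessor,
                               features=list(x_raw.columns),
                               model_task=ModelTask.Regression)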

get_surrogate_model_replication_measure(training_data)

Return the metric which tells how well the surrogate model replicates the teacher model.

For classification scenarios, this function will return accuracy. For regression scenarios, this function will return r2_score.

Parameters

training_data (numpy.ndarray or pandas.DataFrame or scipy.sparse.csr_matrix) – The data for getting the replication metric.

Returns

Metric that tells how well the surrogate model replicates the behavior of the teacher model.

Return type

float
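For example, assuming explainer is a fitted MimicExplainer and x_test is held-out data as in the sketches above:

    # Accuracy for classification teachers, r2_score for regression teachers.
    replication = explainer.get_surrogate_model_replication_measure(
        training_data=x_test)
    print('surrogate replication measure:', replication)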