interpret_community.lime.lime_explainer module

Defines the LIMEExplainer for computing explanations on black box models using LIME.

class interpret_community.lime.lime_explainer.LIMEExplainer(model, initialization_examples, is_function=False, explain_subset=None, nclusters=10, features=None, classes=None, verbose=False, categorical_features=[], show_progress=True, transformations=None, allow_all_transformations=False, model_task=ModelTask.Unknown, **kwargs)

Bases: interpret_community.common.blackbox_explainer.BlackBoxExplainer

available_explanations = ['global', 'local']
explain_global(evaluation_examples, sampling_policy=None, include_local=True, batch_size=100)

Explain the model globally by aggregating local explanations to global.

Parameters
  • evaluation_examples (numpy.ndarray or pandas.DataFrame or scipy.sparse.csr_matrix) – A matrix of feature vector examples (# examples x # features) on which to explain the model’s output.

  • sampling_policy (interpret_community.common.policy.SamplingPolicy) – Optional policy for sampling the evaluation examples. See documentation on SamplingPolicy for more information.

  • include_local (bool) – Include the local explanations in the returned global explanation. If include_local is False, the local explanations are streamed and aggregated to global rather than returned.

  • batch_size (int) – If include_local is False, specifies the batch size for aggregating local explanations to global.

Returns

A model explanation object containing the global explanation.

Return type

GlobalExplanation

explain_local(evaluation_examples)

Explain the function locally by using LIME.

Parameters
  • evaluation_examples (ml_wrappers.dataset.dataset_wrapper.DatasetWrapper) – A matrix of feature vector examples (# examples x # features) on which to explain the model’s output.

Returns

A model explanation object containing the local explanation.

Return type

LocalExplanation

explainer_type = 'blackbox'

Defines the LIME Explainer for explaining black box models or functions.

Parameters
  • model (object) – The model to explain, or a function if is_function is True. A model must implement predict or predict_proba in the scikit-learn convention; a function must accept a 2d ndarray.

  • initialization_examples (numpy.ndarray or pandas.DataFrame or scipy.sparse.csr_matrix) – A matrix of feature vector examples (# examples x # features) for initializing the explainer.

  • is_function (bool) – Default is False. Set to True if passing a function instead of a model.

  • explain_subset (list[int]) – List of feature indices. If specified, only selects a subset of the features in the evaluation dataset for explanation. The subset can be the top-k features from the model summary.

  • nclusters (int) – Number of means to use for approximation. A dataset is summarized with nclusters mean samples weighted by the number of data points they each represent. When the number of initialization examples is larger than (10 x nclusters), those examples will be summarized with k-means where k = nclusters.

  • features (list[str]) – A list of feature names.

  • classes (list[str]) – Class names as a list of strings. The order of the class names should match that of the model output. Only required if explaining a classifier.

  • verbose (bool) – If True, uses verbose logging in LIME.

  • categorical_features (Union[list[str], list[int]]) – Categorical feature names or indexes. If names are passed, they will be converted into indexes first.

  • show_progress (bool) – Default is True. Determines whether to display the explanation status bar when using LIMEExplainer.

  • transformations (sklearn.compose.ColumnTransformer or list[tuple]) – Transformations applied to the data, given either as a sklearn.compose.ColumnTransformer or a list of tuples describing the column name and transformer. When transformations are provided, explanations are of the features before the transformation. The format for a list of transformations is the same as the one here: https://github.com/scikit-learn-contrib/sklearn-pandas.

If a transformation is used that is not in the list of supported sklearn.preprocessing transformations, then a list of more than one column cannot be taken as input for that transformation. The following sklearn.preprocessing transformations can be used with a list of columns, since these are already one-to-many or one-to-one: Binarizer, KBinsDiscretizer, KernelCenterer, LabelEncoder, MaxAbsScaler, MinMaxScaler, Normalizer, OneHotEncoder, OrdinalEncoder, PowerTransformer, QuantileTransformer, RobustScaler, StandardScaler.

Examples for transformations that work:

[
    (["col1", "col2"], sklearn_one_hot_encoder),
    (["col3"], None) #col3 passes as is
]
[
    (["col1"], my_own_transformer),
    (["col2"], my_own_transformer),
]

Example of transformations that would raise an error since it cannot be interpreted as one to many:

[
    (["col1", "col2"], my_own_transformer)
]

This would not work since it is hard to make out whether my_own_transformer gives a many-to-many or one-to-many mapping when taking a sequence of columns.

  • allow_all_transformations (bool) – Allow many-to-many and many-to-one transformations.

  • model_task (ModelTask) – Optional parameter to specify whether the model is a classification or regression model. In most cases, the type of the model can be inferred based on the shape of the output: a classifier has a predict_proba method and outputs a 2-dimensional array, while a regressor has a predict method and outputs a 1-dimensional array.