interpret_community.explanation.explanation module
Defines the explanations that are returned from explaining models.
-
class
interpret_community.explanation.explanation.
BaseExplanation
(method, model_task, model_type=None, explanation_id=None, **kwargs)¶ Bases:
interpret_community.common.chained_identity.ChainedIdentity
The common explanation returned by explainers.
Parameters: - method (str) – The explanation method used to explain the model (e.g., SHAP, LIME).
- model_task (str) – The task of the original model i.e., classification or regression.
- model_type (str) – The type of the original model that was explained, e.g., sklearn.linear_model.LinearRegression.
- explanation_id (str) – The unique identifier for the explanation.
-
data
(key=None)¶ Return the data of the explanation.
Parameters: key (int) – The key for the local data to be retrieved. Returns: The explanation data. Return type: dict
-
model_task
¶ Get the task of the original model, i.e., classification or regression (others possibly in the future).
Returns: The task of the original model. Return type: str
-
model_type
¶ Get the type of the original model that was explained.
Returns: A class name or ‘function’, if that information is available. Return type: str
-
selector
¶ Get the local or global selector.
Returns: The selector as a pandas dataframe of records. Return type: pd.DataFrame
-
visualize
(key=None)¶
-
class
interpret_community.explanation.explanation.
ClassesMixin
(classes=None, num_classes=None, **kwargs)¶ Bases:
object
The explanation mixin for classes.
This mixin is added when you specify the classes parameter in the classification scenario while creating a global or local explanation.
Parameters: classes (list[str]) – Class names as a list of strings. The order of the class names should match that of the model output.
-
class
interpret_community.explanation.explanation.
ExpectedValuesMixin
(expected_values=None, **kwargs)¶ Bases:
object
The explanation mixin for expected values.
Parameters: expected_values (np.array) – The expected values of the model.
-
class
interpret_community.explanation.explanation.
FeatureImportanceExplanation
(features=None, num_features=None, is_raw=False, is_engineered=False, **kwargs)¶ Bases:
interpret_community.explanation.explanation.BaseExplanation
The common feature importance explanation returned by explainers.
Parameters: features (Union[list[str], list[int]]) – The feature names. -
is_engineered
¶ Get the engineered explanation flag.
Returns: True if it’s an engineered explanation (specifically not raw). False if raw or unknown. Return type: bool
-
class
interpret_community.explanation.explanation.
GlobalExplanation
(global_importance_values=None, global_importance_rank=None, ranked_global_names=None, ranked_global_values=None, **kwargs)¶ Bases:
interpret_community.explanation.explanation.FeatureImportanceExplanation
The common global explanation returned by explainers.
Parameters: - global_importance_values (numpy.array) – The feature importance values in the order of the original features.
- global_importance_rank (numpy.array) – The feature indexes sorted by importance.
- ranked_global_names (list[str]) – The feature names sorted by importance.
- ranked_global_values (numpy.array) – The feature importance values sorted by importance.
-
data
(key=None)¶ Return the data of the explanation with global importance values added.
Parameters: key (int) – The key for the local data to be retrieved. Returns: The explanation with global importance values added. Return type: dict
-
get_feature_importance_dict
(top_k=None)¶ Get a dictionary pairing ranked global names and feature importance values.
Parameters: top_k (int) – If specified, only the top k names and values will be returned. Returns: A dictionary of feature names and their importance values. Return type: dict{str: float}
-
get_ranked_global_names
(top_k=None)¶ Get feature names sorted by global feature importance values, highest to lowest.
Parameters: top_k (int) – If specified, only the top k names will be returned. Returns: The sorted feature names, or the sorted feature indexes if feature names are unavailable. Return type: list[str] or list[int]
-
get_ranked_global_values
(top_k=None)¶ Get global feature importance sorted from highest to lowest.
Parameters: top_k (int) – If specified, only the top k values will be returned. Returns: The list of sorted values. Return type: list[float]
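The relationship between the three ranked getters above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: the helper name ranked_names_and_values, the feature names, and the values are all hypothetical.

```python
def ranked_names_and_values(features, values, top_k=None):
    """Pair feature names with importance values, highest value first."""
    # Indexes of the values, ordered from most to least important.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]  # keep only the top k entries
    names = [features[i] for i in order]
    ranked = [values[i] for i in order]
    return names, ranked


# Hypothetical feature names and global importance values (original order).
features = ["age", "income", "tenure", "region"]
values = [0.10, 0.05, 0.40, 0.30]

names, ranked = ranked_names_and_values(features, values, top_k=2)
importance_dict = dict(zip(names, ranked))
print(importance_dict)  # {'tenure': 0.4, 'region': 0.3}
```

Because Python dictionaries preserve insertion order, the resulting dictionary keeps the features in ranked order.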
-
get_raw_explanation
(feature_maps, raw_feature_names=None, eval_data=None)¶ Get raw explanation given input feature maps.
Parameters: - feature_maps (list of numpy arrays or sparse matrices) – List of feature maps from raw to generated features. Each array entry (raw_index, generated_index) is the weight for that raw, generated feature pair; all other entries are zero. For a sequence of transformations [t1, t2, ..., tn] that produce generated features from raw features, the list of feature maps corresponds to the raw-to-generated maps in the same order as t1, t2, etc. If the overall raw-to-generated feature map from t1 to tn is available, that map alone can be passed in a single-element list.
- raw_feature_names ([str]) – list of raw feature names
- eval_data (np.ndarray or pd.DataFrame) – Evaluation data.
Returns: raw explanation
Return type:
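To make the feature_maps layout concrete, here is a small sketch in plain Python. It assumes that raw importances are a weighted sum of generated-feature importances and that consecutive maps compose by matrix multiplication. Both are assumptions for illustration; the library may aggregate differently. The helpers matmul and raw_importances and all numbers are hypothetical.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def raw_importances(feature_maps, generated_importances):
    """Aggregate generated-feature importances back onto raw features."""
    # Compose the per-transformation maps (t1, t2, ...) into one overall map
    # whose entry (raw_index, generated_index) is the pair's weight.
    overall = feature_maps[0]
    for fmap in feature_maps[1:]:
        overall = matmul(overall, fmap)
    # Each raw feature collects the weighted importance of the generated
    # features it contributes to.
    return [sum(w * imp for w, imp in zip(row, generated_importances))
            for row in overall]


# Hypothetical pipeline with 2 raw features: t1 expands raw feature 1 into
# two generated columns, and t2 passes all three columns through unchanged.
t1 = [[1, 0, 0],
      [0, 1, 1]]
t2 = [[1, 0, 0],
      [0, 1, 0],
      [0, 0, 1]]
generated = [0.2, 0.1, 0.3]

raw = raw_importances([t1, t2], generated)
print(raw)  # raw feature 0 keeps 0.2; raw feature 1 collects 0.1 + 0.3
```

Passing the single composed map in a one-element list gives the same result as passing [t1, t2], which mirrors the shortcut described for feature_maps.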
-
get_raw_feature_importances
(feature_maps)¶ Get global raw feature importance.
Parameters: Returns: Raw feature importances.
Return type:
-
global_importance_rank
¶ Get the overall feature importance rank or indexes.
For example, if original features are [f0, f1, f2, f3] and in global importance order they are [f2, f3, f0, f1], global_importance_rank would be [2, 3, 0, 1].
Returns: The feature indexes sorted by importance. Return type: list[int]
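The [f2, f3, f0, f1] example can be reproduced with a short sketch. It assumes the rank is the list of feature indexes ordered by descending importance value; the helper name importance_rank and the values are hypothetical, and this is not the library's own code.

```python
def importance_rank(values):
    """Return feature indexes ordered from most to least important."""
    # sorted() is stable, so tied features keep their original relative order.
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical importance values for the original features [f0, f1, f2, f3].
global_importance_values = [0.10, 0.05, 0.40, 0.30]
global_importance_rank = importance_rank(global_importance_values)
print(global_importance_rank)  # [2, 3, 0, 1]
```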
-
global_importance_values
¶ Get the global feature importance values.
Values will be in their original order, the same as features, unless top_k was passed into upload_model_explanation or download_model_explanation. In those cases, returns the most important k values in highest to lowest importance order.
Returns: The model level feature importance values. Return type: list[float]
-
selector
¶ Get the global selector if this is only a global explanation; otherwise, get the local selector.
Returns: The selector as a pandas dataframe of records. Return type: pd.DataFrame
-
class
interpret_community.explanation.explanation.
LocalExplanation
(local_importance_values=None, **kwargs)¶ Bases:
interpret_community.explanation.explanation.FeatureImportanceExplanation
The common local explanation returned by explainers.
Parameters: local_importance_values (numpy.array or scipy.sparse.csr_matrix or list[scipy.sparse.csr_matrix]) – The feature importance values. -
data
(key=None)¶ Return the data of the explanation with local importance values added.
Parameters: key (int) – The key for the local data to be retrieved. Returns: The explanation with local importance values metadata added. Return type: dict
-
get_local_importance_rank
()¶ Get local feature importance rank or indexes.
For example, if original features are [f0, f1, f2, f3] and in local importance order for the first data point they are [f2, f3, f0, f1], local_importance_rank[0] would be [2, 3, 0, 1] (or local_importance_rank[0][0] if classification).
For documentation regarding order of classes in the classification case, please see the docstring for local_importance_values.
Returns: The feature indexes sorted by importance. Return type: list[list[int]] or list[list[list[int]]]
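The two nesting levels in the return type can be illustrated with a sketch in plain Python. It uses non-negative hypothetical values and ranks them in descending order; how the library treats negative importance values (for example, ranking by magnitude) is not stated here, so the sketch makes no claim about it. The helper names are hypothetical.

```python
def rank_row(row):
    """Feature indexes for one data point, most important first."""
    return sorted(range(len(row)), key=lambda i: row[i], reverse=True)


def local_importance_rank(local_values, is_classification=False):
    if is_classification:
        # Shape (# classes x # examples x # features):
        # rank within each class, then within each example.
        return [[rank_row(row) for row in class_block]
                for class_block in local_values]
    # Shape (# examples x # features): rank within each example.
    return [rank_row(row) for row in local_values]


# Hypothetical regression values: 2 examples x 4 features.
regression_values = [[0.10, 0.05, 0.40, 0.30],
                     [0.50, 0.20, 0.10, 0.00]]
reg_rank = local_importance_rank(regression_values)
print(reg_rank[0])  # [2, 3, 0, 1]

# Hypothetical classification values: 2 classes x 2 examples x 4 features.
classification_values = [regression_values,
                         [[0.0, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]]
clf_rank = local_importance_rank(classification_values, is_classification=True)
print(clf_rank[0][0])  # [2, 3, 0, 1]
```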
-
get_ranked_local_names
(top_k=None)¶ Get feature names sorted by local feature importance values, highest to lowest.
For documentation regarding order of classes in the classification case, please see the docstring for local_importance_values.
Parameters: top_k (int) – If specified, only the top k names will be returned. Returns: The sorted feature names, or the sorted feature indexes if feature names are unavailable. Return type: list[list[int or str]] or list[list[list[int or str]]]
-
get_ranked_local_values
(top_k=None)¶ Get local feature importance sorted from highest to lowest.
For documentation regarding order of classes in the classification case, please see the docstring for local_importance_values.
Parameters: top_k (int) – If specified, only the top k values will be returned. Returns: The list of sorted values. Return type: list[list[float]] or list[list[list[float]]]
-
get_raw_explanation
(feature_maps, raw_feature_names=None, eval_data=None)¶ Get raw explanation using input feature maps.
Parameters: - feature_maps (list of numpy arrays or sparse matrices) – List of feature maps from raw to generated features. Each array entry (raw_index, generated_index) is the weight for that raw, generated feature pair; all other entries are zero. For a sequence of transformations [t1, t2, ..., tn] that produce generated features from raw features, the list of feature maps corresponds to the raw-to-generated maps in the same order as t1, t2, etc. If the overall raw-to-generated feature map from t1 to tn is available, that map alone can be passed in a single-element list.
- raw_feature_names ([str]) – list of raw feature names
- eval_data (np.ndarray or pd.DataFrame) – Evaluation data.
Returns: raw explanation
Return type:
-
get_raw_feature_importances
(raw_to_output_maps)¶ Get local raw feature importance.
For documentation regarding order of classes in the classification case, please see the docstring for local_importance_values.
Parameters: raw_to_output_maps (list[numpy.array]) – A list of feature maps from raw to generated feature. Returns: Raw feature importance. Return type: list[list] or list[list[list]]
-
is_local_sparse
¶ Determines whether the local importance values are sparse.
Returns: True if the local importance values are sparse. Return type: bool
-
local_importance_values
¶ Get the feature importance values in original order.
Returns: For a model with a single output such as regression, this returns a list of feature importance values for each data point. For models with vector outputs this function returns a list of such lists, one for each output. The dimension of this matrix is (# examples x # features) or (# classes x # examples x # features). In the classification case, the order of classes is the order of the numeric indices that the classifier outputs. For example, if your target values are [2, 2, 0, 1, 2, 1, 0], where 0 is “dog”, 1 is “cat”, and 2 is “fish”, the first 2d matrix of importance values will be for “dog”, the second will be for “cat”, and the last will be for “fish”. If you choose to pass in a classes array to the explainer, the names should be passed in using this same order.
Return type: list[list[float]] or list[list[list[float]]] or scipy.sparse.csr_matrix or list[scipy.sparse.csr_matrix]
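The class ordering in the dog, cat, fish example can be checked with a few lines of plain Python. The importance numbers and the label_names mapping below are hypothetical; only the ordering rule comes from the description above.

```python
target_values = [2, 2, 0, 1, 2, 1, 0]
label_names = {0: "dog", 1: "cat", 2: "fish"}

# Classes are ordered by the numeric index the classifier outputs.
class_indices = sorted(set(target_values))
classes = [label_names[i] for i in class_indices]
print(classes)  # ['dog', 'cat', 'fish']

# Hypothetical local importance values with shape
# (# classes x # examples x # features) = (3 x 2 x 2).
local_importance_values = [
    [[0.1, 0.2], [0.3, 0.4]],  # first 2d matrix: dog
    [[0.5, 0.6], [0.7, 0.8]],  # second 2d matrix: cat
    [[0.9, 1.0], [1.1, 1.2]],  # last 2d matrix: fish
]

# Pair each class name with its (# examples x # features) matrix.
by_class = dict(zip(classes, local_importance_values))
print(by_class["cat"][1])  # [0.7, 0.8]
```

A classes list passed to an explainer would follow the same order as the classes variable here.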
-
num_examples
¶ Get the number of examples in the explanation.
Returns: The number of examples in the explanation. Return type: int
-
selector
¶ Get the local selector.
Returns: The selector as a pandas dataframe of records. Return type: pd.DataFrame
-
class
interpret_community.explanation.explanation.
PerClassMixin
(per_class_values=None, per_class_rank=None, ranked_per_class_names=None, ranked_per_class_values=None, **kwargs)¶ Bases:
interpret_community.explanation.explanation.ClassesMixin
The explanation mixin for per class aggregated information.
This mixin is added for the classification scenario for global explanations. The per class importance values are group averages of local importance values across different classes.
Parameters: - per_class_values (numpy.array) – The feature importance values for each class in the order of the original features.
- per_class_rank (numpy.array) – The feature indexes for each class sorted by importance.
- ranked_per_class_names (list[str]) – The feature names for each class sorted by importance.
- ranked_per_class_values (numpy.array) – The feature importance values sorted by importance.
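The "group averages" wording can be pictured with a small sketch. It assumes each per class value is the mean of the absolute local importance values over all examples for that class. The use of absolute values is an assumption made here for illustration, and the function and data are hypothetical rather than taken from the library.

```python
def per_class_values(local_values):
    """Average absolute local importances over examples, per class."""
    result = []
    for class_block in local_values:  # one (# examples x # features) block
        num_examples = len(class_block)
        num_features = len(class_block[0])
        result.append([
            sum(abs(example[j]) for example in class_block) / num_examples
            for j in range(num_features)
        ])
    return result


# Hypothetical local values: 2 classes x 2 examples x 2 features.
local_values = [
    [[1.0, -4.0], [3.0, 2.0]],
    [[-2.0, 0.0], [2.0, 1.0]],
]
values = per_class_values(local_values)
print(values)  # [[2.0, 3.0], [2.0, 0.5]]
```

The result has one row per class and one column per feature, in the order of the original features.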
-
get_ranked_per_class_names
(top_k=None)¶ Get feature names sorted by per class feature importance values, highest to lowest.
For documentation regarding order of classes, please see the docstring for per_class_values.
Parameters: top_k (int) – If specified, only the top k names will be returned. Returns: The sorted feature names, or the sorted feature indexes if feature names are unavailable. Return type: list[list[str]] or list[list[int]]
-
get_ranked_per_class_values
(top_k=None)¶ Get per class feature importance sorted from highest to lowest.
For documentation regarding order of classes, please see the docstring for per_class_values.
Parameters: top_k (int) – If specified, only the top k values will be returned. Returns: The list of sorted values. Return type: list[list[float]]
-
per_class_rank
¶ Get the per class importance rank or indexes.
For example, if original features are [f0, f1, f2, f3] and in per class importance order they are [[f2, f3, f0, f1], [f0, f2, f3, f1]], per_class_rank would be [[2, 3, 0, 1], [0, 2, 3, 1]].
For documentation regarding order of classes, please see the docstring for per_class_values.
Returns: The per class indexes that would sort per_class_values. Return type: list
-
per_class_values
¶ Get the per class importance values.
Values will be in their original order, the same as features, unless top_k was passed into upload_model_explanation or download_model_explanation. In those cases, returns the most important k values in highest to lowest importance order.
The order of classes in the output is the order of the numeric indices that the classifier outputs. For example, if your target values are [2, 2, 0, 1, 2, 1, 0], where 0 is “dog”, 1 is “cat”, and 2 is “fish”, the first 2d matrix of importance values will be for “dog”, the second will be for “cat”, and the last will be for “fish”. If you choose to pass in a classes array to the explainer, the names should be passed in using this same order.
Returns: The model level per class feature importance values in original feature order. Return type: list
-
interpret_community.explanation.explanation.
load_explanation
(path)¶
-
interpret_community.explanation.explanation.
save_explanation
(explanation, path, exist_ok=False)¶ Serialize the explanation.
Parameters: - explanation (Explanation) – The Explanation to be serialized.
- path (str) – The path to the directory in which the explanation will be saved. By default, must be a new directory to avoid overwriting any previous explanations. Set exist_ok to True to overrule this behavior.
- exist_ok (bool) – If False (default), the path provided by the user must not already exist and will be created by this function. If True, a preexisting path may be passed. Any preexisting files whose names match those of the files that make up the explanation will be overwritten.
Returns: JSON-formatted explanation data.
Return type:
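The exist_ok contract described above resembles that of os.makedirs in the Python standard library. The sketch below illustrates that contract using only the standard library. It does not call save_explanation, the helper prepare_explanation_dir is hypothetical, and the exception type the library actually raises is an assumption.

```python
import os
import tempfile


def prepare_explanation_dir(path, exist_ok=False):
    """Create the target directory, following the documented exist_ok rule."""
    # With exist_ok=False, os.makedirs raises FileExistsError when the
    # directory already exists, which protects earlier explanations.
    os.makedirs(path, exist_ok=exist_ok)
    return path


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "my_explanation")

    prepare_explanation_dir(target)  # new directory: created
    try:
        prepare_explanation_dir(target)  # already exists: refused
        refused = False
    except FileExistsError:
        refused = True

    prepare_explanation_dir(target, exist_ok=True)  # explicitly allowed
    print(refused)  # True
```

With exist_ok=True, any files already in the directory stay in place until a file of the same name is written over them, which matches the overwrite warning in the parameter description.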