interpret_community.mimic.models.linear_model module

Defines an explainable linear model.

class interpret_community.mimic.models.linear_model.LinearExplainableModel(multiclass=False, random_state=123, classification=True, sparse_data=False, **kwargs)

Bases: interpret_community.mimic.models.explainable_model.BaseExplainableModel

available_explanations = ['global', 'local']

Use LinearExplainer to get the expected values.

Returns:The expected values of the linear model.
Return type:list

Call coef to get the global feature importances from the linear surrogate model.

Returns:The global explanation of feature importances.
Return type:list
explain_local(evaluation_examples, **kwargs)

Use LinearExplainer to get the local feature importances from the trained explainable model.

Parameters:evaluation_examples (numpy or scipy array) – The evaluation examples to compute local feature importances for.
Returns:The local explanation of feature importances.
Return type:Union[list, numpy.ndarray]
static explainable_model_type(self)

Retrieve the model type.

Returns:Linear explainable model type.
Return type:ExplainableModelType
explainer_type = 'model'

Linear explainable model.

  • multiclass (bool) – Set to true to generate a multiclass model.
  • random_state (int) – Int to seed the model.
  • classification (bool) – Indicates whether the model is used for classification or regression scenario.
  • sparse_data (bool) – Indicates whether the training data will be sparse.
fit(dataset, labels, **kwargs)

Call linear fit to fit the explainable model.

Store the mean and covariance of the background data for local explanation.

Parameters:
  • dataset (numpy or scipy array) – The dataset to train the model on.
  • labels (numpy or scipy array) – The labels to train the model on.
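
The background statistics mentioned above (the mean and covariance of the training data) can be illustrated with a minimal, standard-library sketch. The helper name background_stats and the choice of the n - 1 denominator are assumptions made for illustration; this is not the library's implementation.

```python
from statistics import fmean


def background_stats(rows):
    """Return (mean, cov) for a list of equal-length feature rows.

    Hypothetical helper: mirrors the idea of storing the mean and
    covariance of the background data, not the library's own code.
    """
    n = len(rows)
    d = len(rows[0])
    # Per-feature mean over all rows.
    mean = [fmean(row[j] for row in rows) for j in range(d)]
    # Sample covariance with an n - 1 denominator (an assumption).
    cov = [
        [
            sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


background = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mean, cov = background_stats(background)
```

The per-feature mean is what a linear SHAP computation needs as its reference point when features are treated as independent; the covariance only matters when feature correlations are taken into account.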

If multiclass=True, uses the parameters for LogisticRegression:

Fit the model according to the given training data.


X : {array-like, sparse matrix} of shape (n_samples, n_features)
Training vector, where n_samples is the number of samples and n_features is the number of features.
y : array-like of shape (n_samples,)
Target vector relative to X.
sample_weight : array-like of shape (n_samples,), default=None

Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

New in version 0.17: sample_weight support to LogisticRegression.


Fitted estimator.


The SAGA solver supports both float64 and float32 bit arrays.

Otherwise, if multiclass=False, uses the parameters for LinearRegression:

Fit linear model.


X : {array-like, sparse matrix} of shape (n_samples, n_features)
Training data.
y : array-like of shape (n_samples,) or (n_samples, n_targets)
Target values. Will be cast to X’s dtype if necessary.
sample_weight : array-like of shape (n_samples,), default=None

Individual weights for each sample

New in version 0.17: parameter sample_weight support to LinearRegression.


self : returns an instance of self.
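
To make the effect of sample_weight concrete, here is a small sketch of weighted least squares for a single feature, using only the standard library. The function name weighted_linear_fit and the toy data are hypothetical; the sketch minimizes the weighted sum of squared residuals, which is the usual meaning of per-sample weights in linear regression.

```python
def weighted_linear_fit(xs, ys, sample_weight=None):
    """Fit y = slope * x + intercept by weighted least squares (one feature)."""
    if sample_weight is None:
        # Unit weight for every sample when no weights are given.
        sample_weight = [1.0] * len(xs)
    w_sum = sum(sample_weight)
    # Weighted means of x and y.
    x_bar = sum(w * x for w, x in zip(sample_weight, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(sample_weight, ys)) / w_sum
    # Weighted covariance and variance terms.
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(sample_weight, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(sample_weight, xs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 20.0]  # the last point is an outlier off y = 2x + 1

unweighted = weighted_linear_fit(xs, ys)
# A zero weight removes the outlier's influence entirely.
downweighted = weighted_linear_fit(xs, ys, sample_weight=[1.0, 1.0, 1.0, 0.0])
```

With the outlier given zero weight, the fit recovers the line through the first three points; with unit weights, the outlier pulls the slope upward.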


Retrieve the underlying model.

Returns:The linear model, either classifier or regressor.
Return type:Union[LogisticRegression, LinearRegression]
predict(dataset, **kwargs)

Call linear predict to predict labels using the explainable model.

Parameters:dataset (numpy or scipy array) – The dataset to predict on.
Returns:The predictions of the model.

If multiclass=True, uses the parameters for LogisticRegression:

Predict class labels for samples in X.


X : array-like or sparse matrix, shape (n_samples, n_features)


C : array, shape [n_samples]
Predicted class label per sample.

Otherwise, if multiclass=False, uses the parameters for LinearRegression:

Predict using the linear model.


X : array-like or sparse matrix, shape (n_samples, n_features)


C : array, shape (n_samples,)
Returns predicted values.
predict_proba(dataset, **kwargs)

Call linear predict_proba to predict probabilities using the explainable model.

Parameters:dataset (numpy or scipy array) – The dataset to predict probabilities on.
Returns:The predictions of the model.

If multiclass=True, uses the parameters for LogisticRegression:

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

For a multiclass problem, if multi_class is set to “multinomial”, the softmax function is used to find the predicted probability of each class. Otherwise, a one-vs-rest approach is used, i.e. the probability of each class is calculated assuming it to be positive using the logistic function, and these values are then normalized across all the classes.


X : array-like of shape (n_samples, n_features)
Vector to be scored, where n_samples is the number of samples and n_features is the number of features.


T : array-like of shape (n_samples, n_classes)
Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
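
The two ways of turning per-class decision scores into probabilities described above (softmax for the multinomial case, and logistic scores normalized across classes for one-vs-rest) can be sketched as follows. The function names and the example scores are illustrative assumptions, not part of either library.

```python
import math


def softmax_proba(scores):
    """Multinomial case: softmax over the per-class decision scores."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ovr_proba(scores):
    """One-vs-rest case: logistic function per class, then normalize."""
    sig = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(sig)
    return [p / total for p in sig]


scores = [2.0, 0.0, -1.0]  # hypothetical decision scores for three classes
p_softmax = softmax_proba(scores)
p_ovr = ovr_proba(scores)
```

Both variants return one probability per class that sums to 1 and preserves the ranking of the scores, but the values generally differ between the two.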

Otherwise predict_proba is not supported for regression or binary classification.

class interpret_community.mimic.models.linear_model.LinearExplainer(model, data, feature_dependence='interventional')

Bases: sphinx.ext.autodoc.importer._MockObject

Linear explainer with support for sparse data and sparse output.


Estimate the SHAP values for a set of samples.

Parameters:evaluation_examples (numpy or scipy array) – The evaluation examples.
Returns:For models with a single output this returns a matrix of SHAP values (# samples x # features). Each row sums to the difference between the model output for that sample and the expected value of the model output (which is stored as expected_value attribute of the explainer).
Return type:Union[list, numpy.ndarray]
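
The row-sum property described above can be checked with a small sketch. It assumes a single-output linear model and independent ("interventional") features, in which case the SHAP value of feature i reduces to coef_i * (x_i - mean_i). The function name linear_shap_values and the numbers are hypothetical and only illustrate the idea.

```python
def linear_shap_values(coef, intercept, background_mean, x):
    """SHAP values of a linear model, assuming independent features.

    Returns (shap_values, expected_value) for one sample x.
    """
    # Expected model output: the model evaluated at the background mean.
    expected_value = intercept + sum(
        c * m for c, m in zip(coef, background_mean)
    )
    # Each feature's contribution relative to its background mean.
    shap_values = [
        c * (xi - m) for c, xi, m in zip(coef, x, background_mean)
    ]
    return shap_values, expected_value


coef = [2.0, -1.0, 0.5]
intercept = 1.0
background_mean = [1.0, 2.0, 4.0]
x = [3.0, 0.0, 4.0]

shap_values, expected_value = linear_shap_values(
    coef, intercept, background_mean, x
)
model_output = intercept + sum(c * xi for c, xi in zip(coef, x))
# The SHAP values sum to the gap between the output and the expected value.
gap = model_output - expected_value
```

A feature that sits exactly at its background mean (the third one here) receives a SHAP value of zero.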
class interpret_community.mimic.models.linear_model.SGDExplainableModel(multiclass=False, random_state=123, classification=True, **kwargs)

Bases: interpret_community.mimic.models.explainable_model.BaseExplainableModel

available_explanations = ['global', 'local']

Use LinearExplainer to get the expected values.

Returns:The expected values of the linear model.
Return type:list

Call coef to get the global feature importances from the SGD surrogate model.

Returns:The global explanation of feature importances.
Return type:list
explain_local(evaluation_examples, **kwargs)

Use LinearExplainer to get the local feature importances from the trained explainable model.

Parameters:evaluation_examples (numpy or scipy array) – The evaluation examples to compute local feature importances for.
Returns:The local explanation of feature importances.
Return type:Union[list, numpy.ndarray]
explainer_type = 'model'

Stochastic Gradient Descent explainable model.

  • multiclass (bool) – Set to true to generate a multiclass model.
  • random_state (int) – Int to seed the model.
fit(dataset, labels, **kwargs)

Call linear fit to fit the explainable model.

Store the mean and covariance of the background data for local explanation.

Parameters:
  • dataset (numpy or scipy array) – The dataset to train the model on.
  • labels (numpy or scipy array) – The labels to train the model on.

If multiclass=True, uses the parameters for SGDClassifier: Fit linear model with Stochastic Gradient Descent.


X : {array-like, sparse matrix}, shape (n_samples, n_features)
Training data.
y : ndarray of shape (n_samples,)
Target values.
coef_init : ndarray of shape (n_classes, n_features), default=None
The initial coefficients to warm-start the optimization.
intercept_init : ndarray of shape (n_classes,), default=None
The initial intercept to warm-start the optimization.
sample_weight : array-like, shape (n_samples,), default=None
Weights applied to individual samples. If not provided, uniform weights are assumed. These weights will be multiplied with class_weight (passed through the constructor) if class_weight is specified.


self :
Returns an instance of self.

Otherwise, if multiclass=False, uses the parameters for SGDRegressor: Fit linear model with Stochastic Gradient Descent.


X : {array-like, sparse matrix}, shape (n_samples, n_features)
Training data.
y : ndarray of shape (n_samples,)
Target values.
coef_init : ndarray of shape (n_features,), default=None
The initial coefficients to warm-start the optimization.
intercept_init : ndarray of shape (1,), default=None
The initial intercept to warm-start the optimization.
sample_weight : array-like, shape (n_samples,), default=None
Weights applied to individual samples (1. for unweighted).


self : returns an instance of self.


Retrieve the underlying model.

Returns:The SGD model, either classifier or regressor.
Return type:Union[SGDClassifier, SGDRegressor]
predict(dataset, **kwargs)

Call SGD predict to predict labels using the explainable model.

Parameters:dataset (numpy or scipy array) – The dataset to predict on.
Returns:The predictions of the model.

If multiclass=True, uses the parameters for SGDClassifier:

Predict class labels for samples in X.


X : array-like or sparse matrix, shape (n_samples, n_features)


C : array, shape [n_samples]
Predicted class label per sample.

Otherwise, if multiclass=False, uses the parameters for SGDRegressor: Predict using the linear model.


X : {array-like, sparse matrix}, shape (n_samples, n_features)


ndarray of shape (n_samples,)
Predicted target values per element in X.
predict_proba(dataset, **kwargs)

Call SGD predict_proba to predict probabilities using the explainable model.

Parameters:dataset (numpy or scipy array) – The dataset to predict probabilities on.
Returns:The predictions of the model.

If multiclass=True, uses the parameters for SGDClassifier: Probability estimates.

This method is only available for log loss and modified Huber loss.

Multiclass probability estimates are derived from binary (one-vs-rest) estimates by simple normalization, as recommended by Zadrozny and Elkan.

Binary probability estimates for loss="modified_huber" are given by (clip(decision_function(X), -1, 1) + 1) / 2. For other loss functions it is necessary to perform proper probability calibration by wrapping the classifier with CalibratedClassifierCV instead.


X : {array-like, sparse matrix}, shape (n_samples, n_features)
Input data for prediction.


ndarray of shape (n_samples, n_classes)
Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
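
The modified Huber probability rule, which clips the decision value to the interval [-1, 1] and rescales it to [0, 1], and the normalization of one-vs-rest estimates can be sketched in plain Python. The function names are hypothetical, and the uniform fallback for an all-zero row is an assumption added to avoid division by zero.

```python
def modified_huber_proba(decision_value):
    """Binary probabilities from one decision value.

    Implements (clip(d, -1, 1) + 1) / 2 for the positive class.
    """
    clipped = max(-1.0, min(1.0, decision_value))
    p_pos = (clipped + 1.0) / 2.0
    return [1.0 - p_pos, p_pos]


def normalize_ovr(binary_probs):
    """Turn per-class one-vs-rest estimates into a distribution."""
    total = sum(binary_probs)
    if total == 0.0:
        # Assumption: fall back to a uniform distribution when every
        # per-class estimate is zero.
        return [1.0 / len(binary_probs)] * len(binary_probs)
    return [p / total for p in binary_probs]


on_boundary = modified_huber_proba(0.0)
confident_positive = modified_huber_proba(2.5)
multiclass = normalize_ovr([1.0, 1.0, 0.0])
```

Decision values at or beyond 1 or -1 saturate to probabilities of exactly 1 or 0, which is why this rule does not produce calibrated estimates far from the decision boundary.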


Zadrozny and Elkan, “Transforming classifier scores into multiclass probability estimates”, SIGKDD’02,

The justification for the formula in the loss="modified_huber" case is in appendix B in:

Otherwise predict_proba is not supported for regression or binary classification.