interpret_community.common.gpu_kmeans module

The code is based on the similar utility function from SHAP:
https://github.com/slundberg/shap/blob/9411b68e8057a6c6f3621765b89b24d82bee13d4/shap/utils/_legacy.py

This version uses cuml's KMeans instead of sklearn's for speed.

class interpret_community.common.gpu_kmeans.Data

Bases: object

class interpret_community.common.gpu_kmeans.DenseData(data, group_names, *args)

Bases: interpret_community.common.gpu_kmeans.Data

interpret_community.common.gpu_kmeans.kmeans(X, k, round_values=True)

Summarize a dataset with k mean samples weighted by the number of data points they each represent.

Parameters

X (numpy.array or pandas.DataFrame or any scipy.sparse matrix): Matrix of data samples to summarize (# samples x # features).
k (int): Number of means to use for approximation.
round_values (bool): For all i, round the ith dimension of each mean sample to match the nearest value from X[:,i]. This ensures discrete features always get a valid value.

Returns

DenseData object.
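The two post-processing steps described above, rounding each mean to observed values and weighting each mean by its cluster size, can be illustrated with a minimal sketch. This is not the module's implementation: it is written in dependency-free Python so it can be read and run without a GPU, the helper names `round_to_observed` and `cluster_weights` are hypothetical, and the cluster centers and labels are made-up stand-ins for what a k-means fit would produce. Whether the library normalizes the counts afterwards is not stated here, so the sketch returns raw counts.

```python
def round_to_observed(centers, X):
    """Snap every coordinate of every center to the nearest value
    that actually occurs in the same column of X (hypothetical helper)."""
    columns = list(zip(*X))  # transpose rows into per-feature columns
    return [
        # min() keeps the first candidate on ties
        [min(col, key=lambda v: abs(v - c)) for c, col in zip(center, columns)]
        for center in centers
    ]


def cluster_weights(labels, k):
    """Count how many samples were assigned to each of the k means
    (hypothetical helper; returns raw counts, not normalized)."""
    counts = [0] * k
    for label in labels:
        counts[label] += 1
    return counts


# Toy data: column 0 is a binary feature, column 1 is continuous.
X = [[0, 1.0], [1, 2.0], [0, 3.0], [1, 10.0]]

# Assumed output of a k-means fit with k=2 (values invented for illustration).
centers = [[0.3, 2.0], [0.9, 9.5]]
labels = [0, 0, 0, 1]

rounded = round_to_observed(centers, X)
weights = cluster_weights(labels, k=2)

print(rounded)  # [[0, 2.0], [1, 10.0]]
print(weights)  # [3, 1]
```

Without rounding, the first mean would carry the value 0.3 for the binary feature, which no real sample has. Snapping it to 0 keeps the summary point valid for the discrete feature, which is the purpose of `round_values=True`.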