interpret_community.common.gpu_kmeans module
This code is based on a similar utility function in SHAP: https://github.com/slundberg/shap/blob/9411b68e8057a6c6f3621765b89b24d82bee13d4/shap/utils/_legacy.py. This version uses cuML's k-means instead of scikit-learn's for speed.
- class interpret_community.common.gpu_kmeans.DenseData(data, group_names, *args)
- interpret_community.common.gpu_kmeans.kmeans(X, k, round_values=True)
Summarize a dataset with k mean samples weighted by the number of data points they each represent.
Parameters
- X : numpy.array or pandas.DataFrame or any scipy.sparse matrix
Matrix of data samples to summarize (# samples x # features).
- k : int
Number of means to use for approximation.
- round_values : bool
For all i, round the ith dimension of each mean sample to match the nearest value from X[:,i]. This ensures discrete features always get a valid value.
Returns
- DenseData object.
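The rounding and weighting described above can be illustrated with a small CPU-only sketch. This is not the module's implementation: the helper `round_to_observed`, the toy data, and the hand-assigned cluster labels are all assumptions made for illustration, and plain NumPy stands in for cuML so the example runs without a GPU.

```python
import numpy as np


def round_to_observed(X, centers):
    """Snap each coordinate of each center to the nearest value seen in that column of X.

    Hypothetical helper illustrating what round_values=True is documented to do.
    """
    rounded = centers.copy()
    for i in range(centers.shape[0]):
        for j in range(X.shape[1]):
            # Index of the sample whose j-th feature is closest to the center's j-th value.
            nearest = np.argmin(np.abs(X[:, j] - centers[i, j]))
            rounded[i, j] = X[nearest, j]
    return rounded


# Toy data: column 0 is continuous, column 1 is a binary (discrete) feature.
X = np.array([
    [0.0, 0.0],
    [0.2, 0.0],
    [0.4, 1.0],
    [5.0, 1.0],
    [6.0, 1.0],
    [6.1, 1.0],
    [6.3, 1.0],
])

# Assumed clustering result for k=2; a real run would obtain labels from k-means.
labels = np.array([0, 0, 0, 1, 1, 1, 1])
centers = np.array([X[labels == c].mean(axis=0) for c in range(2)])

rounded = round_to_observed(X, centers)

# Each mean is weighted by the number of data points it represents.
weights = 1.0 * np.bincount(labels)
```

In this toy case the first cluster's mean for the binary feature is about 0.33, which is not a valid value for that feature; rounding replaces it with 0, the nearest value observed in that column.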