interpret_community.common.gpu_kmeans module
The code is based on a similar utility function from SHAP: https://github.com/slundberg/shap/blob/9411b68e8057a6c6f3621765b89b24d82bee13d4/shap/utils/_legacy.py. This version uses cuML's k-means instead of scikit-learn's for speed.
- interpret_community.common.gpu_kmeans.kmeans(X, k, round_values=True)
Summarize a dataset with k mean samples, each weighted by the number of data points it represents.
- Parameters:
X (numpy.ndarray or pandas.DataFrame or any scipy.sparse matrix) – Matrix of data samples to summarize (# samples x # features)
k (int) – Number of means to use for approximation.
round_values (bool) – For all i, round the ith dimension of each mean sample to match the nearest value from X[:,i]. This ensures discrete features always get a valid value.
- Returns:
DenseData object holding the k mean samples and their weights.
- Return type:
DenseData
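The summarization described above can be illustrated with a minimal CPU-only sketch. It is not the module's implementation: it uses a plain-Python Lloyd's k-means in place of cuML, and the function name `summarize_with_kmeans`, its `n_iter` and `seed` arguments, and the toy data are all hypothetical. It returns raw per-cluster counts as weights rather than a DenseData object.

```python
import random
from collections import Counter


def summarize_with_kmeans(rows, k, round_values=True, n_iter=20, seed=0):
    """Hypothetical helper: return k mean samples and per-cluster point counts."""
    rng = random.Random(seed)
    # Start from k distinct rows chosen at random (copied so rows stay untouched).
    centers = [list(r) for r in rng.sample(rows, k)]
    n_features = len(rows[0])

    def nearest(row):
        # Index of the center with the smallest squared Euclidean distance.
        return min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])),
        )

    # Plain Lloyd iterations: assign rows to centers, then recompute the means.
    for _ in range(n_iter):
        labels = [nearest(r) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the previous center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]

    # Final assignment, taken before rounding, determines the weights.
    labels = [nearest(r) for r in rows]

    if round_values:
        # Snap each coordinate of each mean to the nearest value actually
        # observed in that column, so discrete features stay valid.
        for j in range(n_features):
            column = [r[j] for r in rows]
            for center in centers:
                center[j] = min(column, key=lambda v: abs(v - center[j]))

    counts = Counter(labels)
    weights = [counts.get(c, 0) for c in range(k)]
    return centers, weights


# Toy data: two well-separated groups; the second feature is discrete.
rows = [
    [0.0, 1.0], [0.2, 1.0], [0.1, 0.0],
    [5.0, 3.0], [5.2, 3.0], [5.1, 4.0],
]
centers, weights = summarize_with_kmeans(rows, k=2)
```

With round_values=True, every coordinate of a returned mean is a value that occurs in the matching column of the input, which is why a discrete feature such as the second column never ends up with a fractional value.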