Python API Reference
KMeans clustering algorithm implemented in C.
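Functional API
- kmeans.kmeans(data, k, max_iterations=100, tolerance=0.0001)
Perform k-means clustering on the given data.
- Parameters:
data – Input data of shape (n_samples, n_features).
k (int) – Number of clusters.
max_iterations (int) – Maximum number of iterations. Defaults to 100.
tolerance (float) – Convergence tolerance. Defaults to 0.0001.
- Returns:
centroids (ndarray of shape (k, n_features)) – The final cluster centroids.
labels (ndarray of shape (n_samples,)) – Index of the cluster each sample belongs to.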
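The contract documented for kmeans.kmeans (data and k in, a (centroids, labels) pair out) can be illustrated with a pure-NumPy stand-in. This is a minimal sketch, not the C implementation: the name kmeans_sketch, the seed argument, the random-sample initialisation, the Euclidean distance, and the way tolerance is compared against the centroid shift are all assumptions made for illustration.

```python
import numpy as np


def kmeans_sketch(data, k, max_iterations=100, tolerance=0.0001, seed=0):
    """Pure-NumPy stand-in mirroring the documented kmeans() signature.

    `seed` is hypothetical and not part of the documented API.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialisation: k distinct samples picked at random.
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    labels = np.zeros(len(data), dtype=int)

    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid (Euclidean, assumed).
        distances = np.linalg.norm(
            data[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = distances.argmin(axis=1)

        # Update step: each centroid moves to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave an empty cluster where it is
                new_centroids[j] = members.mean(axis=0)

        # Assumed stopping rule: total centroid movement within tolerance.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift <= tolerance:
            break

    return centroids, labels


points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids, labels = kmeans_sketch(points, k=2)
print(centroids.shape)  # (2, 2), i.e. (k, n_features)
print(labels.shape)     # (4,), i.e. (n_samples,)
```

With the real package, the equivalent call would presumably be kmeans.kmeans(points, 2), returning arrays of the same shapes.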
Object-Oriented API
KMeans Class
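- class kmeans.KMeans(n_clusters, max_iter=100, tol=0.0001)
Bases: object
K-Means clustering.
- Parameters:
n_clusters (int) – Number of clusters.
max_iter (int) – Maximum number of iterations. Defaults to 100.
tol (float) – Convergence tolerance. Defaults to 0.0001.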
- fit(X)
Compute k-means clustering.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
- Returns:
self – Fitted estimator.
- Return type:
KMeans
- predict(X)
Predict the closest cluster for each sample.
- Parameters:
X (array-like of shape (n_samples, n_features)) – New data to predict.
- Returns:
labels – Index of the cluster each sample belongs to.
- Return type:
ndarray of shape (n_samples,)
- fit_predict(X)
Compute clustering and return cluster labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
- Returns:
labels – Index of the cluster each sample belongs to.
- Return type:
ndarray of shape (n_samples,)
Attributes
After calling fit(), the following attributes are available:
- kmeans.centroids_
- Type:
numpy.ndarray of shape (n_clusters, n_features)
Coordinates of cluster centers.
- kmeans.labels_
- Type:
numpy.ndarray of shape (n_samples,)
Labels of each point indicating cluster assignment.
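How fit(), predict(), fit_predict() and the fitted attributes relate to one another can be sketched with a small NumPy class. KMeansSketch is a hypothetical stand-in written for this illustration, not the kmeans.KMeans class itself; its initialisation from the first n_clusters samples, its Euclidean distance, and its stopping rule are assumptions.

```python
import numpy as np


class KMeansSketch:
    """Hypothetical stand-in that mimics the documented KMeans interface."""

    def __init__(self, n_clusters, max_iter=100, tol=0.0001):
        self.n_clusters = n_clusters
        self.max_iter = max_iter
        self.tol = tol

    def _assign(self, X):
        # Index of the nearest centroid for every row of X (Euclidean, assumed).
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return d.argmin(axis=1)

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Assumed initialisation: the first n_clusters samples.
        self.centroids_ = X[: self.n_clusters].copy()
        for _ in range(self.max_iter):
            labels = self._assign(X)
            new = self.centroids_.copy()
            for j in range(self.n_clusters):
                if np.any(labels == j):
                    new[j] = X[labels == j].mean(axis=0)
            shift = np.linalg.norm(new - self.centroids_)
            self.centroids_ = new
            if shift <= self.tol:
                break
        # Attributes with a trailing underscore exist only after fit().
        self.labels_ = self._assign(X)
        return self  # returning self allows chaining, e.g. fit(X).labels_

    def predict(self, X):
        return self._assign(np.asarray(X, dtype=float))

    def fit_predict(self, X):
        return self.fit(X).labels_


X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
model = KMeansSketch(n_clusters=2).fit(X)
print(model.centroids_.shape)  # (2, 2), i.e. (n_clusters, n_features)
print(model.labels_)           # one label per training sample
print(model.predict([[9.0, 9.0]]))
```

In this sketch fit_predict(X) is simply fit(X) followed by reading labels_, which matches the documented return value of both methods.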
C Extension Module
Note
The _kmeans module is a low-level C extension. Most users should use
the high-level Python API instead.
- kmeans._kmeans.fit(data, k, max_iterations, tolerance)
Low-level k-means fitting function.
- Parameters:
data (numpy.ndarray) – Input data array of shape (n_samples, n_features).
k (int) – Number of clusters.
max_iterations (int) – Maximum number of iterations.
tolerance (float) – Convergence tolerance.
- Returns:
Tuple of (centroids, labels).
- Return type:
tuple
- kmeans._kmeans.predict(data, centroids)
Predict cluster labels for data points.
- Parameters:
data (numpy.ndarray) – Input data array of shape (n_samples, n_features).
centroids (numpy.ndarray) – Cluster centroids of shape (k, n_features).
- Returns:
Cluster labels, one per row of data.
- Return type:
numpy.ndarray
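The label assignment that a predict(data, centroids) call performs can be sketched in NumPy as a nearest-centroid lookup. The function name assign_labels is hypothetical, and the use of squared Euclidean distance is an assumption; the reference above only says "closest cluster" without naming a metric.

```python
import numpy as np


def assign_labels(data, centroids):
    """Return the index of the nearest centroid for each row of data."""
    data = np.asarray(data, dtype=float)            # (n_samples, n_features)
    centroids = np.asarray(centroids, dtype=float)  # (k, n_features)
    # Broadcasting yields every sample-to-centroid difference:
    # (n_samples, 1, n_features) - (1, k, n_features) -> (n_samples, k, n_features)
    diff = data[:, None, :] - centroids[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)               # (n_samples, k)
    return sq_dist.argmin(axis=1)                   # (n_samples,)


data = np.array([[0.0, 0.0], [4.0, 4.0], [9.0, 9.0]])
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
print(assign_labels(data, centroids))  # [0 0 1]
```

Squared distances are enough here because taking the square root does not change which centroid is nearest.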
Examples
See Examples for more usage examples.