📚 PracticeMediumML CodingCoding Ready
Implement K-Means Clustering
ML Coding: ml-algorithms | 35-45 minutes
clusteringunsupervisedclassic-mlnumpyoptimization
Updated Dec 20, 2025
Question
Problem
Implement the K-Means clustering algorithm from scratch. Given a dataset of points and a number k, partition the points into k clusters by iteratively:
- Assigning each point to the nearest centroid
- Updating centroids to be the mean of assigned points
- Repeating until convergence (centroids don't change significantly)
Your implementation should handle 2D points but be extensible to higher dimensions.
Constraints
- Number of points n: 10 ≤ n ≤ 1000
- Number of clusters k: 2 ≤ k ≤ 10
- Points are in 2D space (can extend to n-dimensional)
- Use Euclidean distance
- Maximum 100 iterations or convergence threshold 1e-4
Examples
Example 1
Input:
points = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]
k = 2
Output:
# After convergence:
centroids = [[1.0, 2.0], [10.0, 2.0]]
labels = [0, 0, 0, 1, 1, 1]
# Points [1,2], [1,4], [1,0] in cluster 0
# Points [10,2], [10,4], [10,0] in cluster 1
Explanation: Two clear clusters: points around x=1 and points around x=10
Example 2
Input:
points = [[0, 0], [1, 1], [5, 5], [6, 6]]
k = 2
Output:
centroids = [[0.5, 0.5], [5.5, 5.5]]
labels = [0, 0, 1, 1]
Function Signature
def kmeans(points: List[List[float]], k: int, max_iters: int = 100) -> Tuple[List[List[float]], List[int]]:
"""
Implement K-Means clustering.
Args:
points: List of points, each point is [x, y, ...]
k: Number of clusters
max_iters: Maximum iterations
Returns:
centroids: List of k centroid coordinates
labels: Cluster assignment for each point (0 to k-1)
"""
pass
Estimated Time
35-45 minutes
Tags
clustering unsupervised classic-ml numpy optimization
Your Solution
python
Auto-saves every 30s
Try solving the problem first before viewing the solution
Learning Resources
0:00time spent