📚 PracticeMediumML CodingCoding Ready

Implement K-Means Clustering

ML Coding: ml-algorithms | 35-45 minutes

clusteringunsupervisedclassic-mlnumpyoptimization
Updated Dec 20, 2025

Question

Problem

Implement the K-Means clustering algorithm from scratch. Given a dataset of points and a number k, partition the points into k clusters by iteratively:

  1. Assigning each point to the nearest centroid
  2. Updating centroids to be the mean of assigned points
  3. Repeating until convergence (centroids don't change significantly)

Your implementation should handle 2D points but be extensible to higher dimensions.

Constraints

  • Number of points n: 10 ≤ n ≤ 1000
  • Number of clusters k: 2 ≤ k ≤ 10
  • Points are in 2D space (can extend to n-dimensional)
  • Use Euclidean distance
  • Maximum 100 iterations or convergence threshold 1e-4

Examples

Example 1

Input:

points = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]
k = 2

Output:

# After convergence:
centroids = [[1.0, 2.0], [10.0, 2.0]]
labels = [0, 0, 0, 1, 1, 1]
# Points [1,2], [1,4], [1,0] in cluster 0
# Points [10,2], [10,4], [10,0] in cluster 1

Explanation: Two clear clusters: points around x=1 and points around x=10

Example 2

Input:

points = [[0, 0], [1, 1], [5, 5], [6, 6]]
k = 2

Output:

centroids = [[0.5, 0.5], [5.5, 5.5]]
labels = [0, 0, 1, 1]

Function Signature

def kmeans(points: List[List[float]], k: int, max_iters: int = 100) -> Tuple[List[List[float]], List[int]]:
    """
    Implement K-Means clustering.

    Args:
        points: List of points, each point is [x, y, ...]
        k: Number of clusters
        max_iters: Maximum iterations

    Returns:
        centroids: List of k centroid coordinates
        labels: Cluster assignment for each point (0 to k-1)
    """
    pass

Estimated Time

35-45 minutes

Tags

clustering unsupervised classic-ml numpy optimization


Your Solution

python
Auto-saves every 30s

Try solving the problem first before viewing the solution


0:00time spent