Last Updated on 13/10/2025 by Eran Feit
Introduction
Image segmentation is the process of partitioning an image into meaningful regions (segments) — for example separating foreground from background, isolating objects, or splitting an image into zones of similar color or texture. One classic and relatively simple technique for segmentation is K-means clustering applied in color (or feature) space.
In this post, we’ll explore:
- What is K-means clustering?
- How K-means applies to image segmentation
- Step-by-step Python + OpenCV code for segmentation with K-means
- Tips, pitfalls, and practical extensions
- A Q&A section to clarify key points
Whether you’re a beginner in computer vision or looking to deepen your understanding, this will guide you from concept through implementation.
What Is K-Means Clustering?

K-means is an unsupervised machine learning algorithm widely used for clustering data points. It works as follows:
- Choose K initial cluster centers (called “centroids”) in the data space.
- Assign each data point to the nearest centroid (based on some distance metric, usually Euclidean).
- Recompute each centroid as the mean of its assigned points.
- Repeat the assignment and update steps until convergence (assignments stop changing, or the centroids barely move).
Mathematically, it minimizes the sum of squared distances from each point to its assigned cluster center.
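To make the loop concrete, here is a minimal NumPy sketch of those steps on toy 2D data. It is not the OpenCV implementation: the function name kmeans_sketch and its init_centroids parameter are hypothetical, and the initial centroids are passed in by hand (one point from each blob) instead of being chosen randomly.

```python
import numpy as np

def kmeans_sketch(points, init_centroids, n_iters=20):
    """Plain K-means iterations: assign to nearest centroid, then recompute means."""
    # Step 1: start from the initial centroids supplied by the caller.
    centroids = np.asarray(init_centroids, dtype=np.float64).copy()
    k = len(centroids)
    for _ in range(n_iters):
        # Step 2: squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Objective value: sum of squared distances to the assigned centroid.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    inertia = dists[np.arange(len(points)), labels].sum()
    return centroids, labels, inertia

# Two well-separated toy blobs in 2D.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=(10.0, 10.0), scale=0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])

# Seed the centroids with the first and last data points (one from each blob).
centroids, labels, inertia = kmeans_sketch(data, data[[0, -1]])
print(centroids)
print(inertia)
```

With this hand-picked start, the two centroids should settle on the two blob means. A poor start can leave K-means in a worse local minimum, which is why initialization comes up again later in this post.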
Key points:
- Unsupervised: requires no labeled data.
- Hyperparameter K: you must decide the number of clusters.
- Works best when clusters are roughly spherical in the feature space.
Because images can be represented by pixel features (e.g. color vectors), K-means can cluster pixels into groups of similar color or appearance, making it a natural tool for segmentation.
How K-Means Relates to Image Segmentation
In image segmentation with K-means:
- Feature Representation
You represent each pixel by features such as its RGB (or BGR) values, or you can expand to include spatial coordinates, texture features, or other modalities.
- Clustering Pixels
You cluster the pixel feature vectors into K groups. Each cluster corresponds to a segment: pixels in the same cluster are assumed to be “similar.”
- Reconstructing the Segmented Image
You map each pixel to some representative color (e.g. the cluster centroid) or mask out clusters you don’t want.
Advantages:
- Simple, fast (for moderate image sizes).
- No need for ground truth labels.
- Good for color-based segmentation when objects differ primarily by color.
Limitations:
- Choosing K can be tricky.
- Sensitive to initialization (you might get poor local minima).
- Doesn’t incorporate higher-level semantics (it won’t “understand” what an object is).
Here is the video tutorial for Image Segmentation Made Easy with OpenCV and K-means Algorithm: https://www.youtube.com/watch?v=a2Kti9UGtrU
You can find more similar tutorials in my blog posts page here : https://eranfeit.net/blog/
K-Means Segmentation in OpenCV with Python
This section walks you through loading an image, reshaping pixels, configuring K-Means criteria, clustering, and reconstructing the segmented output.
The code block contains short explanations directly above each Python command for copy-paste convenience.
### Import OpenCV for image processing and NumPy for array manipulation.
import cv2
import numpy as np

### Read the input image from disk using OpenCV's imread function.
### The image path should point to a valid file on your system.
img = cv2.imread("Open-CV/Image-Segmentation-Using-K-means/beach-and-boats.jpeg")

### cv2.imread returns None (it does not raise) when the file cannot be read, so fail early.
if img is None:
    raise FileNotFoundError("Could not read the input image. Check the path.")

### Print the original image shape to understand height, width, and channel count.
print(img.shape)

### Reshape the image to a 2D array where each row is a pixel and columns are color channels.
### This converts (H, W, 3) into (H*W, 3) for K-Means input.
twoDim = img.reshape((-1, 3))
print(twoDim.shape)

### Convert the reshaped pixel array to 32-bit float as required by cv2.kmeans.
twoDim = np.float32(twoDim)

### Define the termination criteria for K-Means: either a small epsilon change or a max number of iterations.
### TERM_CRITERIA_EPS stops when the specified accuracy is reached.
### TERM_CRITERIA_MAX_ITER stops when the iteration limit is reached.
### Here: at most 10 iterations, epsilon of 1.0.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)

### Choose the number of clusters K to control how many color segments you want.
### Higher K yields more detailed segmentation, lower K yields simpler regions.
K = 3

### Set the number of attempts to run K-Means with different initializations.
### The best solution (lowest compactness) across attempts is returned.
attempts = 10

### Run OpenCV's K-Means on the pixel data to obtain cluster centers and labels for each pixel.
### KMEANS_PP_CENTERS uses K-Means++ initialization for better centroid seeds.
ret, label, center = cv2.kmeans(twoDim, K, None, criteria, attempts, cv2.KMEANS_PP_CENTERS)

### Convert the cluster centers back to 8-bit integers to map them to an image.
center = np.uint8(center)

### Replace each pixel by the centroid color of its assigned cluster for a segmented look.
res = center[label.flatten()]

### Reshape the flat result back to the original image dimensions.
result_image = res.reshape(img.shape)

### Display the segmented result in a window titled "result_image".
cv2.imshow("result_image", result_image)

### Save the segmented image to disk for later use or sharing.
cv2.imwrite("c:/temp/result3-100.jpg", result_image)

### Optionally display the original image for visual comparison.
cv2.imshow("img", img)

### Wait for a key press so the windows remain visible until you close them.
cv2.waitKey(0)

### Clean up any OpenCV windows that were opened during display.
cv2.destroyAllWindows()

You can find the full code here : https://ko-fi.com/s/c7e8e45f56
Explanation of Key Parts
- We flatten the image array of shape (H, W, 3) into a 2D array of shape (num_pixels × 3) so we can feed it to cv2.kmeans.
- We use cv2.KMEANS_PP_CENTERS for better initialization (kmeans++).
- label gives the cluster index for each pixel; center gives the colors of the clusters.
- We rebuild the segmented image by replacing each pixel with its centroid color.
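The reshape and the centroid lookup are easier to follow on a tiny array. The sketch below uses NumPy only; the label and center arrays are hand-written stand-ins with the shapes cv2.kmeans returns (N × 1 labels, K × 3 centers), not real clustering output.

```python
import numpy as np

# A tiny 2x3 "image" with 3 channels (BGR order, as OpenCV uses).
img = np.array([[[255, 0, 0], [250, 5, 5], [0, 0, 255]],
                [[0, 0, 250], [255, 0, 0], [5, 5, 255]]], dtype=np.uint8)

# Flatten (H, W, 3) into (H*W, 3): one row per pixel, in row-major order.
pixels = img.reshape((-1, 3))
print(pixels.shape)  # (6, 3)

# Hand-written stand-ins for what cv2.kmeans would return with K=2.
label = np.array([[0], [0], [1], [1], [0], [1]], dtype=np.int32)
center = np.array([[253, 1, 1], [1, 1, 253]], dtype=np.uint8)

# Indexing center with the flat label array picks one centroid row per pixel.
res = center[label.flatten()]
segmented = res.reshape(img.shape)
print(segmented[0, 1].tolist())  # [253, 1, 1]
```

Each pixel ends up with the color of its cluster's centroid, which is what produces the flat, poster-like look of the segmented image.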
You can extend this to:
- Use it as a preprocessing step before further processing (such as object recognition).
- Use different feature spaces (e.g. RGB + x, y coordinates).
- Mask out specific clusters (e.g. “pick the cluster corresponding to object color”).
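As a rough sketch of the masking idea, the snippet below keeps only the cluster whose centroid is closest to a target color. The image, labels, and centers are synthetic stand-ins shaped like cv2.kmeans output, and the target color (pure red in BGR) is an assumption for illustration.

```python
import numpy as np

# Stand-in data: a 4x4 BGR image whose left half is blue and right half is red,
# plus labels and centers shaped like the outputs of cv2.kmeans with K=2.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :2] = (255, 0, 0)   # blue in BGR
img[:, 2:] = (0, 0, 255)   # red in BGR
label = np.zeros((4, 4), dtype=np.int32)
label[:, 2:] = 1
label = label.reshape((-1, 1))
center = np.array([[255, 0, 0], [0, 0, 255]], dtype=np.float32)

# Pick the cluster whose centroid is closest to the target color (assumed: pure red).
target_bgr = np.array([0, 0, 255], dtype=np.float32)
chosen = int(np.argmin(((center - target_bgr) ** 2).sum(axis=1)))

# Boolean mask of the pixels in the chosen cluster, reshaped to (H, W).
mask = (label.flatten() == chosen).reshape(img.shape[:2])

# Keep the chosen cluster and paint everything else black.
masked = np.zeros_like(img)
masked[mask] = img[mask]
print(chosen, int(mask.sum()))  # 1 8
```

On a real photo you would take label and center from the cv2.kmeans call and choose the target color by inspecting the printed centers.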
Advanced Extensions & Tips
Here are ways to make K-means segmentation more powerful or robust:
- Add spatial information
Append the (x, y) coordinates (normalized) to each pixel’s feature vector. This encourages spatial coherence (neighboring pixels are more likely to join the same cluster).
- Use other features
Rather than RGB, you can use HSV, Lab color space, or texture descriptors (e.g. local binary patterns) to better separate based on what matters.
- Post-processing
Use morphological operations (dilation, erosion) to clean up small noisy regions.
- Cluster hierarchically / multiple scales
Use coarse K-means clustering followed by refinement (e.g. apply a smaller K inside a cluster of interest).
- Initialization and stability
Run multiple attempts and pick the one with the lowest compactness. Use kmeans++ initialization to reduce sensitivity to starting points.
- Automatic K selection
Use the “elbow method” or silhouette score to guide the selection of K.
- Combine with deep learning
Use K-means to cluster embeddings or feature maps extracted by a neural network (e.g. for segmentation tasks in a weakly supervised setting).
FAQ
What is K-means in simple terms?
K-means is an unsupervised clustering algorithm that partitions data into K groups by assigning points to the nearest centroid and updating those centroids iteratively.
How does K-means apply to image segmentation?
Each pixel is a data point (e.g., color values). K-means groups similar pixels into clusters, and each cluster can be treated as a segment.
How do I choose K (number of clusters)?
Use domain knowledge or heuristics like the elbow method or silhouette score to balance detail with stability.
What is compactness in OpenCV’s kmeans?
It’s the sum of squared distances from points to their assigned centroids — lower usually means tighter, better clusters.
Why use kmeans++ initialization?
It picks smarter starting centroids, improving convergence and reducing the chance of poor local minima.
Common pitfalls with K-means segmentation?
Choosing a bad K, ignoring spatial continuity (noisy, speckled segments), and relying on color when objects overlap heavily in color space.
How do I add spatial smoothness?
Append normalized (x, y) to pixel features or post-process with morphology/CRFs to enforce neighborhood coherence.
When is K-means not suitable?
When semantics matter (e.g., “car vs. road”) or when object/background colors are very similar.
Can I use features beyond color?
Yes — try HSV/Lab, texture descriptors, gradients, depth, or neural embeddings for better separability.
How to improve stability and quality?
Run multiple attempts, use kmeans++, try different K values, and clean results with morphological ops.
Wrapping Up — Why K-Means Still Matters in Image Segmentation
K-means remains one of the simplest yet most effective techniques for color-based image segmentation.
It’s easy to implement, computationally efficient, and a great starting point for anyone learning about computer vision and clustering.
Even though modern segmentation methods now rely on deep learning architectures like U-Net, Mask R-CNN, or SAM2, K-means continues to serve as a valuable baseline for quick prototyping and data exploration.
Its power lies in its intuitive logic — grouping similar pixels together — which often provides surprisingly strong results in scenes with clear color differences or uniform textures.
By mastering K-means segmentation, you not only gain a hands-on understanding of how clustering works in practice, but you also build intuition that applies to more advanced models.
It’s a gateway to understanding how machines “see” structure in visual data, and it can be extended with creativity — whether through better feature engineering, hybrid approaches with deep learning, or post-processing filters for refinement.
So next time you need a fast, explainable, and reliable segmentation method — or simply want to visualize the inner structure of an image — don’t overlook K-means.
It’s a timeless algorithm that continues to bridge the gap between mathematical simplicity and visual understanding.
Enjoy,
Eran
