K-means clustering

February 27, 2019

The main purpose of this project was to compress an image using K-means clustering. The compression keeps only a small set of colours that best represent the colours occurring in the image. K-means is a well-known unsupervised machine learning technique for grouping similar data points.

The algorithm was first tested on a 2-dimensional dataset, which is visualised in Figure 1.

K-means clustering was then applied to this dataset, and after 10 iterations every data point was assigned to one of three centroids. The number of centroids was chosen arbitrarily. The calculation of the centroids over the iterations is presented in Figure 2.
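
The two steps repeated in each iteration (assign every point to its closest centroid, then move each centroid to the mean of its points) can be sketched as follows. This is a minimal pure-Python illustration, not the MATLAB code from the project; the function names, the random initialisation from data points, the fixed seed, and the toy dataset are all assumptions made for the example.

```python
import random


def closest_centroid(point, centroids):
    """Return the index of the centroid nearest to point (squared Euclidean distance)."""
    distances = [
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for centroid in centroids
    ]
    return distances.index(min(distances))


def compute_centroids(points, labels, k, old_centroids):
    """Move each centroid to the mean of its points; keep it in place if it has none."""
    new_centroids = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
            )
        else:
            new_centroids.append(old_centroids[j])
    return new_centroids


def kmeans(points, k, iterations=10, seed=0):
    """Run a fixed number of K-means iterations and return (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialisation: k random data points
    for _ in range(iterations):
        labels = [closest_centroid(p, centroids) for p in points]
        centroids = compute_centroids(points, labels, k, centroids)
    # Final assignment against the last centroid positions.
    labels = [closest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical toy dataset: six 2-dimensional points.
data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0), (9.2, 0.1)]
centroids, labels = kmeans(data, k=3, iterations=10)
```

Because the starting centroids are picked at random, different seeds can lead to different final clusters, which is one reason the result depends on both the chosen number of centroids and the initialisation.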

Next, the algorithm was tested as an image compression tool. The colours of the original image were clustered around 16 centroids, resulting in a compressed image with 16 colours. The images are shown in Figure 3.
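
Applied to an image, the same idea treats every pixel as a point in RGB space and replaces it with the colour of its centroid. The sketch below is a self-contained pure-Python illustration under stated assumptions: the image is given as a flat list of `(r, g, b)` tuples, the function names are hypothetical, and the synthetic pixel data stands in for a real image. It is not the MATLAB implementation from the project.

```python
import random


def nearest(colour, palette):
    """Index of the palette entry closest to colour (squared Euclidean distance in RGB)."""
    return min(
        range(len(palette)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(colour, palette[j])),
    )


def compress_colours(pixels, k=16, iterations=10, seed=0):
    """Cluster RGB pixels into at most k colours and recolour each pixel with its centroid."""
    rng = random.Random(seed)
    unique = sorted(set(pixels))
    # Assumed initialisation: start from k distinct colours present in the image.
    palette = rng.sample(unique, min(k, len(unique)))
    for _ in range(iterations):
        labels = [nearest(p, palette) for p in pixels]
        for j in range(len(palette)):
            members = [p for p, label in zip(pixels, labels) if label == j]
            if members:  # an empty cluster keeps its previous colour
                palette[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    labels = [nearest(p, palette) for p in pixels]
    # Round the centroid colours back to integer channel values.
    compressed = [tuple(round(c) for c in palette[label]) for label in labels]
    return palette, compressed


# Synthetic stand-in for an image: 64 pixels with distinct colours.
image = [(i * 4, 255 - i * 4, (i * 7) % 256) for i in range(64)]
palette, compressed = compress_colours(image, k=16)
print(len(set(image)), "colours before,", len(set(compressed)), "colours after")
```

With 16 colours, each pixel can be stored as a 4-bit index into the palette instead of a 24-bit RGB value, which is where the size reduction comes from.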

Figure 3: Image compression by applying K-means clustering

A more detailed description of this project's implementation in MATLAB can be found in this GitHub repository: Link to GitHub repository

Figure 1: 2-dimensional dataset

Figure 2: Process of cluster creation
