# K-Means and Image Processing

Suppose we take a photo of notes written on white paper. Due to the lighting conditions, the paper may appear grey in the image. Our objective is to remove this grey background.

It turns out that we can simply apply the K-means algorithm to the image directly. Most of the pixels are close to white (the paper), and our writing occupies only a small fraction of the image. The image therefore contains two kinds of pixels, "paper" and "writing", and we only need a clustering algorithm to label each pixel as one or the other. If a pixel belongs to "paper", we set it to white.
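To make the idea concrete before using a library, here is a minimal pure-Python sketch of two-cluster K-means on a handful of made-up pixels. The helper `kmeans_pixels`, the sample pixel values, and the choice of initial centers are all assumptions for illustration; scikit-learn's `KMeans` initializes and iterates differently, so treat this as a sketch of the principle rather than of the library's behavior.

```python
import math

def kmeans_pixels(pixels, initial_centers, iterations=10):
    """Tiny K-means sketch (hypothetical helper): returns (centers, labels)."""
    centers = list(initial_centers)
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in pixels
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels

# Made-up sample: four greyish "paper" pixels and two dark "writing" pixels.
pixels = [
    (200, 200, 200), (190, 195, 198), (205, 203, 199),
    (30, 28, 35), (210, 208, 206), (25, 30, 27),
]

# Assumption: seed the two centers with one light and one dark pixel.
centers, labels = kmeans_pixels(pixels, [pixels[0], pixels[3]])

# The cluster whose center is closest to black is treated as the writing.
ink_label = min(range(len(centers)), key=lambda c: math.dist(centers[c], (0, 0, 0)))

# Every pixel outside that cluster is replaced by pure white.
cleaned = [p if l == ink_label else (255, 255, 255) for p, l in zip(pixels, labels)]
print(cleaned)
```

The dark pixels keep their original color, while everything in the "paper" cluster becomes pure white, which is exactly the effect we want on the full image.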

#### Implementation

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    WHITE_COLOR = np.asarray((255, 255, 255))
    BLACK_COLOR = np.asarray((0, 0, 0))

    def calculateDistance(c1, c2):
        # Euclidean distance between two colors
        return np.linalg.norm(np.asarray(c1) - np.asarray(c2))

    def applyKMeans(image_in, k):
        image = np.copy(image_in)
        nrows, ncols, _ = image.shape

        # Cluster the pixels by color: one row per pixel, one column per channel
        X = image.reshape(-1, 3)
        kmeans = KMeans(n_clusters=k).fit(X)
        labels = kmeans.labels_.reshape(nrows, ncols)

        # Find the label for black color
        distanceToBlackColor = [calculateDistance(c, BLACK_COLOR) for c in kmeans.cluster_centers_]
        labelForBlackColor = np.argmin(distanceToBlackColor)

        # Set every pixel outside the black cluster to white
        image[labels != labelForBlackColor] = WHITE_COLOR
        return cv2.blur(image, (2, 2))

    imageFile = '<path to source file>'
    sourceImage = cv2.imread(imageFile)
    if sourceImage is None:
        raise FileNotFoundError('Could not read image: ' + imageFile)

    processedImage = applyKMeans(sourceImage, 2)

    cv2.imshow('image', processedImage)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # Extra waitKey call; on some platforms the window only closes after it
    cv2.waitKey(1)


#### Output
