OpenCV - Image Operations


In this blog, I want to share what I learned this week at the beginning of my machine learning, computer vision, and generative adversarial network (GANs) journey. I started by taking the Udemy course "Modern Computer Vision™ PyTorch, Tensorflow2 Keras & OpenCV4".

For now, the course focuses mainly on the OpenCV library, which is so far the most powerful and customizable library for image operations I have come across.

Color

One thing I learned from this course is that an image is stored as an array. Apart from grayscale images, an image consists of RGB values: R stands for red, G for green, and B for blue. The array has a width, a height, and a depth (the RGB channels). Interestingly, I also found that when you read an image using OpenCV, it is loaded in BGR order instead of RGB by default, which makes the image look strange to us since the B and R channels are swapped. Therefore, to show the image correctly, I first import the libraries and then convert the colors, as in the following example:

import cv2
from matplotlib import pyplot as plt

def imshow(title="", image=None, size=10):
    # shape returns (height, width, depth), so unpack in that order
    h, w = image.shape[:2]
    # keep the image at the same aspect ratio when resizing the figure
    aspect_ratio = w / h
    plt.figure(figsize=(aspect_ratio * size, size))

    # OpenCV loads BGR but matplotlib expects RGB, so convert first
    # (use cv2.COLOR_BGR2GRAY here instead to show a grayscale image)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title(title)
    plt.show()

Unlike RGB images, which have 3 channels, a grayscale image has only 1, so grayscale images take up less space. An image can also be separated into its R, G, and B layers by doing the following:

# cv2.split returns the channels in B, G, R order
B, G, R = cv2.split(image)
zeros = np.zeros(image.shape[:2], np.uint8)

imshow("Blue", cv2.merge([B, zeros, zeros]))
imshow("Green", cv2.merge([zeros, G, zeros]))
imshow("Red", cv2.merge([zeros, zeros, R]))

# You can also merge them back to get the original image
merged = cv2.merge([B, G, R])
imshow("Original", merged)

HSV Image

There is also another way to represent color: HSV, which stands for Hue, Saturation, and Value.

Hue: basically which color it is (its position on the color wheel).

Saturation: how rich the color is. The smaller the value, the paler the color; the higher the value, the richer the color.

Value: basically the brightness of the color. The smaller the number, the darker it is.

However, the plotting function in matplotlib is designed for RGB, not HSV. We can also look at Hue, Saturation, and Value separately by doing the following:

hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

imshow("Hue", hsv_image[:, :, 0])
imshow("Saturation", hsv_image[:, :, 1])
imshow("Value", hsv_image[:, :, 2])

Arithmetic Operations

These are simple operations that let us directly add to or subtract from the color intensity.

# Do not add or subtract pixel values directly with + or -;
# use cv2.add / cv2.subtract instead, because plain +/- can go
# outside the 0-255 range and wrap around
M = np.ones(image.shape, dtype="uint8") * 75  # example amount to add
added = cv2.add(image, M)
imshow("Increasing Brightness", added)

Drawing

We can also draw shapes directly on an image:


import numpy as np

# A filled white square on a black 300x300 canvas
square = np.zeros((300, 300), np.uint8)
cv2.rectangle(square, (50, 50), (250, 250), 255, -1)  # negative thickness fills
imshow("square", square)

# A filled half-ellipse (0 to 180 degrees), rotated by 30 degrees
ellipse = np.zeros((300, 300), np.uint8)
cv2.ellipse(ellipse, (150, 150), (150, 150), 30, 0, 180, 255, -1)
imshow("ellipse", ellipse)

Bitwise Operation and Masking

And = cv2.bitwise_and(square, ellipse)
imshow('AND', And)

bitwiseOr = cv2.bitwise_or(square, ellipse)
imshow("Or", bitwiseOr)

bitwiseXor = cv2.bitwise_xor(square, ellipse)
imshow("Xor", bitwiseXor)

bitwiseNot_sq = cv2.bitwise_not(square)
imshow("Not Square", bitwiseNot_sq)

Convolution, Blurring and Sharpening Images

Some commonly used blurring methods in OpenCV:

  • Regular Blurring

  • Gaussian Blurring

  • Median Blurring

blur = cv2.blur(image, (5,5))
imshow('Averaging', blur)

Gaussian = cv2.GaussianBlur(image, (5,5), 0)
imshow('Gaussian Blurring', Gaussian)

median = cv2.medianBlur(image, 5)
imshow('Median Blurring', median)

Image De-noising

There are 4 variations of Non-Local Means Denoising:

  • cv2.fastNlMeansDenoising() - Works with single grayscale images

  • cv2.fastNlMeansDenoisingColored() - Works with a color image

  • cv2.fastNlMeansDenoisingMulti() - Works with an image sequence captured in a short period of time (grayscale images)

  • cv2.fastNlMeansDenoisingColoredMulti() - Same as above, but for color images.

fastNlMeansDenoisingColored(InputArray src, OutputArray dst, float h=3, float hColor=3, int templateWindowSize=7, int searchWindowSize=21)

image = cv2.imread('images/hilton.jpeg')
imshow('Original', image)

dst = cv2.fastNlMeansDenoisingColored(image, None, 6, 6, 7, 21)
imshow('fastNlMeansDenoisingColored', dst)

Sharpening

image = cv2.imread('images/hilton.jpeg')
imshow('Original', image)

kernel_sharpening = np.array([[-1, -1, -1],
                              [-1,  9, -1],
                              [-1, -1, -1]])

sharpened = cv2.filter2D(image, -1, kernel_sharpening)
imshow('Sharpened Image', sharpened)

Conclusion

That is what I learned this week. However, there are still some questions I can't figure out. For example, when we do denoising:

What exactly is the machine trying to do or delete to make the image look cleaner?

I'm also not sure why we need those arrays of numbers for sharpening and blurring. How do those work?