Artificial Intelligence / Machine Learning

Image Processing

Meghana

Nov 28, 2023 • 6 min read

Introduction:

Images are everywhere in the digital world, from photographs and films to X-rays and satellite imagery. Raw images, however, often need to be refined and analysed before they yield useful information, look their best, or support informed decisions. This is where image processing comes in. The field covers a wide range of techniques for editing and improving digital images, with applications in industries such as entertainment, surveillance, and medicine.

Image Acquisition:

Acquiring the image data is the first stage of an image pipeline: the image is either loaded from a file or captured with a camera. All subsequent processing steps build on this raw image data. The example below reads frames from the default camera and displays them.

Code:

import cv2

# Create a VideoCapture object
cap = cv2.VideoCapture(0)  # 0 represents the default camera device

# Check if the camera was opened successfully
if not cap.isOpened():
    print("Failed to open the camera")
    raise SystemExit(1)

# Read and display frames from the camera until the user presses 'q'
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    # If the frame was not captured successfully, break the loop
    if not ret:
        print("Failed to capture frame")
        break
    # Display the captured frame
    cv2.imshow('Frame', frame)
    # Wait for the user to press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the VideoCapture object and close the windows
cap.release()
cv2.destroyAllWindows()
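
Whichever source is used, the image arrives as an array of pixel values, and that array is the raw data later steps build on. The sketch below uses only NumPy and a synthetic image to show the layout. It assumes OpenCV's usual convention of 8-bit arrays shaped height x width x 3 in blue-green-red (BGR) channel order. The helpers `bgr_to_rgb` and `bgr_to_gray` are illustrative names, not OpenCV functions, and the grayscale weights are the common 0.299/0.587/0.114 luma weights.

```python
import numpy as np

def bgr_to_rgb(image):
    """Reverse the channel axis: BGR -> RGB (illustrative helper)."""
    return image[..., ::-1]

def bgr_to_gray(image):
    """Weighted sum of the channels using the common luma weights."""
    b, g, r = image[..., 0], image[..., 1], image[..., 2]
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Synthetic 4x6 "frame" that is pure blue in BGR order
frame = np.zeros((4, 6, 3), dtype=np.uint8)
frame[..., 0] = 255

rgb = bgr_to_rgb(frame)
gray = bgr_to_gray(frame)
print(frame.shape, frame.dtype)   # rows, columns, channels and the pixel type
print(rgb[0, 0], gray[0, 0])      # the same pixel in RGB order and as a gray level
```

Blue contributes least to perceived brightness, which is why a pure blue pixel maps to a dark gray level.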

Preprocessing:

Preprocessing prepares the acquired image for further analysis. Common techniques include resizing, cropping, and adjusting brightness and contrast, which standardise the image data and remove unwanted artefacts. This stage ensures consistency and leaves the image ready for later processing steps. The example below converts the image to grayscale, blurs it, thresholds it, and applies a morphological opening.

Code:

import cv2
import numpy as np

def preprocess_image(image_path):
    # Load the image
    image = cv2.imread(image_path)
    # Check if the image was loaded successfully
    if image is None:
        print("Failed to load the image")
        return None
    # Convert the image to grayscale
    grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Apply Gaussian blur to reduce noise
    blurred_image = cv2.GaussianBlur(grayscale_image, (5, 5), 0)
    # Apply thresholding to create a binary image (the first return value is the threshold used)
    _, binary_image = cv2.threshold(blurred_image, 127, 255, cv2.THRESH_BINARY)
    # Apply morphological operations to remove noise and improve the shape of objects
    kernel = np.ones((5, 5), np.uint8)
    opened_image = cv2.morphologyEx(binary_image, cv2.MORPH_OPEN, kernel, iterations=1)
    # Perform any additional preprocessing steps as needed
    return opened_image
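
The brightness and contrast adjustment mentioned above is not part of the function, so here is a minimal NumPy sketch of one common way to do it: a linear scale-and-shift, `alpha * pixel + beta`, clipped to the 8-bit range. The function name `adjust_brightness_contrast` and the parameter names `alpha` and `beta` are assumptions made for this illustration.

```python
import numpy as np

def adjust_brightness_contrast(image, alpha=1.0, beta=0.0):
    """Return alpha * image + beta, clipped to the 8-bit range.

    alpha > 1 increases contrast; beta > 0 increases brightness.
    """
    adjusted = alpha * image.astype(np.float32) + beta
    return np.clip(np.rint(adjusted), 0, 255).astype(np.uint8)

pixels = np.array([[0, 100, 200]], dtype=np.uint8)
result = adjust_brightness_contrast(pixels, alpha=1.5, beta=10)
print(result)
```

The computation is done in floating point and clipped afterwards because 8-bit arithmetic would otherwise wrap around for values above 255.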

Enhancement and Filtering:

Enhancement and filtering techniques improve the image's quality or draw attention to particular features. They include blurring, sharpening, and contrast enhancement. An enhanced image makes it easier to extract useful information in later stages.

Code:

import cv2
import numpy as np

def enhance_image(image_path):
    # Load the image
    image = cv2.imread(image_path)
    # Check if the image was loaded successfully
    if image is None:
        print("Failed to load the image")
        return None
    # Convert the image to grayscale
    grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Apply histogram equalization to enhance the contrast
    equalized_image = cv2.equalizeHist(grayscale_image)
    # Apply bilateral filter to reduce noise while preserving edges
    filtered_image = cv2.bilateralFilter(equalized_image, 9, 75, 75)
    # Perform any additional enhancement or filtering techniques as needed
    return filtered_image
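
To make the histogram equalization step less of a black box, the sketch below implements the textbook version in NumPy: each gray level is remapped according to the cumulative histogram, so that occupied levels spread across the full 0 to 255 range. It illustrates the idea and is not guaranteed to match the output of `cv2.equalizeHist` bit for bit. The function name `equalize_histogram` is chosen for this example.

```python
import numpy as np

def equalize_histogram(gray):
    """Spread the intensities of an 8-bit grayscale image using its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # cumulative count at the darkest occupied level
    total = gray.size
    if total == cdf_min:           # constant image: nothing to spread
        return gray.copy()
    # Lookup table: darkest occupied level -> 0, brightest -> 255
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

low_contrast = np.array([[100, 100, 110], [110, 120, 120]], dtype=np.uint8)
equalized = equalize_histogram(low_contrast)
print(equalized)
```

The input uses only the narrow range 100 to 120, while the output stretches from 0 to 255 and keeps the ordering of the gray levels.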

Feature Extraction:

Feature extraction, a crucial step, pulls relevant data or features out of an image. Methods such as edge detection, object detection, segmentation, and pattern recognition find and extract the significant elements of the image. This stage lays the foundation for further analysis or recognition tasks. The example below detects keypoints and computes descriptors with ORB.

Code:

import cv2

def extract_features(image_path):
    # Load the image
    image = cv2.imread(image_path)
    # Check if the image was loaded successfully
    if image is None:
        print("Failed to load the image")
        # Return a pair so callers can always unpack two values
        return None, None
    # Convert the image to grayscale
    grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Create a feature detector (e.g., ORB, SIFT, SURF)
    feature_detector = cv2.ORB_create()
    # Detect keypoints and compute descriptors
    keypoints, descriptors = feature_detector.detectAndCompute(grayscale_image, None)
    # Return the keypoints and descriptors
    return keypoints, descriptors
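
Edge detection is mentioned above but not shown in the example. As a rough sketch, the NumPy code below computes a Sobel gradient magnitude, one standard edge detector: two 3x3 kernels estimate the horizontal and vertical intensity change, and the two estimates are combined into a single edge strength per pixel. The function name `sobel_edges` and the edge-replicating border handling are choices made for this illustration.

```python
import numpy as np

def sobel_edges(gray):
    """Return the gradient magnitude of a 2-D grayscale image using 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = kx.T
    # Replicate border pixels so the output has the same size as the input
    padded = np.pad(gray.astype(np.float32), 1, mode="edge")
    rows, cols = gray.shape
    gx = np.zeros((rows, cols), dtype=np.float32)
    gy = np.zeros((rows, cols), dtype=np.float32)
    # Correlate each kernel with the image by summing shifted, weighted copies
    for i in range(3):
        for j in range(3):
            window = padded[i:i + rows, j:j + cols]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)

# Synthetic image: dark left half, bright right half, so one vertical edge
image = np.zeros((5, 6), dtype=np.uint8)
image[:, 3:] = 255
magnitude = sobel_edges(image)
print(magnitude[2])
```

The response is zero in the flat regions and large only in the two columns next to the boundary between the dark and bright halves.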

Transformation:

Transformation refers to geometric operations on the image, such as rotation, scaling, or warping. These operations change the viewpoint, align objects, or correct perspective. The example below halves the image size and rotates the result by 30 degrees about its centre.

Code:

import cv2
import numpy as np

def transform_image(image_path):
    # Load the image
    image = cv2.imread(image_path)
    # Check if the image was loaded successfully
    if image is None:
        print("Failed to load the image")
        return None
    # Convert the image to grayscale
    grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Resize the image to half its width and height
    resized_image = cv2.resize(grayscale_image, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_LINEAR)
    # Rotate the image by 30 degrees about its centre, keeping the original scale
    rows, cols = resized_image.shape[:2]
    rotation_matrix = cv2.getRotationMatrix2D((cols/2, rows/2), 30, 1)
    rotated_image = cv2.warpAffine(resized_image, rotation_matrix, (cols, rows))
    # Perform any additional image transformations as needed
    return rotated_image
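
The rotation in the example is driven by a 2x3 affine matrix. The sketch below builds such a matrix by hand, following the formula given in the OpenCV documentation for `cv2.getRotationMatrix2D`, and applies it to individual points to show what it does. The helper names `rotation_matrix_2d` and `apply_affine` are made up for this illustration.

```python
import math
import numpy as np

def rotation_matrix_2d(center, angle_degrees, scale=1.0):
    """Build a 2x3 affine matrix for rotation about `center`.

    Positive angles rotate counter-clockwise in image coordinates (y pointing down).
    """
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_degrees))
    beta = scale * math.sin(math.radians(angle_degrees))
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

def apply_affine(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return matrix @ np.array([x, y, 1.0])

matrix = rotation_matrix_2d(center=(50, 40), angle_degrees=90)
print(apply_affine(matrix, (50, 40)))   # the centre of rotation stays where it is
print(apply_affine(matrix, (60, 40)))   # a point to the right of the centre moves above it
```

The third column of the matrix is the translation that keeps the chosen centre fixed. Without it, the rotation would be about the top-left corner of the image.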

Classification or Recognition:

In this step, machine learning or computer vision algorithms classify or recognise objects or patterns within the image. Tasks include image classification, object recognition, and facial recognition. By using trained models, the image pipeline can provide useful insights and automated analysis. The example below classifies an image with a MobileNetV2 model pre-trained on ImageNet.

Code:

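import tensorflow as tf
import numpy as np
import cv2

# Load the pre-trained model
model = tf.keras.applications.MobileNetV2(weights='imagenet')

# Load and preprocess the image
def preprocess_image(image_path):
    image = cv2.imread(image_path)
    # Check if the image was loaded successfully
    if image is None:
        print("Failed to load the image")
        return None
    # OpenCV loads images as BGR; the model expects RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))
    # Scale pixel values to the range the model expects
    image = tf.keras.applications.mobilenet_v2.preprocess_input(image.astype(np.float32))
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
    image = np.expand_dims(image, axis=0)
    return image

# Classify the image
def classify_image(image_path):
    image = preprocess_image(image_path)
    if image is None:
        return None
    predictions = model.predict(image)
    # decode_predictions yields (class_id, class_name, score) tuples; keep the top class name
    class_label = tf.keras.applications.mobilenet_v2.decode_predictions(predictions, top=1)[0][0][1]
    return class_label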
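
The last step of classification, turning the model's output vector into a readable label, can be sketched without a model. The code below uses made-up scores and labels (they are not ImageNet classes) to show a softmax followed by a top-k lookup. The model in the example above already returns probabilities, so the softmax step matters only for models that output raw scores. The function names `softmax` and `top_k` are chosen for this illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    order = np.argsort(probabilities)[::-1][:k]
    return [(labels[i], float(probabilities[i])) for i in order]

# Hypothetical labels and scores, made up for illustration only
labels = ["cat", "dog", "car", "tree"]
logits = np.array([2.0, 1.0, 0.1, -1.0])

probabilities = softmax(logits)
for label, p in top_k(probabilities, labels, k=2):
    print(f"{label}: {p:.3f}")
```

Reporting the top few classes with their probabilities, instead of only the single best label, makes it easier to see when the model is unsure.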

Post-Processing:

After the main processing steps, post-processing can polish the results or make specific adjustments. Colour grading, image composition, noise reduction, and annotation are all examples. This step ensures that the final output meets the desired specifications or aesthetic standards. The example below cleans up a segmentation mask by keeping only its largest region.

Code:

import cv2
import numpy as np

def post_process_segmentation(segmentation_mask):
    # The mask is expected to be an 8-bit, single-channel image (0 = background)
    # Remove small noise regions
    filtered_mask = cv2.morphologyEx(segmentation_mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    # Find connected components and their statistics (label 0 is the background)
    num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(filtered_mask)
    # If no foreground region survived the opening, return the empty mask
    if num_labels < 2:
        return filtered_mask
    # Create a 0/255 mask with only the largest connected component
    largest_component_label = np.argmax(stats[1:, cv2.CC_STAT_AREA]) + 1
    largest_component_mask = np.where(labels == largest_component_label, 255, 0).astype(np.uint8)
    # Apply morphological closing to fill in gaps
    filled_mask = cv2.morphologyEx(largest_component_mask, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))
    # Apply Gaussian blur to smooth the edges
    blurred_mask = cv2.GaussianBlur(filled_mask, (5, 5), 0)
    return blurred_mask
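
For readers curious about what "keeping the largest connected component" involves, the sketch below does it in plain Python and NumPy with a breadth-first flood fill. It is a teaching version and far slower than the OpenCV call. It uses 4-connectivity (pixels touching only at a corner are treated as separate), whereas OpenCV's function uses 8-connectivity by default, so the two can disagree on diagonal contacts. The function name `largest_component` is chosen for this illustration.

```python
from collections import deque
import numpy as np

def largest_component(mask):
    """Return a 0/1 mask containing only the largest 4-connected foreground region."""
    rows, cols = mask.shape
    visited = np.zeros((rows, cols), dtype=bool)
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] == 0 or visited[r, c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel
            queue = deque([(r, c)])
            visited[r, c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny, nx] and not visited[ny, nx]:
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    result = np.zeros((rows, cols), dtype=np.uint8)
    for y, x in best:
        result[y, x] = 1
    return result

mask = np.array([
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=np.uint8)
kept = largest_component(mask)
print(kept)
```

Only the 2x2 block in the middle survives. The isolated pixel in the top-left corner and the pixel touching the block diagonally are both discarded.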

Conclusion:

Image processing has revolutionised the way we engage with visual data and draw knowledge from it. A pipeline of stages, including acquisition, preprocessing, enhancement and filtering, feature extraction, transformation, classification or recognition, and post-processing, converts images into useful information. Its uses range from surveillance systems and medical diagnosis to the entertainment and creative sectors. The field continues to advance as algorithms and technology improve.

Do Check Out:

Our product, AIensured, offers explainability and many other techniques.

To learn more about explainability and to read other AI-related articles, please visit this link.

References:

Thanusree
