Image Data Preprocessing Techniques You Should Know

Image data preprocessing is a crucial step in any image-related Machine Learning task. As a Data Scientist or Machine Learning Engineer, you should know how to preprocess images, especially if your work depends heavily on image data. In this article, I’ll take you through the essential image data preprocessing techniques you should know, with implementations in Python.

Image Data Preprocessing Techniques

Here are some essential image data preprocessing techniques you should know:

  1. Loading and Preparing Images
  2. Resizing Images
  3. Normalizing Pixel Values
  4. Data Augmentation
  5. Histogram Equalization

Let’s go through all these image data preprocessing techniques in detail with implementation using Python.

Note: To implement all these techniques, I will use a dataset of Women’s Fashion. You can download the dataset from here.

Loading and Preparing Images

Loading and preparing images involves reading image files from a storage medium into memory using libraries such as PIL (Python Imaging Library) or OpenCV. This process converts the images into a format easily manipulated and analyzed, typically as multi-dimensional arrays. Properly loading images ensures that subsequent preprocessing steps can be applied uniformly across the dataset to maintain data integrity and consistency.

Here’s how to load and prepare images using Python:

import zipfile
import os
from PIL import Image
import matplotlib.pyplot as plt

# define the path to the uploaded zip file
zip_file_path = '/content/women-fashion.zip'
extract_folder_path = '/content/women fashion'

# extract the zip file
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    zip_ref.extractall(extract_folder_path)

# define function to load images
def load_images(image_folder_path, image_files):
    images = []
    for file_name in image_files:
        # compare extensions case-insensitively so files like 'IMG.JPG' are not skipped
        if file_name.lower().endswith(('.jpg', '.jpeg', '.png', '.webp')):
            img_path = os.path.join(image_folder_path, file_name)
            img = Image.open(img_path).convert('RGB')
            images.append(img)
    return images

# load images
image_folder_path = os.path.join(extract_folder_path, 'women fashion')
image_files = [file for file in os.listdir(image_folder_path) if file.lower().endswith(('jpg', 'jpeg', 'png', 'webp'))]
images = load_images(image_folder_path, image_files)

# display original images
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(1, 5, i + 1)
    plt.imshow(images[i])
    plt.axis('off')
plt.suptitle('Original Images')
plt.show()
Image Data Preprocessing Techniques: loading and preparing images

In the above code, we extract a zip file containing images, load the images into memory, and display the first five using matplotlib. We first specify the path to the zip file and the extraction folder, then extract the contents of the zip file into that folder.

Next, we define a function that loads images by iterating through the list of file names, checking each file extension, and converting each image to RGB format using the PIL library. After loading the images, we use matplotlib to create a figure and plot the first five images side by side in a single row, turning off the axes for clarity and adding a title for the entire figure. This helps in visually verifying the contents and quality of the loaded images.
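To see what "multi-dimensional array" means in practice, here is a minimal sketch that converts a PIL image into a NumPy array and inspects its shape and data type. The image is a synthetic stand-in created with Image.new (an assumption made so the snippet runs without the dataset), and the names demo_img and demo_array are hypothetical.

```python
import numpy as np
from PIL import Image

# hypothetical stand-in for one loaded image: 64 px wide, 48 px tall
demo_img = Image.new('RGB', (64, 48), color=(200, 30, 30))

# convert the PIL image into a NumPy array
demo_array = np.asarray(demo_img)

# arrays are laid out as (height, width, channels)
print(demo_array.shape)
# 8-bit images are stored as unsigned integers in [0, 255]
print(demo_array.dtype)
```

Note that PIL reports size as (width, height), while the array shape is (height, width, channels), which is easy to mix up when checking dimensions.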

Resizing Images

Resizing images changes their dimensions (height and width) to a standard size, which is crucial for ensuring uniform input sizes for machine learning models. This step is important because most neural networks require fixed input dimensions. Resizing can be done using various interpolation methods, and it helps reduce computational load and memory usage, which makes the processing pipeline more efficient while retaining the essential features of the images.

Here’s how to resize images using Python:

# function to resize images
def resize_images(images, size=(128, 128)):
    resized_images = [img.resize(size) for img in images]
    return resized_images

# resize images
resized_images = resize_images(images)

# display resized images
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(1, 5, i + 1)
    plt.imshow(resized_images[i])
    plt.axis('off')
plt.suptitle('Resized Images')
plt.show()
Resized Images

In the above code, we define and use a function that resizes a list of images to a standard dimension of 128×128 pixels. The resize_images function iterates over the provided list, resizes each image using the resize method from the PIL library, and stores the results in a new list. Note that resize stretches each image to exactly the requested size, so the original aspect ratio is not preserved.
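If distortion from stretching is a concern, one possible alternative is to scale the image to fit inside the target size and pad the remainder. The sketch below is an illustration under that assumption, not part of the original pipeline; the function name resize_with_padding and the black fill colour are hypothetical choices, and bilinear interpolation is picked only as an example of an explicit interpolation method.

```python
from PIL import Image

def resize_with_padding(img, size=(128, 128), fill=(0, 0, 0)):
    """Resize an image to fit inside `size`, keeping its aspect ratio,
    then centre it on a canvas of exactly `size`."""
    target_w, target_h = size
    # the largest scale factor that still fits both dimensions
    scale = min(target_w / img.width, target_h / img.height)
    new_w = max(1, round(img.width * scale))
    new_h = max(1, round(img.height * scale))
    # explicit interpolation method (bilinear here)
    resized = img.resize((new_w, new_h), Image.BILINEAR)
    # paste the resized image in the centre of a padded canvas
    canvas = Image.new('RGB', size, fill)
    canvas.paste(resized, ((target_w - new_w) // 2, (target_h - new_h) // 2))
    return canvas

# synthetic 200x100 red image standing in for a real photo
demo = Image.new('RGB', (200, 100), (255, 0, 0))
padded = resize_with_padding(demo)
print(padded.size)
```

Whether stretching or padding works better depends on the model and the data, so it is worth checking both on a few samples.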

Normalizing Pixel Values

Normalizing pixel values scales the pixel intensity values of an image to a specific range, typically [0, 1] or [-1, 1]. Scaling to [0, 1] is achieved by dividing the pixel values by the maximum possible value (e.g., 255 for an 8-bit image); scaling to [-1, 1] additionally shifts the result so that it is centred on zero. Normalization helps machine learning models converge faster during training, as it ensures that the input features have a consistent scale, which stabilizes the learning process and helps in achieving better performance.

Here’s how to normalize the pixel values of images:

import numpy as np

# function to normalize images
def normalize_images(images):
    normalized_images = [np.array(img) / 255.0 for img in images]
    return np.array(normalized_images)

# normalize images
normalized_images = normalize_images(resized_images)

# print the array of original vs normalized images for the first image
print("Original Image Array:")
print(np.array(resized_images[0]))

print("\nNormalized Image Array:")
print(normalized_images[0])
Original Image Array:
[[[ 38 43 26]
[ 44 54 35]
[ 25 38 18]
...
[ 88 92 54]
[ 77 82 40]
[ 60 67 23]]

[[ 42 47 30]
[ 34 44 25]
[ 27 40 20]
...
[ 79 85 45]
[ 49 56 19]
[ 44 51 9]]

[[ 28 33 17]
[ 23 33 15]
[ 20 33 14]
...
[ 71 80 39]
[ 68 77 32]
[ 76 84 43]]

...

[[119 129 138]
[120 130 139]
[119 129 138]
...
[ 84 100 115]
[ 80 96 111]
[ 78 95 110]]

[[111 121 130]
[112 122 131]
[119 129 138]
...
[ 86 101 117]
[ 86 101 117]
[ 81 98 113]]

[[112 122 131]
[103 113 122]
[111 121 130]
...
[ 85 102 117]
[ 82 99 113]
[ 81 98 113]]]

Normalized Image Array:
[[[0.14901961 0.16862745 0.10196078]
[0.17254902 0.21176471 0.1372549 ]
[0.09803922 0.14901961 0.07058824]
...
[0.34509804 0.36078431 0.21176471]
[0.30196078 0.32156863 0.15686275]
[0.23529412 0.2627451 0.09019608]]

[[0.16470588 0.18431373 0.11764706]
[0.13333333 0.17254902 0.09803922]
[0.10588235 0.15686275 0.07843137]
...
[0.30980392 0.33333333 0.17647059]
[0.19215686 0.21960784 0.0745098 ]
[0.17254902 0.2 0.03529412]]

[[0.10980392 0.12941176 0.06666667]
[0.09019608 0.12941176 0.05882353]
[0.07843137 0.12941176 0.05490196]
...
[0.27843137 0.31372549 0.15294118]
[0.26666667 0.30196078 0.1254902 ]
[0.29803922 0.32941176 0.16862745]]

...

[[0.46666667 0.50588235 0.54117647]
[0.47058824 0.50980392 0.54509804]
[0.46666667 0.50588235 0.54117647]
...
[0.32941176 0.39215686 0.45098039]
[0.31372549 0.37647059 0.43529412]
[0.30588235 0.37254902 0.43137255]]

[[0.43529412 0.4745098 0.50980392]
[0.43921569 0.47843137 0.51372549]
[0.46666667 0.50588235 0.54117647]
...
[0.3372549 0.39607843 0.45882353]
[0.3372549 0.39607843 0.45882353]
[0.31764706 0.38431373 0.44313725]]

[[0.43921569 0.47843137 0.51372549]
[0.40392157 0.44313725 0.47843137]
[0.43529412 0.4745098 0.50980392]
...
[0.33333333 0.4 0.45882353]
[0.32156863 0.38823529 0.44313725]
[0.31764706 0.38431373 0.44313725]]]

In the above code, we define and use a function that normalizes the pixel values of a list of images, scaling them from the range [0, 255] to [0, 1]. The normalize_images function iterates over each image in the provided list, converts it to a NumPy array, and divides each pixel value by 255.0. The resulting list of normalized images is then converted into a single NumPy array.
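The other range mentioned earlier, [-1, 1], and per-channel standardization can be sketched in the same style. This is a minimal illustration on random data rather than the fashion dataset; the function names scale_to_signed_range and standardize_per_channel and the eps value are assumptions introduced for the example.

```python
import numpy as np

def scale_to_signed_range(batch):
    """Map 8-bit pixel values from [0, 255] to [-1, 1]."""
    return batch.astype(np.float32) / 127.5 - 1.0

def standardize_per_channel(batch, eps=1e-7):
    """Give each colour channel zero mean and unit variance across the batch."""
    batch = batch.astype(np.float32)
    # statistics are computed over images, rows, and columns, per channel
    mean = batch.mean(axis=(0, 1, 2), keepdims=True)
    std = batch.std(axis=(0, 1, 2), keepdims=True)
    # eps guards against division by zero for constant channels
    return (batch - mean) / (std + eps)

# random stand-in for a batch of two 4x4 RGB images
demo_batch = np.random.default_rng(0).integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)
signed = scale_to_signed_range(demo_batch)
standardized = standardize_per_channel(demo_batch)
print(signed.min(), signed.max())
```

Which scheme to use usually depends on what the model expects; pretrained networks in particular tend to require the same normalization that was used when they were trained.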

Data Augmentation

Data augmentation involves creating new training samples from the existing dataset through various transformations such as rotations, translations, scaling, and flipping. This technique artificially increases the size of the training dataset, which helps to prevent overfitting and improve the generalization capability of machine learning models. By exposing the model to a wider variety of augmented data, it learns to recognize patterns more robustly, which leads to better performance on unseen data.

Here’s how to perform data augmentation using Python:

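# note: ImageDataGenerator is deprecated in recent TensorFlow/Keras releases,
# so this import may need adjusting depending on the installed version
from keras.preprocessing.image import ImageDataGenerator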

# define data augmentation generator
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# example: augment a single image
sample_image = normalized_images[0]
sample_image = np.expand_dims(sample_image, axis=0)
aug_iter = datagen.flow(sample_image)

# display augmented images
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(1, 5, i + 1)
    batch = next(aug_iter)
    image_aug = batch[0]
    plt.imshow(image_aug)
    plt.axis('off')
plt.suptitle('Augmented Images')
plt.show()
Image Data Preprocessing Techniques: Augmented Images

In the above code, we are setting up and applying data augmentation to an image using the ImageDataGenerator class from Keras. The data augmentation generator, datagen, is defined with various transformation parameters such as rotation, width and height shifts, shearing, zooming, and horizontal flipping to artificially expand the training dataset.

We then select a single normalized image from the list, expand its dimensions to match the batch shape the generator expects, and create an iterator, aug_iter, that generates augmented versions of the image. Finally, we display five augmented versions of this sample image side by side by repeatedly fetching the next augmented image from the iterator and plotting it.
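To make the idea of a random transformation more concrete, here is a small NumPy-only sketch of two of the operations, a horizontal flip and a sideways shift. It is a simplified illustration, not how ImageDataGenerator works internally: the function name random_flip_and_shift is hypothetical, and np.roll wraps pixels around the edge instead of filling them with the nearest value.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

def random_flip_and_shift(image, max_shift=0.2):
    """Randomly flip an (H, W, C) array horizontally and shift it sideways."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal flip: reverse the width axis
    # maximum shift in pixels, as a fraction of the image width
    limit = int(max_shift * image.shape[1])
    shift = int(rng.integers(-limit, limit + 1))
    # np.roll wraps pixels around the edge, unlike fill_mode='nearest'
    return np.roll(out, shift, axis=1)

# random stand-in for a normalized 4x6 RGB image
demo_image = rng.random((4, 6, 3))
augmented = [random_flip_and_shift(demo_image) for _ in range(5)]
print(len(augmented), augmented[0].shape)
```

Each call returns an array with the same shape and the same pixel values as the input, only rearranged, which is why the label of an augmented image normally stays the same as that of the original.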

Histogram Equalization

Histogram equalization is a contrast enhancement technique that redistributes the intensity values of an image to span the entire range of the histogram. This process enhances the contrast by making the distribution of pixel intensities more uniform, which highlights features and edges more clearly. Histogram equalization is particularly useful for improving the visual quality of images with poor contrast and can enhance the performance of image recognition algorithms by providing more distinct features.

Here’s how to perform histogram equalization using Python:

import cv2
# function to apply histogram equalization
def histogram_equalization(img):
    img_yuv = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2YUV)
    img_yuv[:, :, 0] = cv2.equalizeHist(img_yuv[:, :, 0])
    img_equalized = cv2.cvtColor(img_yuv, cv2.COLOR_YUV2RGB)
    return Image.fromarray(img_equalized)

# apply histogram equalization to all images
equalized_images = [histogram_equalization(img) for img in resized_images]

# display histogram equalized images
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(1, 5, i + 1)
    plt.imshow(equalized_images[i])
    plt.axis('off')
plt.suptitle('Histogram Equalized Images')
plt.show()
Histogram Equalized Images

In the above code, we are defining and using a function to apply histogram equalization to a list of images to enhance their contrast. The histogram_equalization function converts an image from RGB to YUV colour space to perform histogram equalization on the Y (luminance) channel and then converts the image back to RGB colour space. This process is applied to each image in the list of resized images to create a new list of contrast-enhanced images.
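To show what equalization does to the intensity values, here is a minimal NumPy sketch of the standard cumulative-histogram approach for a single 8-bit channel. It is meant as an illustration and is not guaranteed to match cv2.equalizeHist value for value; the function name equalize_channel is hypothetical.

```python
import numpy as np

def equalize_channel(channel):
    """Equalize one 8-bit channel using its cumulative histogram."""
    # count how many pixels have each intensity value 0..255
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest used intensity
    total = channel.size
    if total == cdf_min:
        # every pixel has the same value, so there is nothing to spread out
        return channel.copy()
    # lookup table that stretches the cumulative distribution over 0..255
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = lut.clip(0, 255).astype(np.uint8)
    return lut[channel]

# low-contrast stand-in channel: all values squeezed between 100 and 103
low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
equalized = equalize_channel(low_contrast)
print(equalized)
```

In this example, the four input values that were only one intensity step apart end up spread across the full 0 to 255 range.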

Summary

So, these are some image data preprocessing techniques you should know: loading and preparing images, resizing them to a standard size, normalizing pixel values, augmenting the data, and applying histogram equalization. Together, they form a typical starting point for preparing image data for a Machine Learning model.

I hope you liked this article on image data preprocessing techniques you should know as a Data Scientist and a Machine Learning Engineer. Feel free to ask valuable questions in the comments section below. You can follow me on Instagram for many more resources.

Aman Kharwal
