Image Classification

This project is part of my work at the School of Computer Science, University of Birmingham. It showcases how I apply Machine Learning, Computer Vision, and Data Analysis to turn concepts into practical solutions, in this case by tackling the challenges of image classification.

Image classification is a fundamental task in computer vision, where the goal is to categorize images into predefined labels or categories. This process involves analyzing visual features, such as shapes, colors, and textures, to recognize patterns and assign a class to each image. From recognizing handwritten digits to identifying objects in real-world scenes, image classification has a wide range of applications in fields including healthcare, e-commerce, and autonomous systems.

Image Classification Using SVC (SVM) and MLP Machine Learning Models on the Fashion MNIST Dataset
Sample Fashion MNIST Dataset

Project Overview

This project utilizes machine learning and computer vision techniques to classify images from the Fashion MNIST dataset, a popular benchmark for image classification tasks. The dataset consists of grayscale images representing ten categories of clothing items, including T-shirts, dresses, and sneakers.

Each prediction result is displayed in the format "predicted category name (ground-truth category name)". For example, a correct prediction appears as "Shirt (Shirt)", while an incorrect one appears as "Sneaker (Bag)", clearly differentiating correct and incorrect classifications.
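A minimal sketch of how such a display string could be produced. The `CLASS_NAMES` list and the `format_prediction` helper are illustrative, not the project's actual code; the class names follow the standard Fashion MNIST label order.

```python
# Hypothetical helper reproducing the "predicted (ground-truth)" display format.
CLASS_NAMES = ["T-shirt", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

def format_prediction(pred_label: int, true_label: int) -> str:
    """Render one result as 'Predicted (GroundTruth)'."""
    return f"{CLASS_NAMES[pred_label]} ({CLASS_NAMES[true_label]})"

print(format_prediction(6, 6))  # Shirt (Shirt)   - correct prediction
print(format_prediction(7, 8))  # Sneaker (Bag)   - incorrect prediction
```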

Key Objectives

  • Evaluate the effectiveness of different machine learning models, including SVC and MLP, on the Fashion MNIST dataset.
  • Analyze the impact of preprocessing techniques, such as Sobel edge detection and bilateral filtering, on classification performance.
  • Investigate the influence of training data size on improving model accuracy and generalization.

Dataset Description

The Fashion MNIST dataset comprises 70,000 grayscale images, each 28x28 pixels, categorized into ten classes:

  1. T-shirts
  2. Trousers
  3. Pullovers
  4. Dresses
  5. Coats
  6. Sandals
  7. Shirts
  8. Sneakers
  9. Bags
  10. Ankle boots

Each image is labeled with exactly one category, making the dataset well suited to multi-class classification tasks. The images are preprocessed and normalized for consistency during training and testing.
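The normalization and flattening step can be sketched as below. Random synthetic arrays stand in for the real 28x28 grayscale images, since the actual loading code is not shown in this write-up; the reshaping to 784-dimensional vectors is what scikit-learn classifiers such as SVC and MLPClassifier expect.

```python
import numpy as np

# Synthetic stand-in for a batch of 28x28 grayscale images (pixel values 0-255);
# in the real project these come from the Fashion MNIST dataset.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)

# Normalize pixel intensities to [0, 1] for consistent scale across samples,
# then flatten each 28x28 image into a 784-dimensional feature vector.
X = images.astype(np.float32) / 255.0
X = X.reshape(len(images), -1)

print(X.shape)  # (100, 784)
```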

Why Choose Fashion MNIST?

Fashion MNIST is widely regarded as a benchmark dataset for image classification tasks. It is challenging enough to test the capabilities of machine learning models but accessible enough for beginners. Additionally, its focus on fashion-related items adds a touch of real-world applicability, making it relevant for industries like retail and e-commerce.

SVC Classification Results

The Support Vector Classifier (SVC) model was evaluated for its performance in classifying fashion items from the Fashion MNIST dataset. Below are the results based on two different training data sizes.

Image Classification Using the SVC (SVM) Model with 10,000 Training Samples from the Fashion MNIST Dataset
10,000 Training Samples, 0.8387 Accuracy

With a smaller training dataset, the SVC model achieves a moderately high accuracy. However, due to limited training data, the model may have less exposure to the variability in the dataset, which can limit its generalization performance.

Image Classification Using the SVC (SVM) Model with 60,000 Training Samples from the Fashion MNIST Dataset
60,000 Training Samples, 0.8573 Accuracy

Increasing the training data size improves the accuracy of the SVC model. The larger dataset helps the model learn better representations and patterns, leading to improved performance.
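A minimal sketch of training an SVC in scikit-learn. Synthetic data from `make_classification` stands in for the flattened Fashion MNIST vectors (784 features, 10 classes), and the RBF kernel with `C=1.0` is an illustrative default, not necessarily the hyperparameters used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for flattened 28x28 Fashion MNIST images (784 features,
# 10 classes); the real experiments used 10,000 and 60,000 actual images.
X, y = make_classification(n_samples=500, n_features=784, n_informative=50,
                           n_classes=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# RBF-kernel Support Vector Classifier; hyperparameters here are defaults.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.3f}")
```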

MLP Classification Results

The Multi-Layer Perceptron (MLP), a type of neural network, was also tested for its classification accuracy on the same dataset. Here are the findings.

Image Classification Using the MLP Model with 10,000 Training Samples from the Fashion MNIST Dataset
10,000 Training Samples, 0.8367 Accuracy

Similar to the SVC, the MLP achieves reasonable accuracy with a smaller dataset. However, neural networks typically require more data to perform optimally, which might explain the slightly lower accuracy compared to larger datasets.

Image Classification Using the MLP Model with 60,000 Training Samples from the Fashion MNIST Dataset
60,000 Training Samples, 0.8512 Accuracy

With a larger dataset, the MLP shows improved performance. The additional data helps the neural network adjust its weights more effectively, resulting in better accuracy.
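The MLP setup can be sketched analogously with scikit-learn's `MLPClassifier`. Again, synthetic data stands in for the real images, and the single 100-unit hidden layer is scikit-learn's default rather than this project's confirmed architecture.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for flattened Fashion MNIST vectors (784 features, 10 classes).
X, y = make_classification(n_samples=500, n_features=784, n_informative=50,
                           n_classes=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer of 100 units (scikit-learn's default); the project's actual
# architecture and training schedule may differ.
mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=300, random_state=0)
mlp.fit(X_train, y_train)
print(f"Test accuracy: {mlp.score(X_test, y_test):.3f}")
```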

MLP Classification Results after Applying Sobel Edge Detection

The Fashion MNIST dataset was preprocessed using the Sobel edge detection technique, which emphasizes edges and boundaries in the images. This step helps highlight features like contours and shapes, which are critical for distinguishing classes.

MLP Classification Results after Applying Sobel Edge Detection
60,000 Training Samples, Sobel Edge Detection, 0.88 Accuracy

Applying Sobel edge detection improved the accuracy of the MLP classifier compared to the unfiltered dataset. This result suggests that edge-focused preprocessing can enhance the model's ability to capture essential patterns, leading to better classification performance.
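The Sobel step can be sketched as below using SciPy, assuming per-axis gradients combined into an edge-magnitude map; the project may have used a different library (e.g. OpenCV), but the operation is the same.

```python
import numpy as np
from scipy import ndimage

# Synthetic stand-in for a single grayscale 28x28 Fashion MNIST image.
rng = np.random.default_rng(0)
image = rng.random((28, 28))

# Sobel gradients along each axis, combined into an edge-magnitude map
# that emphasizes contours and boundaries while suppressing flat regions.
gx = ndimage.sobel(image, axis=0)
gy = ndimage.sobel(image, axis=1)
edges = np.hypot(gx, gy)

print(edges.shape)  # (28, 28)
```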

MLP Classification Results after Applying Bilateral Filter

A bilateral filter was applied to the dataset. This preprocessing step smooths the images while preserving edges, reducing noise without losing critical features.

MLP Classification Results after Applying Bilateral Filter
60,000 Training Samples, Bilateral Filter, 0.82 Accuracy

While the bilateral filter helps reduce noise and enhance clarity, the resulting accuracy (0.82) is lower than both the unfiltered baseline (0.8512) and the Sobel-filtered result (0.88). This suggests that while edge preservation is valuable, excessive smoothing may have removed subtle details that the model needs for optimal performance.
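To make the technique concrete, here is a minimal from-scratch sketch of a bilateral filter: a Gaussian blur whose weights also fall off with intensity difference, so edges are preserved while flat regions are smoothed. The project most likely used a library routine (e.g. OpenCV's `bilateralFilter`); this hand-rolled version and its parameters are purely illustrative.

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Minimal bilateral filter for a 2D float image (illustrative, not optimized)."""
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    out = np.empty_like(img)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    # Spatial weight: pixels farther from the window center count less.
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    for i in range(h):
        for j in range(w):
            window = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range weight: pixels whose intensity differs strongly from the
            # center count less, which is what preserves edges.
            range_w = np.exp(-((window - img[i, j]) ** 2) / (2 * sigma_r**2))
            weights = spatial * range_w
            out[i, j] = (weights * window).sum() / weights.sum()
    return out

# Synthetic stand-in for one grayscale 28x28 image with values in [0, 1].
rng = np.random.default_rng(0)
image = rng.random((28, 28))
smoothed = bilateral_filter(image)
print(smoothed.shape)  # (28, 28)
```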

Conclusion and Decision

  • Training Data Size: Increasing the training data size from 10,000 to 60,000 improves accuracy across all models.
  • Model Choice: The MLP model, when combined with preprocessing, outperforms SVC, achieving the highest accuracy with Sobel edge detection.
  • Preprocessing Choice: Sobel edge detection is the most effective preprocessing technique, boosting MLP accuracy to 0.88.
  • Practical Insight: Preprocessing techniques that highlight important features, such as edges, significantly enhance model performance in image classification tasks.