ML Programming Project · Bournemouth NCCA · 2024

Animal Classification

A convolutional neural network trained from scratch to classify animal photographs. 91.43% test accuracy across 7 categories.

Results

Output

Batch predictions on the test set. Green = correct, red = misclassified.

About

What It Does

The model takes a photograph of an animal and predicts which of seven categories it belongs to: squirrel, lion, horse, elephant, chicken, camel, or bear. It was trained from scratch on 2,392 images, with no pre-trained weights.
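As a rough illustration of the final prediction step, the sketch below turns seven raw output scores into a label and a confidence. It assumes the network ends in a softmax over the seven classes, which is typical for a classifier but not confirmed here. The class order and the example scores are made up for illustration.

```python
import math

# Class names from the project; the index order is an assumption.
CLASSES = ["squirrel", "lion", "horse", "elephant", "chicken", "camel", "bear"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Hypothetical scores for one photo: "horse" is highest, "camel" a close second.
example_logits = [0.2, 1.1, 3.4, 0.5, -0.8, 2.9, 0.1]
label, confidence = predict(example_logits)
print(label, round(confidence, 2))
```

The closer the two highest scores are, the lower the reported confidence. Visually similar classes such as horse and camel are where that is most likely to happen.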

A CNN works by sliding small filters across the image to detect visual features. Early layers pick up edges and textures. Deeper layers combine those into shapes and patterns the model uses to tell a horse from a camel. Three convolutional layers (32→64→128 filters) progressively build up that understanding.
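The sliding-filter idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the tiny image and the hand-written vertical-edge kernel are invented for the example, whereas a trained CNN learns its filter weights from data.

```python
def slide_filter(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the responses."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply the kernel with the patch beneath it and sum the products.
            response = sum(
                kernel[i][j] * image[top + i][left + j]
                for i in range(k)
                for j in range(k)
            )
            row.append(response)
        output.append(row)
    return output


# 5x5 greyscale image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge detector (illustrative only).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = slide_filter(image, vertical_edge)
for row in feature_map:
    print(row)
```

The filter responds strongly where brightness changes from left to right and gives zero over the flat bright region. A convolutional layer applies many such filters in parallel (32, 64, and 128 in the three layers described above), each producing its own feature map.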

Approach

Key Decisions

Vision Transformers were considered, but they typically need large-scale data to train well without pre-trained weights. With only 2,392 images, a CNN was the better fit. To stretch the small dataset, augmentations such as random rotation, colour jitter, and RandomErasing simulate variety the model wouldn't otherwise see. BatchNorm stabilises training, AdamW decouples weight decay from the gradient-based update for better generalisation, and 50% dropout pushes the model to learn redundant features rather than memorise the training images.
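To make the AdamW point concrete, here is a minimal single-weight sketch comparing Adam with classic L2 regularisation against AdamW-style decoupled decay. The learning rate, decay strength, and starting weight are assumed values chosen for illustration, not the project's settings, and the helper names are hypothetical.

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar weight; returns (w, m, v)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)  # bias correction for the running mean
    v_hat = v / (1 - b2 ** t)  # bias correction for the running variance
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


def adam_l2_step(w, grad, m, v, t, lr=1e-3, weight_decay=1e-2):
    """Adam + L2: the decay term is added to the gradient, so Adam rescales it."""
    return adam_step(w, grad + weight_decay * w, m, v, t, lr=lr)


def adamw_step(w, grad, m, v, t, lr=1e-3, weight_decay=1e-2):
    """AdamW: shrink the weight directly, then run Adam on the raw gradient."""
    w = w * (1 - lr * weight_decay)
    return adam_step(w, grad, m, v, t, lr=lr)


# First step with a zero loss gradient, so only the decay term acts.
w0 = 1.0
w_l2, _, _ = adam_l2_step(w0, grad=0.0, m=0.0, v=0.0, t=1)
w_adamw, _, _ = adamw_step(w0, grad=0.0, m=0.0, v=0.0, t=1)
print(w_l2, w_adamw)
```

In the coupled version the decay passes through Adam's normalisation, so on this first step the weight moves by roughly the full learning rate (about 0.001) regardless of how small the decay strength is. In the decoupled version the weight shrinks by exactly lr × weight_decay × w (0.00001 here), so the decay strength keeps its intended meaning.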

Seven iterations refined the model from a 58% baseline to 91.43% test accuracy. Each version isolated a specific variable: architecture, optimiser, resolution, augmentation, or training duration.