YOLOv5 Image Classification — Complete Tutorial


Last Updated on 11/10/2025 by Eran Feit

🟢 Introduction — Why Use YOLOv5 for Image Classification?

When most developers hear “YOLO,” they think of real-time object detection — boxes around cars, people, or animals.
But in recent versions, YOLOv5 introduced something equally powerful: YOLOv5-cls, a classification-only mode designed to label entire images instead of detecting objects.

In this mode, YOLOv5 combines the same speed and accuracy that made it famous for detection, with the simplicity of classical image classification.
It runs on the PyTorch framework, supports GPU acceleration through CUDA, and can be competitive with older CNNs such as VGG or ResNet when properly trained.

In this tutorial, you’ll build a complete YOLOv5 image classification pipeline using the Animals10 dataset from Kaggle — a collection of over 26,000 animal photos categorized into 10 species such as cats, dogs, horses, squirrels, and butterflies.
We’ll start from environment setup, continue through dataset preparation and model training, and finish by predicting new images interactively with OpenCV.

By following this post, you’ll learn how to:

Configure a clean Python environment optimized for YOLOv5.
Prepare and split image datasets for classification.
Train a YOLOv5 classification model using your own GPU.
Evaluate and visualize predictions directly in Python.

You can download the code here: https://ko-fi.com/s/479cabae46

You can find more tutorials on my blog: https://eranfeit.net/blog/

Link to the dataset: https://www.kaggle.com/datasets/alessiocorrado99/animals10

⚙️ Setting Up the Environment for YOLOv5 Classification

The first step in any deep learning project is building a stable environment.
Installing YOLOv5 correctly ensures that your training runs efficiently, with GPU support and compatible dependencies.
We’ll use Conda, a Python environment manager, to isolate dependencies and avoid version conflicts.
YOLOv5 runs best on Python 3.8–3.10, and this tutorial uses PyTorch 2.2.0 with CUDA 11.8 for NVIDIA GPUs.

Creating and Activating the Conda Environment

### Create a new Conda environment with Python 3.8
conda create --name YoloV5 python=3.8

### Activate the environment
conda activate YoloV5

Explanation

Each Conda environment acts as a self-contained workspace.
Here, we name it YoloV5 so all dependencies related to this tutorial stay organized.
After activation, all installations will apply only inside this environment, leaving your base Python untouched.

Cloning the YOLOv5 Repository

### Clone YOLOv5 from GitHub into your chosen folder
git clone https://github.com/ultralytics/yolov5.git

This downloads the latest YOLOv5 source code directly from Ultralytics’ GitHub page.
Make sure you place it in a descriptive directory, like D:/Cool-Python/YoloV5, to keep projects organized.

Installing PyTorch with CUDA

### Check CUDA availability
nvcc --version

### Install PyTorch 2.2.0 with CUDA 11.8
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia

CUDA allows PyTorch to use your NVIDIA GPU for parallel computation, dramatically speeding up training.
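Note that nvcc --version reports the locally installed CUDA toolkit, while the conda package above ships its own CUDA 11.8 runtime.
What matters is that your NVIDIA driver supports CUDA 11.8; nvidia-smi shows the highest CUDA version the driver can run.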
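Once PyTorch is installed, a quick check from Python confirms whether it can actually see the GPU. The following is a minimal sketch, assuming it is run inside the activated YoloV5 environment; the helper name cuda_report is made up for this illustration.

```python
# Sketch: report whether PyTorch can see a CUDA GPU.
# Assumption: run inside the activated YoloV5 environment.
# cuda_report is a hypothetical helper, not part of YOLOv5 or PyTorch.
def cuda_report():
    """Return a small dict describing the PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False on a machine with an NVIDIA GPU, the usual suspects are an outdated driver or a CPU-only PyTorch build.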

Installing Additional Dependencies

### Quote version specifiers so the shell does not treat > as output redirection

### Core scientific and visualization libraries
pip install "gitpython>=3.1.30" "matplotlib>=3.3" "numpy>=1.22.2" "opencv-python>=4.1.1"

### Utility packages for training and monitoring
pip install "Pillow>=10.0.1" psutil "PyYAML>=5.3.1" "requests>=2.23.0" "scipy>=1.4.1" "thop>=0.1.1" "tqdm>=4.64.0"

### YOLOv5 training tools
pip install ultralytics==8.0.235 tensorboard albumentations

### Ensure OpenCV is installed properly (fix conflicts)
pip uninstall -y opencv-python-headless
pip install "opencv-python>=4.1.1"

Each library serves a specific role:

Matplotlib & NumPy handle numerical computation and plotting.
Albumentations adds powerful image augmentation.
TensorBoard visualizes metrics such as loss curves.
OpenCV lets us load, resize, and display images during predictions.
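To confirm that these libraries actually landed in the active environment, you can query their installed versions from Python. This is a small sketch using only the standard library; the PACKAGES list and the installed_versions helper are illustrative names, and it assumes the distributions are registered under their PyPI names (for example torch rather than pytorch).

```python
from importlib import metadata

# Distribution names as used on PyPI, taken from the install commands in this tutorial.
PACKAGES = [
    "torch", "torchvision", "opencv-python", "numpy",
    "matplotlib", "albumentations", "tensorboard", "ultralytics",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<16} {version or 'NOT INSTALLED'}")
```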

After completing this setup, your environment is ready for YOLOv5 classification tasks.
At this stage, you can test the installation by running python detect.py --help inside the YOLOv5 folder — it should execute without errors.
Now, let’s prepare your dataset.

🚀 Want to get started with Computer Vision or take your skills to the next level?

If you’re just beginning, I recommend this step-by-step course designed to introduce you to the foundations of Computer Vision – Complete Computer Vision Bootcamp With PyTorch & TensorFlow

If you’re already experienced and looking for more advanced techniques, check out this deep-dive course – Modern Computer Vision GPT, PyTorch, Keras, OpenCV4

🐾 Preparing and Splitting the Dataset

The Animals10 dataset is structured with each class (e.g., “cat”, “dog”) in its own folder.
For training, we divide these images into two sets: train (90%) and validation (10%).
The model learns from the training set and is evaluated on validation images it has never seen, which lets you detect overfitting instead of mistaking memorization for accuracy.

Here’s the complete data-splitting script:

### Import essential libraries
import os
import random
import cv2

### Define a function to split the dataset into train and validation folders
def split_images(input_folder, output_folder, split_ratio=0.9):
    # Create the target directories
    train_folder = os.path.join(output_folder, 'train')
    validate_folder = os.path.join(output_folder, 'val')
    num = 0  # running counter that gives every output file a unique name

    os.makedirs(train_folder, exist_ok=True)
    os.makedirs(validate_folder, exist_ok=True)

    # Identify each class folder
    subfolders = [f.name for f in os.scandir(input_folder) if f.is_dir()]

    for subfolder in subfolders:
        subfolder_path = os.path.join(input_folder, subfolder)
        train_subfolder_path = os.path.join(train_folder, subfolder)
        validate_subfolder_path = os.path.join(validate_folder, subfolder)

        os.makedirs(train_subfolder_path, exist_ok=True)
        os.makedirs(validate_subfolder_path, exist_ok=True)

        # Retrieve all image filenames
        images = [f.name for f in os.scandir(subfolder_path) if f.is_file()]
        num_images = len(images)
        num_validate = int(num_images * (1 - split_ratio))
        # A set makes the membership test below fast
        validate_images = set(random.sample(images, num_validate))

        for image in images:
            source_path = os.path.join(subfolder_path, image)
            img = cv2.imread(source_path)
            name = str(num) + ".png"
            num += 1
            if img is None:
                # Skip files that OpenCV cannot decode
                print("Invalid image or file not found:", source_path)
                continue
            if image in validate_images:
                destination_path = os.path.join(validate_subfolder_path, name)
            else:
                destination_path = os.path.join(train_subfolder_path, name)
            cv2.imwrite(destination_path, img)
            print(destination_path)

### Run the function
input_folder = "E:/Data-sets/Animals-10/raw-img"
output_folder = "C:/Data-sets/Animals-10/raw-img"
split_images(input_folder, output_folder)

Explanation

This function ensures:

Automatic dataset structure creation: It generates /train and /val subdirectories.
Randomized splitting: The random.sample() call picks a random 10% of each class for validation. No seed is set, so the split differs on every run.
Data integrity check: Files that OpenCV cannot read are skipped.
Renaming convention: Each image is renamed with a running number and saved as PNG, simplifying indexing later.
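If you want the same split every time, for example to compare two training runs fairly, the random selection can be seeded. The sketch below works on a plain list of filenames and is not part of the script above; split_filenames and its seed parameter are hypothetical names. It also uses round() rather than int(): in floating point, 1 - 0.9 is slightly less than 0.1, so int() truncates a class of 10 images to zero validation images.

```python
import random


def split_filenames(filenames, split_ratio=0.9, seed=42):
    """Return (train, val) lists; the same seed always gives the same split."""
    names = sorted(filenames)              # fixed starting order
    num_val = round(len(names) * (1 - split_ratio))
    rng = random.Random(seed)              # local RNG, leaves global random state alone
    val = set(rng.sample(names, num_val))
    train = [n for n in names if n not in val]
    return train, sorted(val)


# Ten made-up filenames to illustrate the split sizes.
files = [f"img_{i}.jpg" for i in range(10)]
train, val = split_filenames(files)
print(len(train), len(val))  # 9 1
```

Sorting first matters: os.scandir does not guarantee an order, so seeding alone would not make the split repeatable across machines.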

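Because the folder hierarchy is clear, YOLOv5 reads the class names directly from the subfolder names during training.
If you add more categories in the future, place them into the source dataset folder, delete the old train and val folders, and rerun the script; otherwise files from the previous run remain, and a new random split can leave the same photo in both sets.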
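Before training, it is worth checking that every class ended up in both train and val with a sensible number of files. Here is a minimal sketch that counts files per class; count_images is an illustrative helper, and the root path is the output folder used in this tutorial.

```python
import os


def count_images(dataset_root):
    """Return {split: {class_name: file_count}} for the train and val folders."""
    counts = {}
    for split in ("train", "val"):
        split_dir = os.path.join(dataset_root, split)
        if not os.path.isdir(split_dir):
            continue  # split folder missing, nothing to count
        counts[split] = {}
        for entry in sorted(os.scandir(split_dir), key=lambda e: e.name):
            if entry.is_dir():
                n = sum(1 for f in os.scandir(entry.path) if f.is_file())
                counts[split][entry.name] = n
    return counts


if __name__ == "__main__":
    # Output folder of the split script; adjust to your own location.
    root = "C:/Data-sets/Animals-10/raw-img"
    for split, classes in count_images(root).items():
        for class_name, n in classes.items():
            print(f"{split:<6} {class_name:<15} {n}")
```

A class that appears in train but not in val, or with a count of zero, usually points to a wrong input path or unreadable source files.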

🧠 Training, Evaluating, and Predicting with YOLOv5

Training is where your model learns to distinguish visual patterns that define each class.
YOLOv5 uses a modern convolutional backbone optimized for speed, with classification heads replacing detection layers.
You can train on multiple GPU devices or limit it to one, depending on your setup.

Training the Model

### Clean GPU memory by restarting the system
### Then open Anaconda Prompt as Administrator

cd yolov5
git pull origin master

### Train a YOLOv5 small classification model (s = small)
python classify/train.py --model yolov5s-cls.pt --data C:/Temp/Data-s

Explanation

--model yolov5s-cls.pt loads the small pre-trained classification model.
--img 224 sets image size, matching typical CNN inputs.
--epochs 500 gives enough training time for convergence.
--cache stores preprocessed data in memory, boosting speed.
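The flags listed above can also be assembled from Python, which is handy if you want to launch several runs from a script. This is only a sketch: build_train_command and run_training are hypothetical helpers, and "path/to/dataset" is a placeholder for the folder that contains your train and val directories, not the path used in this post.

```python
import subprocess
import sys


def build_train_command(data_dir, model="yolov5s-cls.pt", img=224, epochs=500, cache=True):
    """Build the argument list for YOLOv5's classify/train.py."""
    cmd = [
        sys.executable, "classify/train.py",
        "--model", model,
        "--data", str(data_dir),
        "--img", str(img),
        "--epochs", str(epochs),
    ]
    if cache:
        cmd.append("--cache")  # keep preprocessed images in memory
    return cmd


def run_training(cmd, repo_dir="yolov5"):
    """Launch training from inside the cloned YOLOv5 repository (not called here)."""
    return subprocess.run(cmd, cwd=repo_dir, check=True)


# Placeholder dataset path; replace with your own folder.
cmd = build_train_command("path/to/dataset")
print(" ".join(cmd))
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces.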

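After training, the best weights are stored in the run folder. YOLOv5 numbers runs sequentially (exp, exp2, exp3, and so on), so the exact folder depends on how many runs you have made. In this example it is: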

runs/train-cls/exp5/weights/best.pt

This file contains your optimized model ready for inference.
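Because the run folder number changes with every training run, hard-coding it is easy to get wrong. One way to avoid that, sketched here under the assumption that the default runs/train-cls output location is used, is to pick the most recently modified best.pt. The helper name latest_best_weights is made up for this example.

```python
from pathlib import Path


def latest_best_weights(runs_dir="runs/train-cls"):
    """Return the most recently modified best.pt under runs_dir, or None if none exist."""
    candidates = list(Path(runs_dir).glob("exp*/weights/best.pt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Run from inside the yolov5 folder so the relative path resolves.
    print(latest_best_weights())
```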

To compare YOLOv5’s results with other CNNs, explore my guide on VGG19 Transfer Learning for Image Classification.

Testing the Model


python classify/predict.py --weights runs/train-cls/exp5/weights/best.pt --img 224 --source images --name exp_images

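This command runs your trained model on the test images located inside the images directory. These images carry no labels, so the result is a set of predictions rather than accuracy metrics; for metrics on the validation set, YOLOv5 provides classify/val.py.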
The predictions will be saved automatically in a new folder under runs/predict-cls/exp_images.

Predicting a Single Image in Python

Test images: an elephant and a butterfly.

### Import libraries
import os

import cv2
import torch
from PIL import Image
from torchvision import transforms

### Load the YOLOv5 trained model
weights_path = "C:/tutorials/yolov5/runs/train-cls/exp3/weights/best.pt"
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = torch.hub.load('ultralytics/yolov5', 'custom', path=weights_path, force_reload=True)
model.eval()

### Define preprocessing transformations
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    # ImageNet mean/std, matching the normalization in YOLOv5's classification transforms
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

### Load an image and convert it to tensor
imagePath = "Best-image-classification-models/YoloV5/Images/El1.jpg"
img = Image.open(imagePath).convert('RGB')
img = transform(img).unsqueeze(0).to(device)  # add a batch dimension: (1, 3, 224, 224)

### Run inference
with torch.no_grad():
    outputs = model(img)
predicted_class = torch.argmax(outputs, dim=1).item()

### Map prediction to class name
# Sort the folder names: training assigns class indices in alphabetical order
categories = sorted(os.listdir("C:/Data-sets/Animals-10/raw-img/train"))
text = "Predicted class : " + categories[predicted_class]

### Display the prediction on the image
imgDisplay = cv2.imread(imagePath)
scale_percent = 35  # shrink the image to 35% of its original size for display
width = int(imgDisplay.shape[1] * scale_percent / 100)
height = int(imgDisplay.shape[0] * scale_percent / 100)
dim = (width, height)
resized = cv2.resize(imgDisplay, dim, interpolation=cv2.INTER_AREA)
cv2.putText(resized, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow("test image", resized)
cv2.waitKey(0)
cv2.destroyAllWindows()

Explanation

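Here, we load the trained weights manually and preprocess the image to the 224×224 input size used in training.
The model returns one score per class; torch.argmax picks the index of the highest score, which we map to a class folder name. The folder names must be in sorted order to match the indices assigned during training.
Finally, the prediction is overlaid on the image using OpenCV, a great way to visualize results interactively.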
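The raw scores can also be turned into probabilities, which shows how confident the model is and which classes came second and third. The sketch below implements softmax in plain Python so it runs without a trained model; top_k is a hypothetical helper, and the scores and class names are made up. With the tutorial's tensor, you could pass outputs[0].tolist() as the scores, or use torch.softmax directly.

```python
import math


def top_k(logits, class_names, k=3):
    """Convert raw scores to probabilities and return the k best (name, probability) pairs."""
    m = max(logits)                                  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up scores and class names, sorted the same way the training classes are.
names = sorted(["dog", "cat", "elephant", "butterfly"])
scores = [0.2, 1.1, 0.3, 3.0]
for name, prob in top_k(scores, names):
    print(f"{name}: {prob:.3f}")
```

A top probability that is only slightly above the runner-up is a useful hint that the image is ambiguous or unlike the training data.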

For extending this model into segmentation, check out Automatic Mask Generation with YOLOv8 and SAM2.

Summary

This final section completes the YOLOv5 image classification loop — from dataset to real prediction.
The model can now identify animals or any other visual categories you train it on.
You can further improve accuracy using Albumentations, transfer learning, or larger model variants like yolov5m-cls.pt or yolov5x-cls.pt.

📘 FAQ

What is YOLOv5-cls?

YOLOv5-cls is the classification version of YOLOv5 used for labeling entire images instead of detecting objects.

Can I train YOLOv5 for custom datasets?

Yes, YOLOv5 supports custom datasets for classification and detection tasks.

Do I need GPU for training?

While you can train on CPU, using a CUDA-enabled GPU dramatically reduces training time.

How are images split into training and validation sets?

A custom Python function randomly assigns 90% of images to training and 10% to validation.

What format should the dataset follow?

Each class must have its own subfolder containing relevant images in both train and val directories.

What does the model output after prediction?

The model outputs one score per class for each input image; the highest score gives the predicted class index, and applying a softmax converts the scores to probabilities.

Where are the trained weights stored?

Weights are saved under the folder runs/train-cls/exp*/weights/best.pt.

Can YOLOv5 classify videos?

Yes, by processing each frame as an image, YOLOv5 can classify video content frame-by-frame.

Is data augmentation applied?

Yes, the Albumentations library provides image transformations that enhance model generalization.

What’s next after training?

You can export the trained model for deployment or integrate it into applications for automated image categorization.

🧩 Conclusion

YOLOv5 proves that you don’t need a separate network for every vision task — the same architecture that powers world-class object detectors can also classify images efficiently.
Through this guide, you’ve learned how to set up the environment, prepare your dataset, train a YOLOv5 classifier, and predict new images in Python.

The real advantage lies in YOLOv5’s modularity — you can move from classification to detection or segmentation with only minor code changes.
Its flexibility and performance make it ideal for students, researchers, and developers alike.

Once you master these basics, try fine-tuning the network with custom augmentations, experimenting with larger YOLOv5 models, or even building a simple Streamlit web app to deploy your classifier interactively.

Deep learning becomes far more approachable when frameworks like YOLOv5 do the heavy lifting — leaving you to focus on creativity and exploration.

Connect

☕ Buy me a coffee — https://ko-fi.com/eranfeit
🖥️ Email: feitgemel@gmail.com
🌐 https://eranfeit.net
🤝 Fiverr: https://www.fiverr.com/s/mB3Pbb

Enjoy,
Eran
