Introduction
Image classification is one of the most exciting applications of computer vision. It powers technologies in sports analytics, autonomous driving, healthcare diagnostics, and more.
In this project, we take you through a complete, end-to-end workflow for classifying Olympic sports images, from raw data to real-time predictions, using EfficientNetV2S, a convolutional neural network pre-trained on ImageNet.
Our journey is divided into three clear steps:
- Dataset Preparation – Organizing and splitting images into training and testing sets.
- Model Training – Fine-tuning EfficientNetV2S on the Olympics dataset.
- Model Inference – Running real-time predictions on new images.
Starting with the Olympic Games Event Dataset, we:
- Split images into training and test sets (the test set is also used for validation during training).
- Fine-tune EfficientNetV2S with a custom classification head.
- Use data augmentation and training callbacks for best results.
- Evaluate model accuracy and loss.
- Predict and label new images with OpenCV.
The code is easy to adapt for any custom dataset, making it a practical template for real-world computer vision projects.
Watch the full tutorial here : https://youtu.be/wQgGIsmGpwo
You can download the dataset here : https://www.kaggle.com/datasets/yousefidris/ogedolympic-games-event-dateset
You can download the full code here : https://ko-fi.com/s/a24e1b1bd1
You can find more tutorials, and join my newsletter here : https://eranfeit.net/
Step 1 – Preparing and Splitting the Dataset
Before training a model, we need to organize the dataset. The Olympics dataset we use contains folders, each representing a different sport (e.g., swimming, gymnastics, athletics).
To evaluate the model fairly, we split each class into two sets:
- Training set (90%) – Used for teaching the model.
- Testing set (10%) – Used for measuring performance on unseen data.
The split keeps the original folder structure, so each class folder name still serves as the label. Files are shuffled and split per class, so every sport keeps roughly the same 90/10 ratio in both sets.
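The per-class split arithmetic can be sketched in isolation as a small function. This is a minimal sketch, not the tutorial's script: the function name `split_files`, the `seed` parameter (added here so the split is repeatable), and the example file names are all assumptions made for illustration.

```python
import random

def split_files(files, train_percentage=90, seed=None):
    """Shuffle a list of file paths and split it into (train, test) lists."""
    rng = random.Random(seed)      # a fixed seed makes the split repeatable
    shuffled = list(files)         # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    num_train = int(len(shuffled) * train_percentage / 100)
    return shuffled[:num_train], shuffled[num_train:]

# Hypothetical class folder with 25 images
example_files = [f"swimming/img_{i:03d}.jpg" for i in range(25)]
train_files, test_files = split_files(example_files, train_percentage=90, seed=42)
print(len(train_files), len(test_files))  # 22 3
```

Because the train count is rounded down with `int()`, a class with 25 images ends up with 22 training files and 3 test files, so small classes lean slightly toward the test set.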
# ================================
# Code 1: Dataset Splitter (Olympics dataset example)
# ================================

# --- 1.1 Imports & high-level settings ---
# 1.1.1 Import standard libraries for filesystem work, copying files, and randomness
import os
import shutil
import random

# 1.1.2 Intended split ratio:
# 90% of the dataset -> train folder
# 10% of the dataset -> test folder

# --- 1.2 Configure dataset paths ---
# 1.2.1 Path to the main dataset folder (source)
main_folder = 'e:/data-sets/olympics'

# 1.2.2 Paths to train and test destination folders
train_folder = 'e:/data-sets/olympics-train'
test_folder = 'e:/data-sets/olympics-test'

# --- 1.3 Ensure destination folders exist ---
# 1.3.1 Create train folder if missing
os.makedirs(train_folder, exist_ok=True)

# 1.3.2 Create test folder if missing
os.makedirs(test_folder, exist_ok=True)

# --- 1.4 Discover class subfolders in the source ---
# 1.4.1 Initialize list to hold class folder paths
subfolders = []

# 1.4.2 Loop through items in main dataset folder
for f in os.scandir(main_folder):
    # 1.4.3 If it's a folder (class/category), add to list
    if f.is_dir():
        subfolders.append(f.path)

# --- 1.5 Define split percentage ---
# 1.5.1 Percentage of files for training set
train_percentage = 90

# --- 1.6 Per-class processing: create dirs, list/shuffle files, compute split ---
# 1.6.1 Loop over each class/category subfolder
for subfolder in subfolders:
    # 1.6.2 Extract class folder name
    subfolder_name = os.path.basename(subfolder)

    # 1.6.3 Paths for train/test subfolders for this class
    train_subfolder = os.path.join(train_folder, subfolder_name)
    test_subfolder = os.path.join(test_folder, subfolder_name)

    # 1.6.4 Create class subfolder in train/test directories
    os.makedirs(train_subfolder, exist_ok=True)
    os.makedirs(test_subfolder, exist_ok=True)

    # 1.6.5 List all files in this class folder
    files = [f.path for f in os.scandir(subfolder) if f.is_file()]

    # 1.6.6 Randomize file order
    random.shuffle(files)

    # 1.6.7 Calculate number of train files
    num_train_files = int(len(files) * (train_percentage / 100))

    # 1.6.8 Copy train files
    for file in files[:num_train_files]:
        shutil.copy(file, os.path.join(train_subfolder, os.path.basename(file)))

    # 1.6.9 Copy test files
    for file in files[num_train_files:]:
        shutil.copy(file, os.path.join(test_subfolder, os.path.basename(file)))

# --- 1.7 Final status message ---
# 1.7.1 Print confirmation
print("Finished copying the files to the train and test subfolders")
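After running the splitter, it can be worth checking that every class really ended up with roughly a 90/10 ratio. The sketch below is one way to do that; the helper names `count_images_per_class` and `report_split` are hypothetical, and the paths are the ones used in the splitter script, so adjust them to your machine.

```python
import os

def count_images_per_class(root_folder):
    """Return {class_name: number_of_files} for each class subfolder of root_folder."""
    counts = {}
    for entry in sorted(os.scandir(root_folder), key=lambda e: e.name):
        if entry.is_dir():
            counts[entry.name] = sum(1 for f in os.scandir(entry.path) if f.is_file())
    return counts

def report_split(train_folder, test_folder):
    """Build one row per class: (name, train count, test count, train share in %)."""
    train_counts = count_images_per_class(train_folder)
    test_counts = count_images_per_class(test_folder)
    rows = []
    for name in sorted(set(train_counts) | set(test_counts)):
        n_train = train_counts.get(name, 0)
        n_test = test_counts.get(name, 0)
        total = n_train + n_test
        share = 100.0 * n_train / total if total else 0.0
        rows.append((name, n_train, n_test, round(share, 1)))
    return rows

if __name__ == "__main__":
    # Paths taken from the splitter script; the report is skipped if they do not exist
    train_dir = 'e:/data-sets/olympics-train'
    test_dir = 'e:/data-sets/olympics-test'
    if os.path.isdir(train_dir) and os.path.isdir(test_dir):
        for row in report_split(train_dir, test_dir):
            print(row)
```

A class whose train share is far from 90% usually means the splitter was run more than once into the same destination folders, since each run copies a new random selection on top of the previous one.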
Step 2 – Training with EfficientNetV2S
With the dataset ready, it’s time to train the model. Instead of starting from scratch, we use transfer learning with EfficientNetV2S, loading weights that were pre-trained on ImageNet.
Here’s how we adapt it to our project:
- Keep the feature extraction layers from EfficientNetV2S.
- Add custom layers to classify our specific sports categories.
- Fine-tune the model so it learns both general image features and sport-specific patterns.
To improve generalization, we apply data augmentation during training (random horizontal flips, zoom, and shearing) so the model becomes more robust to differences in framing, camera angle, and athlete position.
We train for up to 300 epochs, using the ModelCheckpoint callback to save the best version of the model, ReduceLROnPlateau to lower the learning rate when validation accuracy stalls, and EarlyStopping to halt training once it stops improving.
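To make the augmentation settings more concrete, here is a rough pure-Python sketch of what "random" means for each one. It is an illustration under assumptions, not how Keras implements it: the function names are hypothetical, a single zoom factor is drawn (Keras may draw one per axis), and the ranges follow our reading of the Keras documentation, which describes shear_range as an angle in degrees and a float zoom_range as the interval [1 - zoom_range, 1 + zoom_range].

```python
import random

def sample_augmentation(shear_range=0.2, zoom_range=0.2, horizontal_flip=True, rng=random):
    """Draw one set of random transform parameters for a single training image."""
    return {
        # shear angle in degrees, drawn from [-shear_range, +shear_range]
        "shear_degrees": rng.uniform(-shear_range, shear_range),
        # zoom factor drawn from [1 - zoom_range, 1 + zoom_range]
        "zoom_factor": rng.uniform(1 - zoom_range, 1 + zoom_range),
        # when flipping is enabled, each image is mirrored with probability 0.5
        "flip": horizontal_flip and rng.random() < 0.5,
    }

def flip_horizontal(image_rows):
    """Mirror an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image_rows]

# Draw parameters for three hypothetical training images
rng = random.Random(0)
for _ in range(3):
    print(sample_augmentation(rng=rng))

# A tiny 2x3 "image" to show what a horizontal flip does
tiny = [[1, 2, 3],
        [4, 5, 6]]
print(flip_horizontal(tiny))  # [[3, 2, 1], [6, 5, 4]]
```

Each training image gets a fresh draw every epoch, so the model rarely sees exactly the same pixels twice.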
# ================================
# Code 2: Training EfficientNetV2S on Olympics Dataset
# ================================

# --- 2.1 Imports & configuration ---
# 2.1.1 Import TensorFlow and Keras layers API
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# 2.1.2 Import supporting libraries
import numpy as np
import os
import matplotlib.pyplot as plt

# 2.1.3 Define target image size (all images resized to 224x224)
IMAGE_SIZE = 224

# --- 2.2 Load class names & count ---
# 2.2.1 Path to dataset folder containing class subfolders
imageFolder = 'e:/data-sets/olympics'

# 2.2.2 Get a sorted list of class names (folder names only);
# sorting matches the alphanumeric label order used by flow_from_directory
CLASSES = sorted(
    d for d in os.listdir(imageFolder)
    if os.path.isdir(os.path.join(imageFolder, d))
)

# 2.2.3 Count number of classes
num_classes = len(CLASSES)

# 2.2.4 Print class names
print(CLASSES)

# 2.2.5 Print number of classes
print(num_classes)

# --- 2.3 Load pre-trained base model (EfficientNetV2S) ---
# 2.3.1 Load EfficientNetV2S without the top classification layer
# weights='imagenet' uses ImageNet pretrained weights
# input_shape matches resized images
# include_top=False removes the 1000-class output layer
base_model = tf.keras.applications.EfficientNetV2S(
    weights='imagenet',
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    include_top=False
)

# 2.3.2 Allow all layers to be trainable (fine-tuning)
base_model.trainable = True

# --- 2.4 Build the full model ---
# 2.4.1 Create a sequential model combining the base and new layers
model = tf.keras.Sequential([
    base_model,                                      # Pre-trained feature extractor
    layers.GlobalAveragePooling2D(),                 # Pool features to vector
    layers.Dense(1024, activation='relu'),           # Dense layer for learning patterns
    layers.Dropout(0.5),                             # Dropout for regularization
    layers.Dense(num_classes, activation='softmax')  # Output layer for our classes
])

# --- 2.5 Compile the model ---
# 2.5.1 Import Adam optimizer
from tensorflow.keras.optimizers import Adam

# 2.5.2 Create Adam optimizer with a small learning rate for transfer learning
adam_opt = Adam(learning_rate=0.0001)

# 2.5.3 Compile model with categorical crossentropy (multi-class classification)
model.compile(
    optimizer=adam_opt,
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# --- 2.6 Data generators (augmentation for training) ---
# 2.6.1 Create training data generator with augmentation
# Note: Keras EfficientNetV2 models rescale inputs internally by default
# (include_preprocessing=True) and expect pixels in [0, 255]. The 1/255
# rescale is kept from the original tutorial; whatever is used here must
# also be used at inference time.
train_datagen = ImageDataGenerator(
    rescale=1./255,        # Normalize pixel values
    shear_range=0.2,       # Random shearing
    zoom_range=0.2,        # Random zoom
    horizontal_flip=True   # Flip images horizontally
)

# 2.6.2 Create testing/validation data generator (no augmentation, only rescale)
test_datagen = ImageDataGenerator(rescale=1./255)

# 2.6.3 Paths to split datasets
train_folder = 'e:/data-sets/olympics-train'
test_folder = 'e:/data-sets/olympics-test'

# 2.6.4 Training set generator
train_generator = train_datagen.flow_from_directory(
    train_folder,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=8,
    class_mode='categorical',
    color_mode='rgb',
    shuffle=True
)

# 2.6.5 Validation/test set generator
test_generator = test_datagen.flow_from_directory(
    test_folder,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=8,
    class_mode='categorical',
    color_mode='rgb'
)

# --- 2.7 Training parameters & callbacks ---
# 2.7.1 Maximum number of epochs for training
EPOCHS = 300

# 2.7.2 Filepath to save the best model
best_model_file = 'e:/temp/olympics-EfficientNetV2.h5'

# 2.7.3 Import callbacks (from tensorflow.keras, consistent with the other imports)
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

# 2.7.4 Define callbacks
callbacks = [
    # Save the best model based on validation accuracy
    ModelCheckpoint(best_model_file, verbose=1, save_best_only=True, monitor="val_accuracy"),
    # Reduce learning rate if validation accuracy plateaus
    # (with the same patience as EarlyStopping below, the reduced rate gets little
    # or no chance to take effect; consider a smaller patience here)
    ReduceLROnPlateau(monitor="val_accuracy", patience=25, factor=0.1, verbose=1, min_lr=1e-6),
    # Stop training early if no improvement
    EarlyStopping(monitor="val_accuracy", patience=25, verbose=1)
]

# --- 2.8 Train the model ---
# 2.8.1 Fit model on training data with validation
result = model.fit(
    train_generator,
    epochs=EPOCHS,
    validation_data=test_generator,
    callbacks=callbacks
)

# --- 2.9 Find best validation accuracy ---
# 2.9.1 Epoch index of best validation accuracy
best_val_acc_epoch = np.argmax(result.history['val_accuracy'])

# 2.9.2 Best validation accuracy value
best_val_acc = result.history['val_accuracy'][best_val_acc_epoch]

# 2.9.3 Print best validation accuracy
print("Best validation accuracy : " + str(best_val_acc))

# --- 2.10 Plot accuracy curves ---
# 2.10.1 Plot training vs validation accuracy
plt.plot(result.history['accuracy'], label='train acc')
plt.plot(result.history['val_accuracy'], label='val acc')
plt.legend()
plt.show()

# --- 2.11 Plot loss curves ---
# 2.11.1 Plot training vs validation loss
plt.plot(result.history['loss'], label='train loss')
plt.plot(result.history['val_loss'], label='val loss')
plt.legend()
plt.show()
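To see how a patience setting interacts with saving the best model, the logic can be simulated on a plain list of validation accuracies. This is a simplified sketch, not the Keras implementation: it ignores options such as min_delta, the function name `simulate_early_stopping` is hypothetical, and the accuracy values are made up for illustration.

```python
def simulate_early_stopping(val_accuracy, patience):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation accuracies.

    Epochs are 0-based. stop_epoch is None if training runs through every epoch.
    """
    best_value = float("-inf")
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(val_accuracy):
        if value > best_value:
            # New best: this is where a "save best only" checkpoint would be written
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs in a row: stop training
                return best_epoch, epoch
    return best_epoch, None

# Hypothetical validation accuracies for eight epochs
history = [0.60, 0.72, 0.80, 0.79, 0.81, 0.80, 0.81, 0.80]
print(simulate_early_stopping(history, patience=3))  # (4, 7)
```

In this example the best model comes from epoch 4, while training stops at epoch 7 after three epochs without a new best. That gap is why the saved checkpoint, not the final weights, is the model to load for inference.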
Step 3 – Making Real-Time Predictions
The final step is to use the trained model to classify new, unseen images.
Our test image: waterpolo-test-image.png
The prediction process includes:
- Preprocessing – Resize the image, normalize pixel values, and prepare it in the same way as during training.
- Prediction – The model outputs probabilities for each class.
- Result Extraction – The class with the highest probability becomes the prediction.
- Visualization – The predicted sport is overlaid on the image and displayed or saved.
The result is a short script that labels any new image with the sport the model considers most likely.
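import os

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.utils import load_img, img_to_array

IMAGE_SIZE = 224

# Get the class names; sorted so the order matches the label indices
# assigned by flow_from_directory during training
imageFolder = 'e:/data-sets/olympics'
CLASSES = sorted(
    d for d in os.listdir(imageFolder)
    if os.path.isdir(os.path.join(imageFolder, d))
)
num_classes = len(CLASSES)

# Load the best model saved during training
best_model_file = 'e:/temp/olympics-EfficientNetV2.h5'
model = tf.keras.models.load_model(best_model_file)

def prepareImage(pathForImage):
    # Load and resize the image, add a batch dimension,
    # and scale pixels to [0, 1] as was done during training
    image = load_img(pathForImage, target_size=(IMAGE_SIZE, IMAGE_SIZE))
    img_result = img_to_array(image)
    img_result = np.expand_dims(img_result, axis=0)
    img_result = img_result / 255.
    return img_result

testImagePath = "Best-image-classification-models/EfficientNetv2-Build-Model-Custom-Dataset/waterpolo-test-image.png"

# Read the original image with OpenCV for display
img = cv2.imread(testImagePath)
if img is None:
    raise FileNotFoundError("Could not read image: " + testImagePath)

# Run the prediction and take the class with the highest probability
ImgForModel = prepareImage(testImagePath)
resultArray = model.predict(ImgForModel, verbose=1)
answer = np.argmax(resultArray, axis=1)
print(answer)

index = answer[0]
desc = CLASSES[index]
print("The predicted class is : " + desc)

# Resize the display image to 60% of its original size
scale_percent = 60
width = int(img.shape[1] * scale_percent / 100)
height = int(img.shape[0] * scale_percent / 100)
dim = (width, height)
resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)

# Draw the predicted label (OpenCV colors are BGR, so (255,0,0) is blue)
cv2.putText(resized, desc, (10, 100), cv2.FONT_HERSHEY_SIMPLEX, 1.6, (255, 0, 0), 3, cv2.LINE_AA)

# Save and show the result
cv2.imwrite("e:/temp/waterpolo.png", resized)
cv2.imshow('img', resized)
cv2.waitKey(0)
cv2.destroyAllWindows()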
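One detail worth isolating is how a predicted index is turned back into a class name. The index only points at the right name if the class list is in the same order the training generator used, which, as far as we know from the Keras documentation, is the alphanumeric order of the folder names. The sketch below shows that mapping plus a top-k view of the probabilities; the helper names, folder names, and probability values are all hypothetical.

```python
def build_class_list(folder_names):
    """Sort class folder names so list positions line up with the label indices."""
    return sorted(folder_names)

def top_k(probabilities, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs, best first."""
    if len(probabilities) != len(class_names):
        raise ValueError("model output size does not match the number of classes")
    ranked = sorted(zip(class_names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical folder names, in the arbitrary order a directory listing might return
folders = ["waterpolo", "archery", "swimming", "gymnastics"]
classes = build_class_list(folders)
print(classes)  # ['archery', 'gymnastics', 'swimming', 'waterpolo']

# Hypothetical softmax output for one image (one value per class, summing to 1)
probs = [0.05, 0.10, 0.15, 0.70]
print(top_k(probs, classes, k=2))  # [('waterpolo', 0.7), ('swimming', 0.15)]
```

Showing the top two or three classes with their probabilities is often more informative than a single label, especially for visually similar sports.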
Connect
☕ Buy me a coffee — https://ko-fi.com/eranfeit
🖥️ Email : feitgemel@gmail.com
🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb
Enjoy,
Eran