...

How to Build a DenseNet201 Model for Sports Image Classification

Image Classification Using DenseNet Model

Last Updated on 07/11/2025 by Eran Feit

Introduction to Building an Olympic Sports Classifier

In this tutorial, we’ll walk through building a sports image classification model that can automatically recognize Olympic sports from images.

The goal is to create a pipeline that:

  • Prepares and organizes image datasets of different Olympic sports.
  • Trains a DenseNet201 deep learning model to classify sports categories.
  • Tests the trained model on unseen images to verify predictions.

By the end of this guide, you’ll have a complete working solution for image classification using Python, TensorFlow, and OpenCV.

This step-by-step breakdown is meant to be approachable for both beginners and experienced developers. The post shows how to apply a DenseNet201 transfer learning model to sports categories, which makes it a good project for deep learning enthusiasts.

You can watch the tutorial here : https://www.youtube.com/watch?v=TJ3i5r1pq98

Here is a link for Medium post : https://medium.com/pythoneers/how-to-build-a-densenet201-model-for-sports-image-classification-03e74e14bf1a

You can follow my blog here : https://eranfeit.net/blog/


DenseNet Architecture Introduction

In a standard convolutional neural network, an input image is passed through the network layer by layer to produce a predicted label. The forward pass is straightforward, as shown in the image below:

[Figure: forward pass through a standard CNN]

Each convolutional layer except the first one (which takes in the input image), takes in the output of the previous convolutional layer and produces an output feature map that is then passed to next convolutional layer. For L layers, there are L direct connections – one between each layer and its subsequent layer.

The DenseNet architecture is all about modifying this standard CNN architecture like so:

[Figure: dense connectivity in a DenseNet]

In a DenseNet architecture, each layer is connected to every other layer, hence the name Densely Connected Convolutional Network. For L layers, there are L(L+1)/2 direct connections. Each layer takes the feature maps of all preceding layers as input, and its own feature maps are passed as input to all subsequent layers.
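
The two connection counts can be checked with a few lines of Python. This is a minimal sketch; the function names are made up for illustration and are not part of any library.

```python
def standard_connections(num_layers: int) -> int:
    """A plain feed-forward CNN has one direct connection per layer."""
    return num_layers


def dense_connections(num_layers: int) -> int:
    """With dense connectivity, layer i receives i inputs (the block input
    plus the i - 1 earlier layers), so the total is 1 + 2 + ... + L."""
    return num_layers * (num_layers + 1) // 2


# Compare both counts for a few block depths
for L in (4, 6, 12):
    print(L, standard_connections(L), dense_connections(L))
```

The gap grows quadratically with depth, which is why dense blocks are kept relatively short.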

That is really all there is to it: simple as it may sound, DenseNets essentially connect every layer to every other layer. This one idea is extremely powerful. The input to a layer inside a DenseNet is the concatenation of the feature maps from all previous layers.
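
To make the concatenation idea concrete, here is a toy dense block in plain NumPy. It is only a sketch under loud assumptions: a random 1x1 projection followed by ReLU stands in for the real convolutional layers, and the function name and sizes are invented for the demo.

```python
import numpy as np


def toy_dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block: every layer sees the concatenation of all earlier
    feature maps and contributes `growth_rate` new channels.
    A random 1x1 projection + ReLU stands in for a real conv layer."""
    features = [x]
    for _ in range(num_layers):
        # Concatenate everything produced so far along the channel axis
        layer_input = np.concatenate(features, axis=-1)
        weights = rng.standard_normal((layer_input.shape[-1], growth_rate))
        new_maps = np.maximum(layer_input @ weights, 0.0)  # ReLU
        features.append(new_maps)
    return np.concatenate(features, axis=-1)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))  # (H, W, C) feature map
out = toy_dense_block(x, num_layers=4, growth_rate=32, rng=rng)
print(out.shape)  # (8, 8, 144): 16 input channels + 4 layers * 32
```

Each layer only adds a small number of channels, yet later layers can still reuse everything computed before them.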

From the paper: "DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters."

Want a simpler baseline first? Here’s a MobileNet tutorial that shows the full classification flow with lightweight compute.
(Link to: MobileNet Image Classification With Keras In Python) Transfer learning with Mobilenet


Here is the code for building an Olympic sports classifier:

Getting Started with Imports and Dataset Setup

This first part of the code is about setting up the working environment. We import the necessary libraries, define the input image shape, and load the dataset categories.

Description :

  • NumPy is essential for handling images as arrays and later converting them to tensors for training.
  • OpenCV (cv2) allows us to read, resize, and display images.
  • os ensures we can navigate directories and fetch categories (sports types).
  • Standardizing image size (224x224) is crucial because deep learning models require fixed input dimensions.
  • Printing categories helps us verify that the dataset is loaded correctly.

### Importing NumPy for numerical operations
import numpy as np

### Importing OpenCV for image processing
import cv2

### Importing OS for file and directory handling
import os

### Defining the shape of all input images
input_shape = (224, 224)

### Path to the dataset folder
path = "E:/Data-sets/olympics"

### Listing all categories (sports classes) inside the dataset folder
categories = os.listdir(path)

### Sorting the categories alphabetically for consistency
categories.sort()

### Printing the list of categories
print(categories)

### Printing the total number of categories
print(len(categories))

### Separator line for readability
print("==================================================")

Summary: In this section, we loaded and verified the dataset structure while ensuring all images will share the same dimensions before training.
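
One thing to watch: os.listdir returns every entry in the folder, including loose files. If your dataset folder might contain anything besides class folders, a small filter helps. The sketch below is an optional variation, not part of the original script; list_class_folders is a hypothetical helper name and the demo uses a temporary folder.

```python
import os
import tempfile


def list_class_folders(dataset_path):
    """Return sorted sub-folder names, ignoring loose files."""
    return sorted(
        name for name in os.listdir(dataset_path)
        if os.path.isdir(os.path.join(dataset_path, name))
    )


# Demo on a throwaway folder with three class folders and one stray file
with tempfile.TemporaryDirectory() as root:
    for sport in ("swimming", "archery", "judo"):
        os.mkdir(os.path.join(root, sport))
    open(os.path.join(root, "readme.txt"), "w").close()
    demo_categories = list_class_folders(root)

print(demo_categories)  # ['archery', 'judo', 'swimming']
```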

Defining the Data Preparation Function

This part defines the main function that processes every image in the dataset. It resizes, validates, and stores both images and labels.

Description:

  • The prepareData function ensures uniformity across all images.
  • Invalid images are skipped, preventing crashes during training.
  • Images are resized using cv2.INTER_AREA interpolation, which is effective for shrinking images.
  • Labels are linked with their categories, ensuring correct supervised learning.
  • Converting lists to NumPy arrays makes later processing with TensorFlow/Keras faster and more efficient.

### Function to prepare all images and labels
### (relies on the global `categories` and `input_shape` defined above)
def prepareData(path):
    Images = []
    Labels = []

    ### Loop over each category (sport)
    for category in categories:
        fullPath = os.path.join(path, category)

        ### List all files in the category folder
        file_names = os.listdir(fullPath)

        ### Loop over each file in the folder
        for file in file_names:
            file = os.path.join(fullPath, file)

            ### Read the image (cv2.imread returns None for unreadable files)
            img = cv2.imread(file)

            ### Check if image is valid
            if img is not None:

                ### Resize the image to (224,224)
                resized = cv2.resize(img, input_shape, interpolation=cv2.INTER_AREA)

                ### Append resized image and its category label
                Images.append(resized)
                Labels.append(category)

    ### Convert list of images to NumPy array
    Images = np.array(Images)

    ### Convert labels list to NumPy array
    Labels = np.array(Labels)

    ### Return images and labels
    return Images, Labels

Summary: This part ensures all images are preprocessed correctly and linked with their labels, making them ready for training.
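
Why is area interpolation a good fit for shrinking? For an integer shrink factor it behaves roughly like averaging each block of source pixels into one output pixel, which avoids the aliasing you get from simply skipping pixels. The NumPy sketch below illustrates that idea only; it is not OpenCV's implementation, and block_average_downscale is a made-up name.

```python
import numpy as np


def block_average_downscale(img, factor):
    """Shrink an (H, W, C) image by an integer factor by averaging
    each factor x factor block of pixels."""
    h, w, c = img.shape
    h2, w2 = h // factor, w // factor
    # Crop so the height and width are exact multiples of the factor
    cropped = img[:h2 * factor, :w2 * factor].astype(np.float32)
    blocks = cropped.reshape(h2, factor, w2, factor, c)
    return blocks.mean(axis=(1, 3))


demo = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
small = block_average_downscale(demo, 2)
print(small.shape)  # (2, 2, 3)
```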

Executing Data Preparation and Saving Results

Finally, the function is executed to prepare the data, some sample images are displayed, and the dataset is saved for future use.

Description:

  • The dataset is loaded and its array shapes are printed: images have shape (N, 224, 224, 3) and labels have shape (N,), where N is the number of valid images.
  • Two sample images are displayed to check that the resizing and labeling worked correctly.
  • The processed dataset is stored as .npy files (NumPy binary format) for fast reloading without repeating preprocessing.
  • This saves a lot of time in future experiments, as we won’t need to reload and resize from scratch.

### Calling the data preparation function
allImages, allLabels = prepareData(path)

### Printing the shapes of the image and label arrays
print(allImages.shape)
print(allLabels.shape)

### Displaying two sample images with their labels
img_A, img_B = allImages[0], allImages[17]
label_A, label_B = allLabels[0], allLabels[17]

print(label_A, label_B)

cv2.imshow("img1", img_A)
cv2.imshow("img2", img_B)
cv2.waitKey(0)

### Close the preview windows before saving
cv2.destroyAllWindows()

### Saving processed data into .npy format
print("Save the data .......")
np.save("e:/temp/olymp-images-224.npy", allImages)
np.save("e:/temp/olymp-labels-224.npy", allLabels)
print("Finish save the data .......")

Summary: This final part loads, verifies, and saves the dataset in an optimized format, setting the foundation for model training.
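
If you want to convince yourself that the .npy cache preserves shapes, dtypes, and string labels, a quick round trip is enough. This is a small sketch with placeholder data and a temporary folder; the file names are invented for the demo.

```python
import os
import tempfile

import numpy as np

# Placeholder data standing in for the real image and label arrays
images = np.zeros((5, 224, 224, 3), dtype=np.uint8)
labels = np.array(["archery", "judo", "judo", "swimming", "archery"])

with tempfile.TemporaryDirectory() as cache_dir:
    img_file = os.path.join(cache_dir, "demo-images.npy")
    lbl_file = os.path.join(cache_dir, "demo-labels.npy")

    # Save both arrays, then load them back
    np.save(img_file, images)
    np.save(lbl_file, labels)
    loaded_images = np.load(img_file)
    loaded_labels = np.load(lbl_file)

print(loaded_images.shape, loaded_images.dtype)
print(loaded_labels)
```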


If you prefer another transfer-learning backbone, try my ResNet50 walkthrough here — it’s a friendly alternative for image classification projects using Python and Keras.
(Link to: Alien vs Predator Image Classification with ResNet50 | Complete Tutorial) Alien vs predator

Training the Deep Learning Model for Olympic Sports Classification

Loading the Dataset and Preparing for Training

The first step in model training is loading the saved dataset and preparing it for neural network training. This includes normalization and splitting into training/testing sets.

Description:

  • NumPy loads the .npy files generated by the data preparation script above, saving preprocessing time.
  • We check the shapes of the arrays to confirm dataset integrity.
  • Images are normalized to values between 0 and 1, improving training convergence and stability.
  • Splitting into training (70%) and testing (30%) ensures fair evaluation of the model’s performance on unseen data.
  • A fixed random_state=42 makes results reproducible.

### Importing essential libraries
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

### Load preprocessed images and labels from .npy files
allImages = np.load("e:/temp/olymp-images-224.npy")
allLabels = np.load("e:/temp/olymp-labels-224.npy")

### Print the shapes of the loaded arrays
print(allImages.shape)
print(allLabels.shape)

### Print the label of a sample image as a quick check
img = allImages[0]
label = allLabels[0]
print(label)

### Normalize images by scaling pixel values between 0 and 1
### (float32 uses half the memory of NumPy's default float64)
allImagesForModel = allImages.astype("float32") / 255.0

### Split dataset into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(allImagesForModel, allLabels, test_size=0.3, random_state=42)

### Print the shapes of split datasets
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

Summary: This section loads the dataset, normalizes images, and splits them into training/testing sets for reliable model evaluation.
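
To see what a seeded 70/30 split does conceptually, here is a NumPy-only sketch. It illustrates the idea of a reproducible shuffled split; it is not scikit-learn's algorithm and will not produce the same indices as train_test_split. The name shuffled_split and the tiny random data are assumptions for the demo.

```python
import numpy as np


def shuffled_split(images, labels, test_fraction=0.3, seed=42):
    """Shuffle indices with a fixed seed, then cut off the test portion."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_test = int(round(len(images) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return images[train_idx], images[test_idx], labels[train_idx], labels[test_idx]


# Ten tiny fake "images" with pixel values 0..255
raw = np.random.default_rng(0).integers(0, 256, size=(10, 4, 4, 3), dtype=np.uint8)
names = np.array(["judo", "archery"] * 5)

# Scale to [0, 1] before splitting, as in the tutorial
scaled = raw.astype(np.float32) / 255.0
X_tr, X_te, y_tr, y_te = shuffled_split(scaled, names)
print(X_tr.shape, X_te.shape)  # (7, 4, 4, 3) (3, 4, 4, 3)
```

Running it twice with the same seed gives the same split, which is the point of fixing random_state.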

Encoding Labels and Building the DenseNet201 Model

After preparing images, we need to encode labels into numerical form and define the architecture of our DenseNet201-based deep learning model.

Description:

  • LabelEncoder converts string labels (e.g., “Basketball”, “Swimming”) into numeric values.
  • to_categorical creates one-hot encoded vectors, a format required for categorical classification.
  • DenseNet201 is used as the base model with pretrained ImageNet weights, helping transfer knowledge from millions of images.
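  • We drop DenseNet201's original classification layers (include_top=False; note that this code does not freeze the backbone, so it is fine-tuned together with the new head) and add: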
    • GlobalAveragePooling2D: reduces parameters while retaining features.
    • Dense (256 ReLU): adds trainable layers for classification.
    • Dropout (0.5): prevents overfitting.
    • Dense (Softmax): outputs probabilities across 10 classes.
  • Adam optimizer with a low learning rate (1e-4) stabilizes training.

### Encode labels into numeric values
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()

### Fit the encoder on the training labels only, then reuse the same mapping for the test labels
y_train_encoded = label_encoder.fit_transform(y_train)
y_test_encoded = label_encoder.transform(y_test)

### Convert labels into one-hot encoded vectors
y_train_one_hot = tf.keras.utils.to_categorical(y_train_encoded)
y_test_one_hot = tf.keras.utils.to_categorical(y_test_encoded)

### Import DenseNet201 as base model
from tensorflow.keras.applications import DenseNet201
from tensorflow.keras import layers, models

batch_size = 16
img_height, img_width = 224, 224
num_classes = 10

### Load DenseNet201 without its top layer
base_model = DenseNet201(include_top=False, weights='imagenet', input_shape=(img_height, img_width, 3))

### Define the model architecture
model = models.Sequential()
model.add(base_model)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(num_classes, activation='softmax'))

### Set training parameters
epochs = 1000
lr = 1e-4
opt = tf.keras.optimizers.Adam(lr)

### Print model summary (summary() prints by itself and returns None)
model.summary()

Summary: This part encodes categorical labels and builds a transfer learning model using DenseNet201 as the backbone, customized for Olympic sports classification.
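
The label-encoding step is easy to get subtly wrong, so here is what it boils down to, written in plain NumPy. This is a stand-in for illustration, not the scikit-learn or Keras implementation; encode and one_hot are hypothetical helper names. The key point is that the class-to-index mapping is learned once from the training labels and then reused unchanged for the test labels.

```python
import numpy as np

y_train_demo = np.array(["judo", "archery", "swimming", "judo"])
y_test_demo = np.array(["swimming", "judo"])

# "Fit": learn the sorted class list from the training labels only
classes = np.unique(y_train_demo)
class_to_index = {name: i for i, name in enumerate(classes)}


def encode(labels):
    """Map each string label to its index in the fitted class list."""
    return np.array([class_to_index[name] for name in labels])


def one_hot(indices, num_classes):
    """Turn integer class indices into one-hot rows."""
    return np.eye(num_classes, dtype=np.float32)[indices]


train_one_hot = one_hot(encode(y_train_demo), len(classes))
test_one_hot = one_hot(encode(y_test_demo), len(classes))
print(classes)       # ['archery' 'judo' 'swimming']
print(test_one_hot)  # swimming -> index 2, judo -> index 1
```

If the mapping were re-fitted on a test set that is missing a class, the same sport could end up with a different index in train and test.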

For another angle on transfer learning choices, see my ResNet50 vs. MobileNet comparison with practical preprocessing tips.
(Link to: TensorFlow Image Classification: ResNet50 vs. MobileNet) ResNet50 vs. MobileNet

Adding Callbacks, Compiling, and Training the Model

The final step involves defining callbacks to optimize training, compiling the model, and starting the training loop.

Description:

  • ModelCheckpoint: saves the best model based on validation accuracy.
  • ReduceLROnPlateau: decreases learning rate when accuracy stalls, allowing finer adjustments.
  • EarlyStopping: prevents overfitting by stopping training when improvements plateau.
  • We compile with categorical crossentropy loss since this is a multi-class classification problem.
  • Training runs for up to 1000 epochs, but EarlyStopping ends the run once validation accuracy has not improved for 20 consecutive epochs.
  • Finally, we track the highest validation accuracy, ensuring the model’s best performance is captured.

### Import useful callbacks (from tensorflow.keras, to match the rest of the script)
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

### File to save the best model
best_model_file = "e:/temp/olymp-10.h5"

### Define callbacks
callbacks = [
    ModelCheckpoint(best_model_file, verbose=1, save_best_only=True, monitor="val_accuracy"),
    ReduceLROnPlateau(monitor="val_accuracy", patience=5, factor=0.1, verbose=1, min_lr=1e-6),
    EarlyStopping(monitor="val_accuracy", patience=20, verbose=1)
]

### Compile the model with loss and metrics
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=['accuracy'])

### Train the model
### (steps_per_epoch and validation_steps are omitted: with NumPy arrays,
### Keras derives the number of batches from batch_size)
hist = model.fit(
    X_train, y_train_one_hot,
    epochs=epochs,
    batch_size=batch_size,
    shuffle=True,
    validation_data=(X_test, y_test_one_hot),
    callbacks=callbacks
)

### Print the highest validation accuracy
highest_val_accuracy = max(hist.history["val_accuracy"])
print(f"Highest Validation Accuracy : {highest_val_accuracy}")

Summary: This final section optimizes training using callbacks, compiles the model, and trains it to achieve high validation accuracy for Olympic sports image classification.
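
To get a feel for how the three callbacks interact, the sketch below replays their logic over a made-up list of validation accuracies. It is a simplified illustration, not Keras's implementation (it ignores options such as min_delta and cooldown), and simulate_callbacks is a hypothetical name.

```python
def simulate_callbacks(val_accuracies, lr=1e-4, lr_patience=5,
                       stop_patience=20, factor=0.1, min_lr=1e-6):
    """Replay checkpoint / reduce-LR / early-stop decisions over a history.
    Returns (best_epoch, best_accuracy, final_lr, last_epoch_run)."""
    best_acc = float("-inf")
    best_epoch = None
    epochs_since_best = 0      # drives early stopping
    epochs_since_lr_event = 0  # drives learning-rate reduction

    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            # A real run would save the checkpoint here
            best_acc, best_epoch = acc, epoch
            epochs_since_best = 0
            epochs_since_lr_event = 0
            continue

        epochs_since_best += 1
        epochs_since_lr_event += 1

        # No improvement for lr_patience epochs: shrink the learning rate
        if epochs_since_lr_event >= lr_patience:
            lr = max(lr * factor, min_lr)
            epochs_since_lr_event = 0

        # No improvement for stop_patience epochs: stop training
        if epochs_since_best >= stop_patience:
            return best_epoch, best_acc, lr, epoch

    return best_epoch, best_acc, lr, len(val_accuracies) - 1


# Accuracy improves for three epochs, then stalls
history = [0.5, 0.6, 0.7] + [0.65] * 25
best_epoch, best_acc, final_lr, stopped_at = simulate_callbacks(history)
print(best_epoch, best_acc, final_lr, stopped_at)
```

In this toy history the best epoch is the third one, the learning rate bottoms out at min_lr, and training stops 20 epochs after the last improvement rather than running the full list.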

If you like building from scratch, this CNN tutorial (chess pieces) covers data splits, augmentation, and a clean Keras pipeline.
(Link to: CNN Image Classification With Keras Tutorial In Python) Building a Cnn model for Chess piece recognition


Evaluating the Trained Olympic Sports Classifier on New Images

We will use these images as test images for testing our model :

Loading the best model and reconstructing the environment

Description
This section focuses on restoring the trained model from disk and reconstructing the same runtime context used during training. That means importing the same libraries, re-creating constants (like the input shape), and rebuilding the category list in the exact order you used to train. If the categories aren’t read in the same sorted order, the predicted index could map to the wrong label, leading to confusing results.

A consistent environment also helps later when you package your solution or move it between machines. Loading the exact .h5 checkpoint ensures you’re not “almost” using the right weights—you’re using the best snapshot captured by ModelCheckpoint. It’s also good hygiene to print the model summary once, so you know you’ve really loaded what you expect (DenseNet backbone + your classification head).

Finally, it’s useful to print the discovered categories. This doubles as a sanity check and gives you a quick glance at the number of classes. If the printed list or length is off, fix that before running predictions; mismatches here are the most common cause of wrong label text overlay on images.

### Import TensorFlow for loading the trained model and running inference
import tensorflow as tf

### Import OS for directory operations (to rebuild the categories list)
import os

### Import NumPy for array operations
import numpy as np

### Import OpenCV for image loading, resizing, and visualization
import cv2

### Path to the best trained model you saved during training
best_model_file = "e:/temp/olymp-10.h5"

### Load the trained Keras model from disk
model = tf.keras.models.load_model(best_model_file)

### Print the model architecture to verify successful load and expected layers
### (summary() prints by itself and returns None)
model.summary()

### Define the model's expected input size for test-time preprocessing
input_shape = (224, 224)

### Keep batch_size handy if you later batch multiple test images
batch_size = 16

### Base folder where your class folders (sports) are stored
path = "E:/Data-sets/olympics"

### List all category folder names from disk (must match training source)
categories = os.listdir(path)

### Sort categories to keep the same label order used in training
categories.sort()

### Verify class names visually for sanity checking
print(categories)

### Confirm the number of categories matches the model's output units
print(len(categories))

Summary
You restored the exact best model, re-created the input size, and rebuilt the labels list in a stable order. These small steps prevent label/index drift and ensure your predictions render the correct sport name.
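
If the dataset folder might not be available on the machine where you run inference, one optional safeguard is to store the sorted class list next to the model and read it back later. This is a sketch of that idea, not something the original script does; class_names.json is a hypothetical file name and the demo writes to a temporary folder.

```python
import json
import os
import tempfile

# The sorted list you would have at training time
train_categories = sorted(["swimming", "archery", "judo"])

with tempfile.TemporaryDirectory() as model_dir:
    labels_file = os.path.join(model_dir, "class_names.json")  # hypothetical name

    # Training side: write the class order once
    with open(labels_file, "w", encoding="utf-8") as f:
        json.dump(train_categories, f)

    # Inference side: restore exactly the same order
    with open(labels_file, encoding="utf-8") as f:
        restored_categories = json.load(f)

predicted_index = 1
print(restored_categories[predicted_index])  # judo
```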

Preprocessing a new image and running the prediction

Description
Good predictions are born in preprocessing. Here you encapsulate the standard steps—resize, expand batch dimension, and normalize—into a tiny helper function. This makes your test-time script reliable and reusable. If you later change input size or normalization logic, you’ll only update a single function.

Note the order: resize first, then expand to shape (1, H, W, 3), then scale to [0,1]. That matches the training pipeline where images were already normalized. Keeping inference preprocessing identical to training is a top-3 rule for stable results.

When calling model.predict, you’ll get a probability distribution across your num_classes. Taking argmax reduces that to the predicted index. The final mapping from index → label name is done via the categories list you printed earlier. If the top-1 confidence looks suspiciously low, you can also inspect np.sort(result) or show top-3.

### Define a utility to resize, add batch dimension, and normalize an image
def prepareImage(img):
    ### Resize to the model input shape with area interpolation (good for downscaling)
    resized = cv2.resize(img, input_shape, interpolation=cv2.INTER_AREA)
    ### Add a batch dimension so the array shape becomes (1, H, W, 3)
    imgResult = np.expand_dims(resized, axis=0)
    ### Normalize pixel values to [0,1] to match training-time scaling
    imgResult = imgResult / 255.
    ### Return the preprocessed tensor
    return imgResult

### Option 1: keep alternative image paths in comments to switch quickly during testing
#testImagePath = "Best-image-classification-models/DenseNet201-Olympic Games/Rugbi-Google.jpg"

### Provide the actual test image path to classify now
testImagePath = "Best-image-classification-models/DenseNet201-Olympic Games/WLjpg.jpg"

### Read the raw image from disk (BGR layout in OpenCV)
img = cv2.imread(testImagePath)

### Stop early with a clear message if the image could not be read
if img is None:
    raise FileNotFoundError("Could not read image: " + testImagePath)

### Apply your preprocessing pipeline so it matches the training distribution
ImageForModel = prepareImage(img)

### Run a forward pass through the network to obtain class probabilities
result = model.predict(ImageForModel, verbose=1)

### Print the raw probability vector for debugging or inspection
print(result)

### Collapse probabilities to the highest scoring class index
answers = np.argmax(result, axis=1)

### Print the numeric class index (useful sanity check)
print(answers)

### Convert the class index to the human-readable label via your categories list
text = categories[answers[0]]

### Print the predicted label text for the console log
print("The predicted class is : " + text)

Summary
You standardized preprocessing, executed the forward pass, and translated the softmax probabilities to a readable label. The sorted folder order on disk is your single source of truth for labels, so keeping it stable is essential.
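
When the top prediction looks doubtful, listing the three highest probabilities is often more informative than the argmax alone. Here is a minimal sketch; top_k is a hypothetical helper, and the class names and probability vector are made up to mimic the shape of a model.predict result.

```python
import numpy as np


def top_k(probabilities, class_names, k=3):
    """Return the k most probable (class name, probability) pairs."""
    order = np.argsort(probabilities)[::-1][:k]  # indices, highest first
    return [(class_names[i], float(probabilities[i])) for i in order]


# Fake prediction for one image across four classes
demo_classes = ["archery", "judo", "rugby", "swimming"]
demo_result = np.array([[0.05, 0.15, 0.70, 0.10]])

for name, p in top_k(demo_result[0], demo_classes):
    print(f"{name}: {p:.2f}")
```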

Visualizing the prediction on the image

Description
A crisp visualization loop closes the feedback cycle. By overlaying the predicted text right onto the image, you can spot wrong guesses immediately and build a visual test set quickly. Consider placing the text on a contrasting color and adding a slight thickness so it reads well on lighter backgrounds. You’re already using yellow (0,255,255) and a clean serif-ish font (FONT_HERSHEY_COMPLEX), which is readable for demos.

OpenCV's imshow expects BGR, which is the layout cv2.imread returns, so no color conversion is needed for display. When demoing many images, add non-blocking windows or key bindings (e.g., press 'n' to move to the next image). Also, call destroyAllWindows() at the end to clean up native OS handles.

For production or notebooks, you might save outputs to disk (cv2.imwrite) or render to web via Streamlit/Gradio. For now, your script shows the simplest possible interactive preview.

Here’s your display block with per-line commentary:

### Choose an OpenCV font for overlaying the predicted label
font = cv2.FONT_HERSHEY_COMPLEX

### Draw the predicted text near the top-left corner in yellow with thickness 2
cv2.putText(img, text, (20, 20), font, 1, (0, 255, 255), 2)  # Yellow color

### Open an OS-native window to preview the labeled image
cv2.imshow("img", img)

### Wait indefinitely until a key is pressed (press any key to continue)
cv2.waitKey(0)

### Close all OpenCV windows to release system resources
cv2.destroyAllWindows()

Summary
You added a clear visual overlay and a minimal interactive viewer that blocks until a keypress. It’s perfect for manual spot-checking predictions before automating larger evaluation runs.

Next, explore my bird-species classification with Inception-V3 to see how architecture choice impacts training dynamics and accuracy.
(Link to: How to classify 525 Bird Species using Inception V3 and TensorFlow) How to classify 525 bird species using inception


FAQ :

Do I have to use 224×224 for DenseNet201?

No. With include_top=False, Keras accepts other input sizes, but 224×224 is a proven default that balances accuracy and compute. Larger inputs can help slightly but cost more memory and time.

Why normalize images to 0–1 at inference?

It matches the scaling applied during training in this tutorial, so the model sees inputs in the same range it learned from, which keeps predictions stable and accurate.

How do I prevent label/index mismatches?

Always sort class folders and rebuild the categories list exactly as in training before mapping argmax to label names.

Which callbacks should I start with?

Use ModelCheckpoint, ReduceLROnPlateau, and EarlyStopping to capture the best weights and avoid overfitting.

Is DenseNet201 overkill for small datasets?

Not with transfer learning. Start frozen, then fine-tune top blocks if validation accuracy plateaus.

Can I batch multiple test images for speed?

Yes, stack them into a single tensor (N, H, W, 3) and call predict once to reduce overhead.
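
A minimal sketch of the stacking step, assuming the images are already resized to the model's input size; make_batch is a hypothetical helper and the solid-color images are placeholders:

```python
import numpy as np


def make_batch(images):
    """Stack already-resized (H, W, 3) uint8 images into one float batch
    scaled to [0, 1]."""
    batch = np.stack(images, axis=0).astype(np.float32)
    return batch / 255.0


# Three placeholder images filled with a single gray level each
demo_images = [np.full((224, 224, 3), v, dtype=np.uint8) for v in (0, 128, 255)]
batch = make_batch(demo_images)
print(batch.shape)  # (3, 224, 224, 3)
```

The resulting array can be passed to predict in one call, and argmax along axis 1 then gives one class index per image.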

Why use GlobalAveragePooling instead of Flatten?

It reduces parameters and overfitting by averaging each feature map into a single value, which keeps the backbone's channel-wise features while discarding spatial position.
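
The difference is easy to quantify. The sketch below assumes the backbone emits a 7×7×1920 feature map for a 224×224 input (check your own model summary to confirm) and counts the weights of the first Dense(256) layer under each option.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


# Assumed backbone output shape for a 224x224 input
h, w, channels = 7, 7, 1920
units = 256

flatten_params = dense_params(h * w * channels, units)  # Flatten -> Dense(256)
gap_params = dense_params(channels, units)              # GAP -> Dense(256)
print(flatten_params, gap_params)
```

Under that assumption, Flatten makes the first dense layer roughly 49 times larger than global average pooling does.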

What loss should I use for multi-class sports labels?

Use categorical_crossentropy with softmax outputs and one-hot encoded targets.
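
For intuition, this is the computation behind that loss for a single example, written in NumPy. It is a sketch of the math, not the Keras implementation; the logits and target are made up.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(one_hot_targets, probabilities, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    clipped = np.clip(probabilities, eps, 1.0)
    return -(one_hot_targets * np.log(clipped)).sum(axis=-1)


logits = np.array([[2.0, 0.5, -1.0]])   # raw scores for 3 classes
target = np.array([[1.0, 0.0, 0.0]])    # true class is index 0
probs = softmax(logits)
loss = categorical_crossentropy(target, probs)
print(probs, loss)
```

The loss shrinks toward zero as the probability of the true class approaches 1.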

How do I know if my model is overfitting?

If training accuracy rises while validation stalls or drops, add dropout, augment data, or fine-tune fewer layers.

What if my predicted label looks wrong on the overlay?

Verify category ordering and normalization, then print the top-3 probabilities to check class confusion.


Conclusion

A solid multi-class classifier is mostly about consistency. You standardized input size, normalized images, and kept the category mapping stable from training to inference. That alone prevents half the common issues in production demos. Your training loop used battle-tested callbacks, which capture your best epoch, adapt the learning rate, and stop early when the curve flattens—practical safeguards that reduce frustration and wasted compute.

Architecturally, DenseNet201 provides a deep, feature-rich backbone that pairs well with a lightweight classification head. This split leverages the general visual features learned on ImageNet while letting your final dense layers specialize on the specifics of Olympic sports. When you’re ready to squeeze more accuracy, unfreezing a portion of the backbone with a lower learning rate often yields reliable gains.

From here, you can scale the pipeline. Add evaluation scripts to compute per-class metrics, log confusion matrices, and visualize top-K predictions across a validation gallery. If your dataset grows, consider curriculum-style augmentation and selective fine-tuning. For deployment, wrap your prepareImage and prediction steps in a tiny service, or demo with Streamlit/Gradio for instant sharing.
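
As a starting point for such an evaluation script, here is a small NumPy sketch of a confusion matrix with per-class recall. The true and predicted indices are made-up values; in practice they would come from argmax over your one-hot test labels and over the model's predictions.

```python
import numpy as np


def confusion_matrix(true_idx, pred_idx, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_idx, pred_idx):
        matrix[t, p] += 1
    return matrix


# Made-up labels for six test images across three classes
true_idx = np.array([0, 0, 1, 1, 2, 2])
pred_idx = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(true_idx, pred_idx, 3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(per_class_recall)  # [0.5 1.  0.5]
```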

Most importantly, your workflow is repeatable. The .npy caches, checkpointed .h5, and deterministic category ordering mean you can rebuild results on another machine with confidence. That’s what turns a good notebook into a dependable project.


Connect

☕ Buy me a coffee — https://ko-fi.com/eranfeit

🖥️ Email : feitgemel@gmail.com

🌐 https://eranfeit.net

🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb

Enjoy,

Eran

Eran Feit