How to classify 525 Bird Species using Inception V3 and TensorFlow

Last Updated on 22/11/2025 by Eran Feit

Transfer Learning with Inception V3 in TensorFlow for Images.

Introduction

In this guide you will build a full image classification pipeline using Inception V3.
You will prepare directories, preview sample images, construct data generators, and assemble a transfer learning model.
You will compile, train, evaluate, and visualize results for a multi-class bird species dataset.
Along the way, the tutorial notes how the Inception V3 architecture scales depth and width while keeping computation under control.

You can watch the tutorial here : https://www.youtube.com/watch?v=d_JB9GA2U_c

You can download the code here : https://eranfeit.lemonsqueezy.com/buy/b7640ff0-270e-46d8-871f-d792649697ef or here : https://ko-fi.com/s/4090982a5a

You can follow my blog here : https://eranfeit.net/blog/

TL;DR — 525 Bird Species with Inception-v3 (TensorFlow)

Goal: Train a robust classifier for 525 bird species using TensorFlow/Keras with Inception-v3 transfer learning.
Dataset: “Birds 525 Species”; images resized to 224×224 and rescaled to [0,1].
Model: InceptionV3(include_top=False) → GlobalAveragePooling2D → Dense(525, softmax); start frozen, optionally fine-tune upper blocks.
Training setup: Adam(learning_rate=0.01), EarlyStopping, ReduceLROnPlateau, ModelCheckpoint.
Results: Train Acc: 99.72% • Val Acc: 87.03%.
Next steps: try MobileNet/EfficientNet for speed or export to ONNX/TensorRT.

Here is the Inception V3 architecture:

[Figure: Inception V3 architecture]

Here is the code to classify 525 bird species using Inception V3:

Link for the 525 birds dataset : https://www.kaggle.com/datasets/filiprejmus/randomizedbackground100birdspecies

Part 1 – Build the model

Data Setup and Quick Visualization

Short introduction:
This part configures dataset paths, counts classes, and previews a random image to validate shapes and labels.
It ensures your Inception V3 pipeline starts with clean and consistent inputs.

Description:
Data integrity is the first pillar of any Inception V3 workflow.
Defining paths and counting classes confirms the dataset structure matches Keras expectations.
Directory naming must align with class labels because flow_from_directory infers labels from folder names.
A quick print of the class count makes sure there are no missing or extra categories.

A random preview helps catch common issues.
You can immediately spot grayscale versus RGB mismatches.
You can ensure images are not corrupted.
You also validate that the class folder actually contains images, avoiding silent failures later.
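The folder checks described above can be sketched as a small helper. This is a minimal illustration, not part of the original code: the function names, the list of image extensions, and the throwaway demo tree are all assumptions made for the example.

```python
import os
import tempfile

# Assumed set of extensions that count as images for this check.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for every sub-folder of split_dir."""
    counts = {}
    for name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files sitting next to the class folders
        counts[name] = sum(
            1 for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


def find_empty_classes(counts):
    """List the class folders that contain no image files."""
    return [name for name, n in counts.items() if n == 0]


# Tiny throwaway tree standing in for a real split directory such as trainPath.
with tempfile.TemporaryDirectory() as demo_root:
    for class_name, n_files in [("ALBATROSS", 2), ("ANHINGA", 0)]:
        os.makedirs(os.path.join(demo_root, class_name))
        for i in range(n_files):
            open(os.path.join(demo_root, class_name, f"{i}.jpg"), "w").close()
    demo_counts = count_images_per_class(demo_root)
    demo_empty = find_empty_classes(demo_counts)

print(demo_counts)
print(demo_empty)
```

Unlike a plain len(os.listdir(...)), this version skips stray files, so a CSV or hidden file next to the class folders does not inflate the class count.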

Matplotlib allows rapid feedback loops during setup.
Displaying a single image can reveal wrong orientation or unexpected resolutions.
This matters because Inception V3 expects a consistent input size after preprocessing.
Early detection reduces training surprises and wasted epochs.
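The visual check can be complemented by a programmatic one on the decoded array. The sketch below is hypothetical (the helper name and warning strings are mine) and only inspects the array shape, which is enough to catch grayscale and RGBA files.

```python
import numpy as np


def describe_image_array(img):
    """Return a list of warnings for a decoded image array (empty list = looks fine)."""
    warnings = []
    if img.ndim == 2:
        warnings.append("grayscale image (no channel axis)")
    elif img.ndim == 3 and img.shape[2] == 4:
        warnings.append("RGBA image (extra alpha channel)")
    elif img.ndim != 3 or img.shape[2] != 3:
        warnings.append(f"unexpected shape {img.shape}")
    if img.size == 0:
        warnings.append("empty array")
    return warnings


# Synthetic arrays standing in for images read from disk.
rgb = np.zeros((224, 224, 3), dtype=np.uint8)
gray = np.zeros((224, 224), dtype=np.uint8)
rgba = np.zeros((224, 224, 4), dtype=np.uint8)

for name, arr in [("rgb", rgb), ("gray", gray), ("rgba", rgba)]:
    print(name, describe_image_array(arr))
```

You could call a helper like this on the array returned by view_random_image to flag problem files before training starts.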

The helper function returns the image array.
You can reuse it for ad-hoc checks or for custom preprocessing experiments.
Small utilities like this accelerate debugging.
They also keep the main pipeline clean and readable.

### Import operating system utilities.
import os

### Define the root path to the dataset directory.
pathToDataset = "E:/Data-sets/BIRDS 525 SPECIES- IMAGE CLASSIFICATION"

### Build paths for training images.
trainPath = pathToDataset + "/train"

### Build paths for testing images.
testPath = pathToDataset + "/test"

### Build paths for validation images.
validPath = pathToDataset + "/valid"

### Count the number of class folders in the training directory.
no_of_classes = len(os.listdir(trainPath))

### Print the number of classes discovered.
print("No. of Classes : " + str(no_of_classes))

### Import Matplotlib for plots.
import matplotlib.pyplot as plt

### Import Matplotlib image loader.
import matplotlib.image as mpimg

### Import random for sampling images.
import random

### Define a function to view one random image from a class folder.
def view_random_image(target_dir, target_class):

    ### Create the full path to the requested class folder.
    target_folder = target_dir + "/" + target_class

    ### Select one random image file name from the class folder.
    random_image = random.sample(os.listdir(target_folder), 1)

    ### Read the image from disk into an array.
    img = mpimg.imread(target_folder + "/" + random_image[0])

    ### Display the image on screen.
    plt.imshow(img)

    ### Show the class name as the plot title.
    plt.title(target_class)

    ### Remove axes for a cleaner look.
    plt.axis("off")

    ### Render the plot window.
    plt.show()

    ### Print the image shape for a quick sanity check.
    print(f"Image shape: {img.shape}")

    ### Return the image array for optional downstream use.
    return img

### Optionally preview a known class example to verify everything works.
#img = view_random_image(target_dir=trainPath, target_class="VICTORIA CROWNED PIGEON")

You can find the full code here : https://ko-fi.com/s/4090982a5a

Data Generators and Inception V3 Model Assembly

Short introduction:
This part creates normalized data generators and assembles an Inception V3 transfer learning model.
You will freeze the base, add a global average pooling layer, and attach a softmax classifier for all classes.

Description:
Data generators handle streaming batches from disk.
Rescaling normalizes pixel intensities and stabilizes optimization.
Setting target_size=(224,224) standardizes shapes even when raw images vary.
Categorical class mode encodes one-hot labels for multi-class classification.
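To make the rescaling and one-hot encoding concrete, here is a small NumPy illustration of what the generator does conceptually for each batch. It is a sketch with made-up values: the five-class setup is hypothetical (the tutorial has 525 classes), and the generator's real implementation is not reproduced here.

```python
import numpy as np

# Rescaling: multiply raw 8-bit pixel values by 1/255 to land in [0, 1].
pixels = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = pixels.astype("float32") * (1.0 / 255)

# One-hot encoding: each integer label becomes a vector with a single 1.
num_classes = 5                  # hypothetical; the tutorial uses 525
labels = np.array([0, 3, 4])     # made-up class indices for three images
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(scaled)
print(one_hot)
```

Each row of one_hot sums to 1 and its argmax recovers the original label, which is the target format categorical_crossentropy expects.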

The Inception V3 backbone is loaded without its top layers.
Freezing its weights preserves ImageNet-learned features that generalize to new domains.
This is classic transfer learning that speeds convergence.
It also reduces overfitting when datasets are moderate in size.

Global Average Pooling reduces parameters dramatically compared to a Flatten.
It keeps channel semantics while discarding spatial positions.
This pairs well with the Inception V3 architecture, which outputs a rich spatial tensor.
The pooled vector flows into a compact dense softmax for final predictions.
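The parameter saving is easy to check with arithmetic. The sketch below assumes the frozen base emits a 5×5×2048 feature map for a 224×224 input; that shape is my expectation, and the print statement in the tutorial code shows the real one on your machine. The random tensor is only a stand-in for actual features.

```python
import numpy as np

# Assumed feature-map shape (batch, height, width, channels) for a 224x224 input.
rng = np.random.default_rng(0)
features = rng.random((1, 5, 5, 2048), dtype=np.float32)

# Global average pooling: average over the two spatial axes, keep the channels.
pooled = features.mean(axis=(1, 2))

# Flatten: keep every spatial position as a separate feature.
flattened = features.reshape(features.shape[0], -1)


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


num_classes = 525
gap_params = dense_params(pooled.shape[1], num_classes)
flatten_params = dense_params(flattened.shape[1], num_classes)

print(pooled.shape, gap_params)
print(flattened.shape, flatten_params)
```

Under this assumption the softmax head needs about 1.08 million parameters after pooling versus about 26.9 million after a Flatten, a 25-fold difference.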

The compile step defines the training objective.
categorical_crossentropy suits multi-class targets.
Adam starts with a relatively aggressive learning rate (0.01) for the new head.
You can later unfreeze select Inception blocks for careful fine-tuning.

### Import NumPy for numeric operations.
import numpy as np

### Import TensorFlow as the deep learning backbone.
import tensorflow as tf

### Import Keras image data generator utilities.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

### Normalize images from [0,255] to [0,1] for training.
train_datagen = ImageDataGenerator(rescale=1./255)

### Normalize images for testing.
test_datagen = ImageDataGenerator(rescale=1./255)

### Normalize images for validation.
valid_datagen = ImageDataGenerator(rescale=1./255)

### Create a generator that reads images from the training directory.
train_data = train_datagen.flow_from_directory(directory=trainPath,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode="categorical")

### Create a generator that reads images from the test directory.
test_data = test_datagen.flow_from_directory(directory=testPath,
                                             batch_size=32,
                                             target_size=(224, 224),
                                             class_mode="categorical")

### Create a generator that reads images from the validation directory.
valid_data = valid_datagen.flow_from_directory(directory=validPath,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode="categorical")

### Load the Inception V3 base without the top classification head.
base_model = tf.keras.applications.InceptionV3(include_top=False)

### Freeze the base model to train only the new head initially.
base_model.trainable = False

### Define the input tensor for the model.
inputs = tf.keras.layers.Input(shape=(224, 224, 3), name="input-layer")

### Pass inputs through the Inception V3 base.
x = base_model(inputs)

### Print the feature map shape to understand downstream dimensionality.
print(f"The model shape after passing the inputs : {x.shape}")

### Pool spatial dimensions to a single vector per feature map.
x = tf.keras.layers.GlobalAveragePooling2D(name="Global-average-pooling-layer")(x)

### Print the pooled vector shape.
print(f"The shape after GlobalAveragePooling2D: {x.shape}")

### Add the final classification layer with softmax over all classes.
outputs = tf.keras.layers.Dense(no_of_classes, activation="softmax", name="output-layer")(x)

### Merge inputs and outputs into a single Keras Model.
model = tf.keras.Model(inputs, outputs)

### Compile the model with cross-entropy loss and Adam optimizer.
model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              metrics=["accuracy"])

### Review the model summary for sanity checks.
model.summary()

Training, Evaluation, and Visualization

Short introduction:
This part defines training hyperparameters and callbacks, runs fitting with validation, evaluates on the test set, and visualizes loss and accuracy curves.
It completes the end-to-end Inception V3 pipeline.

Description:
Callbacks automate good training hygiene.
ModelCheckpoint preserves the best weights by validation accuracy.
ReduceLROnPlateau lowers the learning rate when progress stalls.
EarlyStopping halts training to prevent overfitting and saves time.
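How the two patience values interact is easier to see in a simulation. The function below is a simplified, framework-free sketch of the plateau and early-stopping bookkeeping, using the same patience, factor, and min_lr values as the tutorial; it is not the Keras implementation, and the validation accuracies are invented.

```python
def simulate_callbacks(val_accuracies, lr=0.01, lr_patience=4, factor=0.1,
                       min_lr=1e-6, stop_patience=5, min_delta=1e-4):
    """Simplified walk-through of plateau LR reduction plus early stopping."""
    best_for_lr = float("-inf")
    best_for_stop = float("-inf")
    lr_wait = 0
    stop_wait = 0
    events = []
    for epoch, acc in enumerate(val_accuracies):
        # Early stopping: any strict improvement resets its counter.
        if acc > best_for_stop:
            best_for_stop = acc
            stop_wait = 0
        else:
            stop_wait += 1

        # Plateau detection: the improvement must exceed min_delta.
        if acc > best_for_lr + min_delta:
            best_for_lr = acc
            lr_wait = 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                new_lr = max(lr * factor, min_lr)
                if new_lr < lr:
                    events.append((epoch, "reduce_lr", new_lr))
                    lr = new_lr
                lr_wait = 0

        if stop_wait >= stop_patience:
            events.append((epoch, "stop", lr))
            break
    return events


# Invented validation accuracies that stall after the third epoch.
demo_events = simulate_callbacks([0.50, 0.60, 0.70, 0.70, 0.70, 0.70, 0.70, 0.70, 0.70])
for event in demo_events:
    print(event)
```

In this toy run the learning rate drops at epoch index 6 and training stops at index 7, so the lower rate gets only one epoch to help. If that feels too tight, a larger EarlyStopping patience is one option to consider.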

Validation sampling uses a quarter of the validation set per epoch for speed.
You can increase validation_steps for more stable metrics.
steps_per_epoch equals the number of training batches so that each epoch sees the full dataset once.
These values make the workload of each epoch explicit and easy to reason about.
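The step counts follow from simple arithmetic, since a directory generator yields ceil(images / batch_size) batches. The image counts below are assumptions used for illustration; substitute the "Found N images" numbers that flow_from_directory prints for your copy of the dataset.

```python
import math


def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


batch_size = 32
num_train_images = 84635   # assumed count, replace with your own
num_valid_images = 2625    # assumed count, replace with your own

steps_per_epoch = batches_per_epoch(num_train_images, batch_size)
valid_batches = batches_per_epoch(num_valid_images, batch_size)
validation_steps = int(0.25 * valid_batches)

print(steps_per_epoch)                 # training batches per epoch
print(valid_batches)                   # batches in the full validation set
print(validation_steps)                # batches actually used per epoch
print(validation_steps * batch_size)   # validation images seen per epoch
```

With these assumed counts, a quarter of the validation batches covers 640 images per epoch, barely more than one image per class, which is why the reported validation accuracy can be noisy.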

Test evaluation reports the final generalization score.
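Keep it separate from checkpoint selection, which ModelCheckpoint already drives from validation accuracy, and note that model.evaluate here scores the weights from the last epoch; load the saved checkpoint first if you want the score of the best one.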
If validation and test differ significantly, revisit augmentation and fine-tuning.
Consider unfreezing later Inception blocks for extra gains.

Plots turn logs into intuition.
Loss curves reveal underfitting or overfitting patterns.
Accuracy curves should rise smoothly, with validation trailing training.
A sudden or widening gap between the curves suggests adjusting the learning rate or adding regularization.
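The gap between the curves can also be measured directly from the history dictionary. The helper names and the 0.08 threshold below are hypothetical, and the history values are made up except for the final pair, which echoes the train and validation accuracies quoted in the TL;DR.

```python
def generalization_gap(history_dict):
    """Per-epoch difference between training and validation accuracy."""
    return [
        train - val
        for train, val in zip(history_dict["accuracy"], history_dict["val_accuracy"])
    ]


def first_epoch_over(gaps, threshold):
    """Index of the first epoch whose gap exceeds threshold, or None."""
    for epoch, gap in enumerate(gaps):
        if gap > threshold:
            return epoch
    return None


# Made-up history in the same layout as history.history.
demo_history = {
    "accuracy":     [0.60, 0.85, 0.95, 0.9972],
    "val_accuracy": [0.58, 0.80, 0.85, 0.8703],
}

gaps = generalization_gap(demo_history)
warning_epoch = first_epoch_over(gaps, threshold=0.08)

print([round(g, 4) for g in gaps])
print(warning_epoch)
```

A final gap of roughly 12.7 percentage points, as in the quoted results, is the kind of signal that would motivate augmentation, dropout, or careful fine-tuning.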

### Set the number of epochs for training.
EPOCHS = 30

### Choose a destination to save the best model file.
best_model_file = "e:/temp/525-birds.h5"

### Import helpful callbacks for robust training control.
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

### Configure callbacks for checkpointing, learning-rate scheduling, and early stopping.
callbacks = [
    ModelCheckpoint(best_model_file, verbose=1, save_best_only=True, monitor="val_accuracy"),
    ReduceLROnPlateau(monitor="val_accuracy", patience=4, factor=0.1, verbose=1, min_lr=1e-6),
    EarlyStopping(monitor="val_accuracy", patience=5, verbose=1)
]

### Train the model with validation monitoring and callbacks.
history = model.fit(train_data,
                    epochs=EPOCHS,
                    steps_per_epoch=len(train_data),
                    validation_data=valid_data,
                    validation_steps=int(0.25 * len(valid_data)),
                    callbacks=callbacks)

### Evaluate performance on the held-out test split.
print(model.evaluate(test_data))

### Define a helper to plot loss and accuracy curves.
def plot_loss_curves(history):

    ### Extract training loss from the history object.
    loss = history.history["loss"]

    ### Extract validation loss from the history object.
    val_loss = history.history["val_loss"]

    ### Extract training accuracy from the history object.
    accuracy = history.history["accuracy"]

    ### Extract validation accuracy from the history object.
    val_accuracy = history.history["val_accuracy"]

    ### Compute the sequence of epoch indices for x-axis.
    epochs = range(len(history.history["loss"]))

    ### Plot the loss curves.
    plt.plot(epochs, loss, label="training loss")
    plt.plot(epochs, val_loss, label="val loss")
    plt.title("loss")
    plt.xlabel("epochs")
    plt.legend()
    plt.show()

    ### Plot the accuracy curves.
    plt.plot(epochs, accuracy, label="training accuracy")
    plt.plot(epochs, val_accuracy, label="val accuracy")
    plt.title("accuracy")
    plt.xlabel("epochs")
    plt.legend()
    plt.show()

### Generate the diagnostic plots for this training run.
plot_loss_curves(history)

Part 2 – Image inference with Inception V3

Introduction

This tutorial shows how to run image inference with Inception V3 on a multi-class bird species dataset.
You will load a saved Keras model, standardize image inputs, and produce a final predicted class label overlaid on the source image.
The workflow shows how Inception V3 provides robust feature extraction while the prediction pipeline stays simple and efficient.
The result is a clean two-part structure that is easy to integrate into applications and demos.

Model Loading and Image Preprocessing

Introduction:
This part initializes libraries, resolves dataset paths, loads class names, restores the trained Inception V3 model, and defines a preprocessing function.
Standardized preprocessing ensures your inputs match what the trained Inception V3 model expects.
Centralizing these steps yields a stable and reusable inference foundation.
This structure keeps code readable and production-friendly.

Description:
Imports are scoped for TensorFlow, OpenCV, and NumPy to cover deep learning, visualization, and array manipulation.
Directory paths define a consistent interface for training, validation, and test splits, which helps when switching datasets.
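Class discovery from the training directory must reproduce the label order used at training time; flow_from_directory assigns indices to class folders in alphanumeric order, so the folder list should be sorted rather than taken in the arbitrary order that os.listdir returns.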
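Why ordering matters can be shown without touching the file system. The sketch below assumes a directory listing that comes back unsorted and rebuilds the alphanumeric name-to-index mapping that flow_from_directory uses; the helper name and the three class names are illustrative only.

```python
def build_class_index(folder_names):
    """Sort folder names and map each one to its training-time label index."""
    classes = sorted(folder_names)
    class_indices = {name: index for index, name in enumerate(classes)}
    return class_indices, classes


# Pretend os.listdir returned the folders in this arbitrary order.
listing = ["CROW", "ALBATROSS", "BALD EAGLE"]

class_indices, classes = build_class_index(listing)

print(classes)          # order that matches the label indices
print(class_indices)    # same layout as a generator's class_indices attribute
print(listing[2], "vs", classes[2])   # what index 2 means before and after sorting
```

Indexing the unsorted listing with a predicted index would return the wrong species here, which is the silent mismatch that sorting avoids.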
Restoring the saved model ensures you run inference with the exact best checkpoint chosen during training.

The prepareImage helper implements resizing, array conversion, batch dimension expansion, and normalization.
Resizing aligns image geometry to the input shape expected by Inception V3, which keeps tensors consistent.
Expanding dimensions converts a single image into a batch of size one, matching the model’s input signature.
Normalization scales pixel values to [0,1], which is typical when training with a rescaled generator and must be mirrored at inference time.
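The shape and range changes performed by the helper can be verified on a synthetic array, with no image file or model involved. This is a minimal sketch of the array steps only; the function name and the all-white test image are assumptions for the demonstration.

```python
import numpy as np


def prepare_array(image_hwc):
    """Add a batch axis and scale 8-bit pixel values to [0, 1]."""
    batch = np.expand_dims(image_hwc.astype("float32"), axis=0)
    return batch / 255.0


# All-white synthetic image standing in for a resized 224x224 RGB photo.
fake_image = np.full((224, 224, 3), 255, dtype=np.uint8)
batch = prepare_array(fake_image)

print(batch.shape)                  # (batch, height, width, channels)
print(batch.dtype)
print(batch.min(), batch.max())
```

If the training generator had used a different scaling, this function would need the same change, otherwise predictions degrade without any error being raised.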

Printing the model summary verifies that the loaded network contains the expected Inception V3 backbone.
This quick visibility also confirms that the final dense layer size equals the number of classes.
Keeping this check near the loading code avoids silent mismatches between labels and model outputs.
These safeguards reduce debugging time when swapping checkpoints.

### Import TensorFlow for model loading and prediction.
import tensorflow as tf

### Import OpenCV for reading images and drawing text labels.
import cv2

### Import the operating system module for file and directory utilities.
import os

### Import Keras utilities to load images and convert them to arrays.
from tensorflow.keras.utils import load_img, img_to_array

### Import NumPy for array operations and batching.
import numpy as np

### Define the target input size for the model.
IMAGE_SIZE = 224

### Define the root dataset path used to resolve split directories.
pathToDataset = "E:/Data-sets/BIRDS 525 SPECIES- IMAGE CLASSIFICATION"

### Build the training directory path.
trainPath = pathToDataset + "/train"

### Build the test directory path.
testPath = pathToDataset + "/test"

### Build the validation directory path.
validPath = pathToDataset + "/valid"

### List the class folder names, sorted to match the label order used in training.
CLASSES = sorted(os.listdir(trainPath))

### Count the number of classes for sanity checks.
no_of_classes = len(CLASSES)

### Print the discovered number of classes.
print("No. of Classes : " + str(no_of_classes))

### Provide the file path of the best saved model checkpoint.
best_model_file = "e:/temp/525-birds.h5"

### Load the trained Keras model for inference.
model = tf.keras.models.load_model(best_model_file)

### Print the model summary to verify the architecture and output layer.
model.summary()

### Define a helper that prepares an input image for the model.
def prepareImage(pathForImage):
    ### Load the image from disk and resize to the expected target shape.
    image = load_img(pathForImage, target_size=(IMAGE_SIZE, IMAGE_SIZE))
    ### Convert the PIL image to a NumPy array in HWC format.
    imgResult = img_to_array(image)
    ### Add a batch dimension so the model receives shape (1, H, W, C).
    imgResult = np.expand_dims(imgResult, axis=0)
    ### Normalize pixel values to the [0,1] range to match training.
    imgResult = imgResult / 255.
    ### Return the prepared tensor for prediction.
    return imgResult

Single-Image Inference and Visualization

Introduction:
This part selects a test image, prepares it for the Inception V3 model, runs prediction, decodes the top class, and overlays the label on the image.
The final output is displayed in a window for immediate visual verification.
This flow confirms correctness from input image to on-screen label.
It is minimal, readable, and easy to adapt to new sources.

Description:
A target file is selected from the test split to simulate real evaluation usage.
The same preprocessing function ensures inputs match training normalization and geometry.
model.predict returns class probabilities that are decoded with argmax to the winning index.
Mapping the index back to the CLASSES list returns a human-readable class name.
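The decoding step can be tried on a hand-written probability vector. The sketch below uses four made-up classes and probabilities instead of a real model output, and adds a hypothetical top_k helper to show the runner-up classes as well as the winner.

```python
import numpy as np


def top_k(probabilities, class_names, k=3):
    """Return the k most likely (class_name, probability) pairs, best first."""
    order = np.argsort(probabilities)[::-1][:k]
    return [(class_names[i], float(probabilities[i])) for i in order]


# Made-up class list and softmax output with shape (1, num_classes).
class_names = ["ALBATROSS", "ANHINGA", "BALD EAGLE", "CROW"]
result_array = np.array([[0.05, 0.70, 0.20, 0.05]])

# Same decoding as in the tutorial: argmax over the class axis, then take the scalar.
best_index = int(np.argmax(result_array, axis=1)[0])
ranking = top_k(result_array[0], class_names, k=2)

print(best_index, class_names[best_index])
print(ranking)
```

Looking at the top few classes rather than only the winner is a cheap way to see whether a mistake was a near miss between similar species.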

Printing intermediate values gives quick confidence in the pipeline.
print(answer) exposes the raw predicted index, while the subsequent print reveals the class label for double checks.
Overlaying the label with OpenCV allows rapid human validation without a separate UI.
The chosen font, color, and thickness keep the annotation readable on most images.

cv2.imshow displays the resulting frame and cv2.waitKey(0) pauses execution until a key is pressed.
This interaction model is simple for demos and debugging.
For batch processing or headless environments, you can remove the GUI calls and save images to disk.
The structure remains the same while adapting to different deployment contexts.

This approach isolates I/O, preprocessing, inference, and visualization into clear steps.
The separation makes it straightforward to extend into loops, webcams, or REST services.
Because the backbone is Inception V3, you can benefit from strong feature extraction across varied classes.
The pipeline remains lightweight and maintainable.

### Optionally choose a specific test image path for inference.
#testImagePath = testPath + "/ALBATROSS/4.jpg"

### Select an example image from a known class folder in the test split.
testImagePath = testPath + "/ANHINGA/5.jpg"

### Read the image with OpenCV for visualization and annotation.
img = cv2.imread(testImagePath)

### Prepare the image tensor for the model using the helper.
imgForModel = prepareImage(testImagePath)

### Run the model to obtain class probabilities for the input image.
resultArray = model.predict(imgForModel, verbose=1)

### Convert probabilities to the index of the most likely class.
answer = np.argmax(resultArray, axis=1)

### Print the raw class index array for verification.
print(answer)

### Extract the scalar index from the array.
index = answer[0]

### Print the human-readable class name using the discovered index.
print("The predicted class is : " + CLASSES[index])

### Draw the predicted class name onto the original image for visual confirmation.
cv2.putText(img, CLASSES[index], (10, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 1, cv2.LINE_AA)

### Show the annotated image in a window.
cv2.imshow("img", img)

### Wait for a key press before exiting the display window.
cv2.waitKey(0)

A single image is prepared, predicted, and labeled using Inception V3.
The top class index is decoded to a readable label from the training classes.
OpenCV overlays the label for instant human verification.
The end result is a concise and reliable inference demo.

Connect :

☕ Buy me a coffee — https://ko-fi.com/eranfeit

🖥️ Email : feitgemel@gmail.com

🌐 https://eranfeit.net

🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb

Enjoy,

Eran