In this tutorial, we’ll explore how to use EfficientNetB0, a powerful deep learning model available in TensorFlow and Keras, for image classification. EfficientNet models are known for their efficiency and accuracy, making them an excellent choice for tasks like classifying objects in images.
We’ll use a pre-trained EfficientNetB0 model trained on the ImageNet dataset, which contains 1,000 different classes of images (such as animals, objects, and everyday items). This allows us to classify images without training a new model from scratch.
What Is EfficientNet-B0 (and why it’s fast)?
EfficientNet-B0 is the “base” model of the EfficientNet family.
It uses MBConv blocks (Mobile Inverted Bottlenecks) with Squeeze-and-Excitation (SE) attention and the SiLU/Swish activation to get strong accuracy with very few parameters.
Default input size is 224×224, and the model has ~5.3M parameters, making it light enough for CPU/edge while still competitive on ImageNet (top-1 ≈ 77%).
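To make the activation concrete, here is a minimal NumPy sketch of SiLU/Swish, which is defined as x · sigmoid(x). The function name `silu` is just an illustrative choice; Keras ships its own implementation, so you would not normally write this yourself.

```python
import numpy as np

def silu(x):
    """SiLU/Swish activation: x * sigmoid(x), written as x / (1 + exp(-x))."""
    x = np.asarray(x, dtype=np.float64)
    return x / (1.0 + np.exp(-x))

# Negative inputs are damped rather than zeroed, unlike ReLU
values = np.array([-2.0, 0.0, 2.0])
print(silu(values))
```

Unlike ReLU, the curve is smooth around zero and lets small negative values through, which is the property the "smoother activation" claim refers to.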
Key ideas in one minute
- Compound scaling: instead of only making the network deeper or wider or using bigger images, EfficientNet scales all three together with a single scaling factor φ (and constants α, β, γ).
- MBConv + SE: depthwise separable convolutions + channel attention drastically cut FLOPs without losing much accuracy.
- Swish/SiLU: smoother activation that helps gradient flow vs. ReLU.
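The compound scaling rule can be sketched in a few lines. The constants below (α = 1.2, β = 1.1, γ = 1.15) are the values reported in the EfficientNet paper; the helper name `compound_scale` is hypothetical, and the released B1 to B7 models use hand-rounded settings, so treat the printed numbers as an illustration of the idea rather than the exact configuration of each variant.

```python
# Coefficients reported in the EfficientNet paper (found by a small grid search on B0)
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution

def compound_scale(phi, base_resolution=224):
    """Return (depth multiplier, width multiplier, scaled resolution) for a given phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = round(base_resolution * GAMMA ** phi)
    return depth, width, resolution

# The paper constrains alpha * beta^2 * gamma^2 to be roughly 2,
# so compute cost grows by about 2x for each unit of phi
flops_factor = ALPHA * BETA**2 * GAMMA**2

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution ~{r}px")
```

With φ = 0 you get B0 itself (multipliers of 1 and a 224px input); each increment of φ makes the network deeper, wider, and higher-resolution at the same time.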
When should you pick B0 (vs. B1/B3)?
- B0: great baseline for small/medium datasets, CPU or low-power devices, and fast iteration.
- B1/B2: move here if you need a few extra points of accuracy and can afford slightly larger images (240–260) and more VRAM.
- B3+: use when you have more data/GPU, and latency isn’t tight.
- Rule of thumb: start with B0, lock your pipeline, then scale up only if your validation accuracy plateaus due to model capacity (not data/overfitting).
The goal of this code is to:
- Load a pre-trained EfficientNetB0 model.
- Preprocess an input image into the required format.
- Run classification to predict what object is in the image.
- Display the top prediction with a label overlay on the image.
👉 Watch the full tutorial here: https://youtu.be/lomMTiG9UZ4
You can find more tutorials and join my newsletter here: https://eranfeit.net/
Let’s break down the code step by step.
EfficientNetB0: Code Explanation
We start by importing the required Python libraries.
- `EfficientNetB0` – loads the pre-trained EfficientNet model.
- `cv2` – OpenCV, used for reading and displaying the image.
- `image` from Keras – for converting the image into an array.
- `numpy` – handles numerical computations and array manipulations.
Next, we load the EfficientNetB0 model with ImageNet weights.
This means the model has already been trained on millions of images across 1,000 classes.
We use OpenCV to read an image from the disk and resize it to the input size required by EfficientNetB0 (224×224 pixels).
Here is our test image (Dori.jpg):
- The original image size is printed (`1600x1200`).
- We resize it to `(224x224)` to match the model’s input requirements.
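Resizing a non-square image straight to 224×224 changes its aspect ratio. One common alternative, which is not what this tutorial's code does, is to crop the largest centered square first and resize afterwards. The sketch below uses a blank NumPy array as a stand-in for the loaded image, and `center_crop_square` is a hypothetical helper name.

```python
import numpy as np

def center_crop_square(img):
    """Crop the largest centered square from an (H, W, C) image array."""
    h, w = img.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return img[top:top + side, left:left + side]

# Stand-in for the array returned by cv2.imread: 1600 rows, 1200 columns, 3 channels
dummy = np.zeros((1600, 1200, 3), dtype=np.uint8)
square = center_crop_square(dummy)
print(square.shape)  # (1200, 1200, 3)
```

The cropped array can then be passed to `cv2.resize(square, (224, 224))` exactly as in the tutorial. Whether cropping helps depends on where the subject sits in the frame.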
The image must be converted into a NumPy array and expanded into a batch dimension before feeding it into the model.
- `img_to_array` converts the image into a numerical array.
- `expand_dims` adds a batch dimension so that the input shape matches the model’s expectations (`(1,224,224,3)`).
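Two details are worth checking at this step. OpenCV stores channels in BGR order, while the Keras ImageNet weights were trained on RGB images, so swapping the channels before prediction is usually advisable. The Keras EfficientNet models also include their own rescaling layer, so the input should stay in the 0 to 255 range. The sketch below illustrates both points with a synthetic NumPy array standing in for the resized image.

```python
import numpy as np

# Stand-in for a resized OpenCV image: 224x224 pixels, channels in BGR order
bgr = np.zeros((224, 224, 3), dtype=np.uint8)
bgr[..., 0] = 255  # fill the first channel, which is blue in BGR order

rgb = bgr[..., ::-1]            # reverse the channel axis: BGR -> RGB
x = rgb.astype(np.float32)      # float array, values still in the 0-255 range
x = np.expand_dims(x, axis=0)   # add the batch dimension

print(x.shape)     # (1, 224, 224, 3)
print(x[0, 0, 0])  # blue now sits in the last channel
```

With a real image, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same channel swap.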
Now we can pass the preprocessed image into the model to get predictions.
- `model.predict(x)` outputs probabilities for 1,000 classes.
- The result is a vector where each value represents the probability of a class.
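To get a feel for what the prediction array looks like without running the model, the sketch below builds a stand-in of the same shape from random scores passed through a softmax. The values are synthetic; only the shape and the "sums to 1" property mirror the real output.

```python
import numpy as np

# Stand-in for model.predict(x): random scores pushed through a softmax
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 1000))
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
pred = exp / exp.sum(axis=1, keepdims=True)

print(pred.shape)         # (1, 1000): one row per image, one column per class
print(float(pred.sum()))  # close to 1.0, since the row is a probability distribution

# Indices of the five highest-scoring classes, best first
top5 = np.argsort(pred[0])[::-1][:5]
print(top5, pred[0][top5])
```

The same `np.argsort` pattern works on the real `pred` if you want more than the single best class.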
We use Keras’ built-in function to decode the numerical prediction into a human-readable class label.
- `decode_predictions` translates model outputs into class names (e.g., “Shih-Tzu”).
- The loop prints the top prediction along with its probability score.
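`decode_predictions` returns one list per image, and each entry is a `(class_id, description, score)` tuple. The sketch below uses a hand-written stand-in for that structure: the top entry reuses the score from this post, while the class IDs and the other two scores are illustrative assumptions. `format_label` is a hypothetical helper for building the overlay text.

```python
# Hand-written stand-in for decode_predictions(pred, top=3).
# Only the Shih-Tzu score comes from the post; the IDs and the
# remaining scores are made up for illustration.
decoded = [[
    ("n02086240", "Shih-Tzu", 0.74700737),
    ("n02098413", "Lhasa", 0.12),
    ("n02086079", "Pekinese", 0.03),
]]

def format_label(entry):
    """Turn a (class_id, description, score) tuple into overlay text."""
    class_id, description, score = entry
    return f"{description} ({score:.1%})"

# decoded[0] holds the predictions for the first (and only) image
for entry in decoded[0]:
    print(format_label(entry))

best = format_label(decoded[0][0])
```

Passing `best` to `cv2.putText` instead of the bare class name would show the score on the image as well.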
Finally, we overlay the predicted label onto the original image and display it using OpenCV.
- `putText` writes the prediction label on the image.
- `imshow` opens a window displaying the result.
- `waitKey(0)` keeps the window open until a key is pressed.
- `destroyAllWindows` closes the display window.
Here is the full code:

```python
# Documentation: https://keras.io/examples/vision/image_classification_efficientnet_fine_tuning/
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.applications.imagenet_utils import decode_predictions
from tensorflow.keras.preprocessing import image
import cv2
import numpy as np

# Load EfficientNetB0 with ImageNet weights (1,000 classes)
model = EfficientNetB0(weights='imagenet')

originalImg = cv2.imread("Best-image-classification-models/EfficientNet-image-Classification/Dori.jpg")
if originalImg is None:
    raise FileNotFoundError("Could not read the image; check the path")
print(originalImg.shape)  # (1600, 1200, 3) -> (height, width, channels)

# EfficientNetB0 expects a 224x224 input
img = cv2.resize(originalImg, (224, 224))
# OpenCV loads images as BGR, while the ImageNet weights expect RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(img.shape)

# Convert to a float array and add the batch dimension: (1, 224, 224, 3)
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)

pred = model.predict(x)
print(pred)
print(pred.shape)  # (1, 1000)

# Decode the highest-scoring prediction into a readable label
for name, desc, score in decode_predictions(pred, top=1)[0]:
    print(desc, score)
# Result from the original run: Shih-Tzu 0.74700737 (the exact score may vary)

# Draw the label on the original image and display it
cv2.putText(originalImg, desc, (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 0), 2)
cv2.imshow("img", originalImg)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

You can download the full code here: https://ko-fi.com/s/a91cfcced2
Example Output:
Shih-Tzu 0.74700737
This means the model is 74.7% confident that the image contains a Shih-Tzu dog.
Mini Comparison
Model | Input size (px) | Params (approx.) | Use-case hint |
---|---|---|---|
B0 | 224 | ~5.3M | Fast baseline, CPU/edge, small–med datasets |
B1 | 240 | ~7.8M | Slightly better accuracy, modest cost |
B3 | 300 | ~12M | Accuracy focus when you have GPU headroom |
FAQ
Q: Can I train B0 from scratch?
A: Yes, but transfer learning from ImageNet is almost always better unless you have hundreds of thousands of labeled images.
Q: What image size should I use?
A: Stick to 224 for B0. If your subjects are tiny in the frame, try 256/288 and reduce batch size.
Q: Adam or AdamW?
A: AdamW with a small weight decay (e.g., `1e-4`) often generalizes a bit better on mid-sized datasets.
Q: Do I need heavy augmentations?
A: Keep them realistic: horizontal flip, light rotation, small color jitter, RandomResizedCrop. Too much distortion harms calibration.
Connect
☕ Buy me a coffee — https://ko-fi.com/eranfeit
🖥️ Email : feitgemel@gmail.com
🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb
Enjoy,
Eran