Last Updated on 13/10/2025 by Eran Feit
TensorFlow MobileNetV2 vs ResNet50: Image Classification with Pretrained Models
ResNet50 vs MobileNetV2
Deep learning offers a wide range of pre‑trained architectures for image classification. ResNet50 is part of the Residual Networks family introduced in 2015. It addresses the degradation problem encountered when training very deep networks by using residual blocks with skip connections, which allow information to flow directly from earlier layers to later ones (blog.roboflow.com). These shortcut connections mitigate the vanishing‑gradient issue and enable models like ResNet50 to reach high accuracy on ImageNet, but they come with a relatively large parameter count and compute cost.
In contrast, MobileNetV2 was designed for mobile and embedded devices. It replaces standard convolutional layers with depthwise separable convolutions and introduces inverted residual blocks with linear bottlenecks, dramatically reducing the number of parameters and computation (keras.io). This architecture delivers reasonable accuracy while being lightweight and fast, making it well suited for on‑device inference.
Understanding these architectural trade‑offs helps you choose the right model: ResNet50 offers higher accuracy but requires more memory and compute, whereas MobileNetV2 sacrifices some accuracy for speed and efficiency.
Benchmark Comparison: ResNet50 vs MobileNetV2
ResNet50 and MobileNetV2 have very different capacities and compute requirements. The table below summarizes key metrics for the default ImageNet‑1K weights:
| Model | Params (M) | File size (MB) | GFLOPS | Top‑1 accuracy (%) | Top‑5 accuracy (%) |
|---|---|---|---|---|---|
| ResNet50 | 25.557 | 97.8 | 4.09 | 76.13 | 92.862 |
| MobileNetV2 | 3.504 | 13.6 | 0.30 | 71.878 | 90.286 |
ResNet50 has about seven times as many parameters as MobileNetV2 and roughly 14 times the floating‑point operations, and its weights file is about seven times larger. This translates to higher accuracy but also higher latency and memory use. In contrast, MobileNetV2’s compact size makes it well suited to mobile and edge devices.


The charts above visualize the trade‑off: ResNet50 provides an absolute gain of about 4.3 percentage points in Top‑1 accuracy, but it requires roughly 14× more computation.
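The comparison can be recomputed directly from the table. The short sketch below does that arithmetic; the dictionary layout and variable names are only for illustration, and the numbers are copied from the table rather than measured.

```python
# Figures copied from the benchmark table above (not measured here).
metrics = {
    "ResNet50": {"params_m": 25.557, "size_mb": 97.8, "gflops": 4.09, "top1": 76.13},
    "MobileNetV2": {"params_m": 3.504, "size_mb": 13.6, "gflops": 0.30, "top1": 71.878},
}

resnet_row = metrics["ResNet50"]
mobile_row = metrics["MobileNetV2"]

# Ratios: how many times larger ResNet50 is on each axis.
param_ratio = resnet_row["params_m"] / mobile_row["params_m"]
size_ratio = resnet_row["size_mb"] / mobile_row["size_mb"]
flops_ratio = resnet_row["gflops"] / mobile_row["gflops"]

# Absolute Top-1 difference in percentage points.
top1_gain = resnet_row["top1"] - mobile_row["top1"]

print(f"Parameters: {param_ratio:.1f}x")
print(f"File size:  {size_ratio:.1f}x")
print(f"GFLOPS:     {flops_ratio:.1f}x")
print(f"Top-1 gain: {top1_gain:.2f} points")
```

In other words, you pay roughly 7× the memory and 14× the compute for about 4 extra points of Top‑1 accuracy.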
About this tutorial:
This tutorial shows how to perform TensorFlow image classification using Keras pretrained models, specifically MobileNetV2 and ResNet50, without any additional training.
You’ll learn how to load a single image, preprocess it to the expected input shape, run inference against ImageNet weights, and translate the numeric outputs into human-readable labels using decode_predictions.
We’ll begin by demonstrating the workflow with ResNet50 in principle (you’ll see the commented line in the code that toggles this), and then switch to MobileNetV2 to compare behavior, speed, and results.
By the end, you will have a clean, copy-paste-ready pipeline you can adapt to your own images, plus guidelines on when to pick MobileNetV2 vs ResNet50 for production or experimentation.
Link for the video: https://youtu.be/40_NC2Ahs_8&list=UULFTiWJJhaH6BviSWKLJUM9sg
Link for the full code: https://ko-fi.com/s/32570663e8
Link for my blog (more tutorials): https://eranfeit.net/blog/
Choosing a Pretrained Model (ResNet50 vs MobileNetV2)
Introduction
Selecting the right pretrained model is a balance between inference speed, model size, and accuracy. ResNet50 is a classic high-capacity backbone with strong accuracy on ImageNet. MobileNetV2 is lightweight and optimized for speed and resource-constrained environments. The code below includes both options so you can toggle between them with a single uncomment.
Elaborated description
First, note the commented line #resnet = tf.keras.applications.resnet50.ResNet50(). This indicates Phase 1: trying ResNet50. In this phase, you would instantiate the ResNet50 model and use tf.keras.applications.resnet50.preprocess_input during preprocessing, followed by resnet.predict(...) to generate class probabilities (the default model ends in a softmax layer, so the outputs are probabilities rather than raw logits).
In Phase 2, we switch to MobileNetV2 with tf.keras.applications.mobilenet_v2.MobileNetV2(). This reduces parameters and speeds up inference significantly, which is especially useful on CPUs, laptops, or edge devices. The trade-off is that MobileNetV2’s top-1 accuracy is slightly below heavier backbones—but often “good enough” for many applications.
When you architect production pipelines, consider inference budget (latency), memory constraints, and expected traffic. If your service handles thousands of requests per minute or runs on devices with limited compute, MobileNetV2 often wins. If you need a bit more accuracy and can afford more compute, ResNet50 is a dependable baseline.
The key is that both integrate seamlessly via Keras Applications, so you can experiment and benchmark without rewriting your tooling. That’s what makes this tutorial portable and maintainable.

    ### Import TensorFlow for deep learning operations and pretrained models.
    import tensorflow as tf

    ### Import NumPy for numerical arrays and dimension manipulation.
    import numpy as np

    ### Import Matplotlib for optional visualization of the input image.
    import matplotlib.pyplot as plt

    ### Import OpenCV (optional) if you plan to do additional image I/O or transforms.
    import cv2

    ### Import PIL's Image class (optional); Keras uses PIL under the hood when loading images.
    from PIL import Image

    ### Define the local path to the image you want to classify.
    filename = 'C:/GitHub/Object-Detection/TensorFlowKeras-PreTrainedModels/dog2.jpg'

    ### Phase 1 option: instantiate ResNet50 (commented by default to show phase switching).
    #resnet = tf.keras.applications.resnet50.ResNet50()

    ### Phase 2 option: instantiate MobileNetV2 (lightweight and fast).
    mobile = tf.keras.applications.mobilenet_v2.MobileNetV2()

Both options are now in place: MobileNetV2 is active, and ResNet50 is one uncomment away. Start with ResNet50 by uncommenting the ResNet line, or stick with MobileNetV2 to prioritize speed. The remainder of the pipeline is identical except for the preprocessing function and which model’s predict you call.
Loading and Preprocessing the Image
Introduction
Pretrained ImageNet models expect images in a specific size and pixel distribution. For ResNet50 and MobileNetV2, the common input shape is (224, 224, 3). Keras provides convenient utilities to load, resize, and format images for model consumption.
Elaborated description
We use image.load_img(..., target_size=(224,224)) to ensure the image matches the model’s input resolution. Then image.img_to_array(img) converts the PIL image to a NumPy array so TensorFlow can process it.
Neural networks operate on batches of images. Even if you classify a single image, we expand its dimensions from (224, 224, 3) to (1, 224, 224, 3) using np.expand_dims(..., axis=0).
Crucially, each model family expects a specific preprocessing routine. ResNet50 and MobileNetV2 use different normalization schemes. For ResNet50, use tf.keras.applications.resnet50.preprocess_input. For MobileNetV2, use tf.keras.applications.mobilenet_v2.preprocess_input. Mixing these up can degrade predictions.
By keeping both preprocessing lines in the code (with one commented), you can toggle models safely and reproducibly.
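To make the difference concrete, here is a minimal NumPy sketch of the two normalization schemes, assuming the commonly documented behavior: ResNet50 preprocessing converts RGB to BGR and subtracts per-channel ImageNet means without scaling, while MobileNetV2 preprocessing scales pixels to the range [-1, 1]. The function names are hypothetical stand-ins written for illustration only; in real code, keep calling the library's own preprocess_input functions.

```python
import numpy as np

# ImageNet channel means in BGR order (assumed values for the "caffe"-style scheme).
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def resnet50_style_preprocess(batch_rgb):
    """Illustrative stand-in: RGB -> BGR, then subtract per-channel means."""
    x = np.asarray(batch_rgb, dtype=np.float32)
    x = x[..., ::-1]               # reverse the channel axis: RGB -> BGR
    return x - IMAGENET_BGR_MEANS  # zero-center each channel, no rescaling


def mobilenet_v2_style_preprocess(batch_rgb):
    """Illustrative stand-in: scale pixel values from [0, 255] to [-1, 1]."""
    x = np.asarray(batch_rgb, dtype=np.float32)
    return x / 127.5 - 1.0


# A dummy batch holding one 224x224 RGB image with random pixel values.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(1, 224, 224, 3)).astype(np.float32)

resnet_input = resnet50_style_preprocess(batch)
mobile_input = mobilenet_v2_style_preprocess(batch)

print("ResNet50-style range:   ", resnet_input.min(), resnet_input.max())
print("MobileNetV2-style range:", mobile_input.min(), mobile_input.max())
```

The two outputs live on very different scales, which is why feeding one model the other model's preprocessing tends to produce poor predictions without raising any error.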

    ### Import Keras image utilities for loading and converting images.
    from tensorflow.keras.preprocessing import image

    ### Load the image from disk and resize to the expected 224x224 resolution.
    img = image.load_img(filename, target_size=(224, 224))  # the model works with 224x224 resolution

    ### Convert the PIL image to a NumPy array (H, W, C).
    resizedImage = image.img_to_array(img)

    ### Print the shape to verify the resize step worked as expected.
    print("Resized image shape")
    print(resizedImage.shape)

    ### Expand dimensions so the array represents a batch of size 1 (N, H, W, C).
    imageWithMoreDimantion = np.expand_dims(resizedImage, axis=0)

    ### Print the new shape to confirm batching.
    print("imageWithMoreDimantion image shape")
    print(imageWithMoreDimantion.shape)

    ### PREPROCESS for ResNet50 (use this if you enabled ResNet50 above).
    #finalImage = tf.keras.applications.resnet50.preprocess_input(imageWithMoreDimantion)

    ### PREPROCESS for MobileNetV2 (use this if you enabled MobileNetV2 above).
    finalImage = tf.keras.applications.mobilenet_v2.preprocess_input(imageWithMoreDimantion)

Your single image is now a properly shaped, normalized batch ready for inference. Remember: match the preprocessing to the model family—that small detail preserves accuracy.
Running Inference and Decoding Predictions
Introduction
With the preprocessed tensor, inference is a single function call. The model outputs class probabilities over 1,000 ImageNet categories. We then use decode_predictions to map the top scores into readable labels.
Elaborated description
For ResNet50, call resnet.predict(finalImage). For MobileNetV2, call mobile.predict(finalImage). Both return a (1, 1000) array of scores.
decode_predictions converts those scores into (class_id, class_name, probability) tuples, sorted by confidence. This helps you quickly interpret whether the model recognized your object (e.g., a dog breed) correctly.
If you want more granular control, you can inspect predictions[0], find np.argmax, or adjust top-k results to show more candidates.
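As a rough sketch of that manual inspection, the snippet below works on a stand-in predictions array with the same (1, 1000) shape that model.predict returns. The random scores are placeholders so the example runs without a model; swap in your real predictions.

```python
import numpy as np

# Stand-in for model.predict output: one image, 1,000 class scores that sum to 1.
rng = np.random.default_rng(42)
logits = rng.normal(size=(1, 1000))
predictions = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

scores = predictions[0]              # drop the batch dimension -> shape (1000,)
best_index = int(np.argmax(scores))  # index of the single highest-scoring class

k = 10
# argsort is ascending, so reverse it and keep the first k indices (best first).
top_k_indices = np.argsort(scores)[::-1][:k]

for rank, idx in enumerate(top_k_indices, start=1):
    print(f"{rank:2d}. class index {idx:4d}  score {scores[idx]:.4f}")
```

If you only need more labeled candidates, decode_predictions also accepts a top argument, for example imagenet_utils.decode_predictions(predictions, top=10).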
In production, you might log raw scores for auditing and track top-k accuracy across diverse inputs to monitor model drift.
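Tracking top-k accuracy requires ground-truth labels for at least a sample of your inputs. Assuming you have them, a minimal helper could look like this; top_k_accuracy is a hypothetical function written for this example, and the scores are made up so the result is easy to check by hand.

```python
import numpy as np


def top_k_accuracy(scores, true_labels, k=5):
    """Fraction of rows whose true label is among the k highest-scoring classes."""
    # argsort is ascending, so the last k columns hold the k best classes per row.
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == np.asarray(true_labels)[:, None]).any(axis=1)
    return float(hits.mean())


# Tiny hand-made example: 3 images, 4 classes (values are illustrative only).
scores = np.array([
    [0.10, 0.60, 0.20, 0.10],  # best: class 1, second: class 2
    [0.50, 0.10, 0.30, 0.10],  # best: class 0, second: class 2
    [0.05, 0.15, 0.30, 0.50],  # best: class 3, second: class 2
])
true_labels = [1, 2, 0]

print("Top-1 accuracy:", top_k_accuracy(scores, true_labels, k=1))
print("Top-2 accuracy:", top_k_accuracy(scores, true_labels, k=2))
```

Here only the first image is correct at top-1, while the second is rescued at top-2, which is the kind of gap worth watching over time.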

    ### Run inference using the selected model.
    ### If you enabled ResNet50 above, use:
    #predictions = resnet.predict(finalImage)

    ### If you enabled MobileNetV2 above, use:
    predictions = mobile.predict(finalImage)

    ### Import ImageNet utilities for mapping numeric outputs to human-readable labels.
    from tensorflow.keras.applications import imagenet_utils

    ### Decode the predictions to get (class_id, class_name, probability) tuples.
    results = imagenet_utils.decode_predictions(predictions)

    ### Print the decoded results to the console.
    print('Results :')
    print(results)

You now have interpretable predictions from a pretrained ImageNet model. The top entries should reflect the object category in your image. If confidence is low, try a clearer image or experiment with the other backbone.
Visualizing the Input (and Interpreting Results)
Introduction
Although the model doesn’t require visualization, plotting the input helps verify that you loaded and resized the correct image. It’s also useful when you iterate on multiple test images and want quick visual feedback.
Elaborated description
plt.imshow(img) displays the resized image that was passed to the model. If colors appear off, verify your image loading and preprocessing.
For reporting, you can overlay the top prediction on the plot title (e.g., plt.title(results[0][0][1])) to make screenshots self-contained.
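The indexing in results[0][0][1] is easier to follow once you see the nesting: one list per image, each holding (class_id, class_name, score) tuples sorted best first. The sketch below uses a made-up results value with that structure; the IDs, names, and scores are placeholders, not real model output.

```python
# Hypothetical decode_predictions-style output for a batch of one image.
# All IDs, names, and scores below are placeholders for illustration.
results = [[
    ("n00000001", "example_class_a", 0.82),
    ("n00000002", "example_class_b", 0.11),
    ("n00000003", "example_class_c", 0.03),
]]

# First index: which image in the batch. Second index: rank (0 = best).
top_id, top_name, top_score = results[0][0]

# Build a compact title such as "label (score as a percentage)".
title = f"{top_name} ({top_score:.1%})"
print(title)
```

You can then pass the string to plt.title(title) before calling plt.show().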
If you plan further post-processing—like drawing bounding boxes or saliency maps—you’ll likely leverage OpenCV or Grad-CAM utilities, which plug neatly into this pipeline.
Finally, consider saving both the raw predictions and the plotted image for reproducibility and to build a lightweight evaluation set over time.

    ### Visualize the input image to confirm correct loading and resizing.
    plt.imshow(img)

    ### Render the plot window (or inline cell) so you can see the image.
    plt.show()

A quick visualization step improves trust and debuggability. It ensures what you think you classified is exactly what the model saw.
Batch Classification & tf.data Pipeline
The original tutorial processes a single image. If you need to classify many images (e.g. a folder of test photos), build a batch pipeline with tf.data:

    import tensorflow as tf
    from tensorflow import keras

    # Load all images found under the directory; labels=None means no labels are needed for inference
    dataset = keras.utils.image_dataset_from_directory(
        'path_to_images',
        image_size=(224, 224),
        batch_size=32,
        labels=None,    # no labels needed for inference
        shuffle=False   # keep file order so predictions line up with the files
    )

    # Select model and preprocessing function
    model = keras.applications.MobileNetV2(weights='imagenet')
    preprocess_fn = keras.applications.mobilenet_v2.preprocess_input

    # Preprocess each batch and run predictions
    dataset = dataset.map(preprocess_fn)
    predictions = model.predict(dataset)

Transfer Learning & Fine‑Tuning
Using pretrained models for inference is valuable, but you can get better results on your own dataset by transfer learning. The typical workflow is:
- Load the pretrained model and freeze its layers so their weights don’t change.
- Add new trainable layers on top (e.g. a global pooling layer and a dense classifier).
- Train the new layers on your dataset.
- Optionally, fine‑tune: unfreeze some or all of the base model and continue training with a very low learning rate.
Example (simplified for clarity):

    from tensorflow import keras
    from tensorflow.keras import layers, optimizers

    # num_classes, train_ds, and val_ds are placeholders: define them for your own dataset.

    # Load base model without top layers
    base_model = keras.applications.ResNet50(weights='imagenet', include_top=False,
                                             input_shape=(224, 224, 3))
    base_model.trainable = False  # freeze base layers

    inputs = keras.Input(shape=(224, 224, 3))
    x = keras.applications.resnet.preprocess_input(inputs)
    x = base_model(x, training=False)  # keeps BatchNormalization in inference mode
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    model = keras.Model(inputs, outputs)

    model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(train_ds, validation_data=val_ds, epochs=5)

    # Fine‑tune: unfreeze base layers and retrain with low LR
    base_model.trainable = True
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-5),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(train_ds, validation_data=val_ds, epochs=5)

Fine‑tuning can yield incremental improvements by gently adapting pretrained features to your data (keras.io). Remember to keep BatchNormalization layers in inference mode during fine‑tuning to prevent them from destroying learned statistics. The example does this by calling the base model with training=False.
FAQ
ResNet50 is a deeper, high-accuracy model using residual connections, while MobileNetV2 is lightweight and fast thanks to depthwise separable convolutions and inverted residual blocks.
MobileNetV2 was designed for efficiency on low-power hardware and is ideal for mobile, IoT, or edge deployments. ResNet50 is heavier and better suited to GPUs or servers.
Yes, ResNet50 and MobileNetV2 need different preprocessing. Each Keras model provides its own preprocess_input() function that normalizes data in the expected way. Always use the function from the same module as your chosen model.
224×224 pixels is standard for both ResNet50 and MobileNetV2 pretrained weights. Larger images can improve accuracy at the cost of more compute and memory use.
Freeze the base model, add a new classification head, train it, then unfreeze selected layers and train again with a very small learning rate. This preserves useful pretrained features while adapting to your data.
Shape-mismatch errors occur when input sizes don’t match the model’s expected shape. Always ensure the input has 3 color channels and the correct dimensions for your architecture, and check whether include_top is True or False, since that determines whether the input size is fixed.
BatchNorm layers can destabilize training on small datasets if updated. Keep them in inference mode or unfreeze only a few blocks at a time for more stable results.
Convert models to TensorFlow Lite or ONNX, use integer quantization, and reduce input resolution. MobileNetV2 is already optimized for fast inference on CPUs and small devices.
Track F1-score, Top-1/Top-5 accuracy, latency, and memory usage. These give a more complete view of performance, especially for real-time or imbalanced tasks.
Use confusion matrices, Grad-CAM heatmaps, or per-class accuracy plots to understand predictions. Visualization helps identify bias or recurring errors in your dataset.
Practical Comparison: When to Use ResNet50 vs MobileNetV2
Both backbones are strong baselines for TensorFlow image classification. Your choice depends on deployment constraints and accuracy requirements.
Elaborated description
- Speed & Size: MobileNetV2 is typically faster and smaller—ideal for laptops, CPUs, and edge devices.
- Accuracy: ResNet50 often edges out MobileNetV2 on top-1 accuracy—useful when you can afford extra compute.
- Preprocessing: Always use the matching preprocessing function to avoid silent performance drops.
- Production fit: Benchmark both on your target hardware and real images. Latency budgets, throughput targets, and energy constraints usually decide the winner.
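For the benchmarking step, a small timing helper is usually enough to get started. The sketch below is one possible approach, not a full benchmarking harness: measure_latency and dummy_inference are hypothetical names, and the dummy workload exists only so the example runs without a model loaded.

```python
import statistics
import time


def measure_latency(fn, warmup=3, runs=20):
    """Call fn repeatedly and return (median, worst) latency in milliseconds."""
    for _ in range(warmup):  # warm-up calls are executed but not timed
        fn()
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms), max(timings_ms)


# Stand-in workload so the sketch runs without TensorFlow installed.
def dummy_inference():
    return sum(i * i for i in range(10_000))


median_ms, worst_ms = measure_latency(dummy_inference)
print(f"median {median_ms:.3f} ms, worst {worst_ms:.3f} ms")
```

To time a real model, pass something like lambda: mobile.predict(finalImage, verbose=0) instead of dummy_inference, and run the same call for ResNet50 on the same hardware. Reporting the median alongside the worst case helps separate typical latency from occasional spikes.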
Connect :
☕ Buy me a coffee — https://ko-fi.com/eranfeit
🖥️ Email : feitgemel@gmail.com
🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb
Enjoy,
Eran
