Jetson Nano Real-Time Image Classification with OpenCV and GoogLeNet
Introduction
In this tutorial, we’ll build a real-time wildlife video classifier using NVIDIA Jetson Inference and OpenCV in Python.
You’ll learn how to open a video file, convert frames into GPU-friendly memory, run GoogLeNet classification on every frame, and overlay the top class on the video when confidence is high.
This post walks you through a clean, reusable pattern: video I/O → GPU conversion → deep learning inference → on-screen results.
By the end, you’ll have a copy-paste script that runs smoothly on Jetson, plus the knowledge to swap models, tweak thresholds, and adapt it for your own datasets.
Want a hands-on detection workflow too? Explore my YOLOv8 heatmaps tutorial that visualizes model attention: Generating heatmaps with YOLOv8
You can find the full code here : https://ko-fi.com/s/7a72f61abe
Here is a video for Jetson Nano Real Time Image Classification:
The link for the video : https://youtu.be/AgOdXB34zaA
You can find more Nvidia Jetson Nano tutorials here : https://eranfeit.net/how-to-classify-objects-using-jetson-nano-inference-and-opencv/
FAQ
Q: Why do we convert from BGR to RGBA before classification?
A: Jetson’s CUDA utilities and imageNet interface expect RGBA input, which ensures the frame is in the right memory layout for GPU inference.
Q: Can I change the model from GoogLeNet to something else?
A: Yes. Replace "googlenet" with another Jetson-supported classifier like "resnet-18" and confirm the model is available on your device.
Q: What does the confidence threshold of 0.4 achieve?
A: It filters out low-confidence predictions, reducing noisy overlays and improving readability in the output.
Q: How do I print both the label and the confidence?
A: Format the string, e.g., f"{class_desc}: {confidence:.2f}", and pass it to cv2.putText.
Q: Will this handle 60 FPS video smoothly?
A: Performance depends on your Jetson model and the video resolution; lowering the resolution, classifying only every Nth frame, or using a lighter model can help.
Q: Can I save the annotated video to disk?
A: Yes. Use cv2.VideoWriter with the same frame size and FPS, and write the modified img frames.
Q: How do I process only every Nth frame to save compute?
A: Keep a counter and run classification when frame_idx % N == 0, otherwise just display or skip.
Q: What if the window doesn’t appear on my Jetson?
A: Ensure you’re using a desktop session with display access, or use headless visualization (e.g., save frames) or the jetson.utils display APIs.
Q: Could I map certain classes to friendlier names?
A: Absolutely. Create a dictionary mapping ImageNet labels to your custom names and look them up before drawing.
Q: How do I restrict results to wildlife classes only?
A: Post-filter by a whitelist of animal labels, or fine-tune a model on your wildlife categories for best accuracy.
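To make the every-Nth-frame answer concrete, here is a minimal sketch in plain Python. The names labels_every_nth and fake_classify are hypothetical and not part of the tutorial script; in the real loop, classify would wrap net.Classify and net.GetClassDesc. The sketch also assumes you want skipped frames to reuse the most recent label, so the overlay does not vanish between classifications.

```python
def labels_every_nth(frames, classify, n):
    """Yield (frame, label) pairs, calling classify only on every n-th frame.

    Skipped frames reuse the most recent label, so an overlay stays visible.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    last_label = None
    for frame_idx, frame in enumerate(frames):
        if frame_idx % n == 0:
            last_label = classify(frame)
        yield frame, last_label


# Stand-in classifier (hypothetical) that records which frames it was given.
calls = []


def fake_classify(frame):
    calls.append(frame)
    return f"label-for-{frame}"


# Integers stand in for video frames here.
results = list(labels_every_nth(range(7), fake_classify, n=3))
print(calls)       # frames that were actually classified
print(results[4])  # a skipped frame carries the label from frame 3
```

With n=3, only frames 0, 3, and 6 reach the classifier, which cuts inference work to roughly a third while the display still updates on every frame.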
Bringing in libraries and opening the wildlife video
Short description: In this part, we import OpenCV and Jetson libraries, open the wildlife video, set its resolution, and load the pre-trained GoogLeNet model.
### Import the OpenCV library for computer vision tasks
import cv2

### Import Jetson Inference (models/inference) and Jetson Utils (CUDA utilities)
import jetson.inference
import jetson.utils

### Load the video file from disk so we can process it frame by frame
cap = cv2.VideoCapture('/home/feitdemo/github/Jetson-Nano-Python/Wildlife.mp4')

### Set the capture width so frames come in at 1280 pixels wide
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)

### Set the capture height so frames come in at 720 pixels tall
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

### Load a pre-trained image classification model (GoogLeNet) via Jetson Inference
net = jetson.inference.imageNet("googlenet")
Summary
We prepared our tools and data source.
The webcam isn’t needed here because we’re reading a wildlife video file at 1280×720.
The model is ready to classify each frame against ImageNet categories.
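One caveat on resolution: width and height requests made with cap.set are typically honored by camera backends, but depending on your OpenCV build they may be ignored for file sources, in which case frames arrive at the file's native size. If that happens on your setup, you can resize each frame with cv2.resize instead. The helper below is a hypothetical sketch (the name fit_within is mine, not OpenCV's) that computes a target size fitting inside 1280×720 while preserving the aspect ratio. It uses only plain Python.

```python
def fit_within(width, height, max_width=1280, max_height=720):
    """Return (w, h) scaled down to fit the bounds, keeping aspect ratio.

    Frames that already fit are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1920, 1080))  # a 1080p source shrinks to fit the bounds
print(fit_within(640, 480))    # a small source is left as-is
print(fit_within(1000, 1000))  # a square source is limited by the height
```

Inside the loop you could then call cv2.resize(img, fit_within(img.shape[1], img.shape[0])) before the color conversion, since cv2.resize takes the size as (width, height) while img.shape is (height, width, channels).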
If you’re comparing classic CV vs. deep learning, see this ResNet50 classification walkthrough: Alien vs Predator Image Classification with ResNet50
Converting frames for fast GPU classification
Short description: We read each frame, convert BGR→RGBA for Jetson, and move it into CUDA memory for high-performance inference.
### Keep processing frames as long as the video is open
while cap.isOpened():

    ### Read the next frame from the video source
    ret, img = cap.read()

    ### Stop when no frame is returned (end of the video or a read error)
    if not ret:
        break

    ### Convert the OpenCV BGR frame into RGBA, the format Jetson expects
    frame_rgba = cv2.cvtColor(img, cv2.COLOR_BGR2RGBA)

    ### Move the frame into CUDA memory so the GPU can classify it efficiently
    cuda_frame = jetson.utils.cudaFromNumpy(frame_rgba)
Summary
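This step is the bridge between CPU and GPU: OpenCV decodes each frame into a NumPy array in host memory, and cudaFromNumpy copies it into CUDA memory.
Converting to RGBA first gives the classifier the channel layout it expects, so inference can run on the GPU without further conversion.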
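If the color conversion feels abstract, this tiny plain-Python sketch shows what cv2.COLOR_BGR2RGBA does to an 8-bit pixel: it swaps the blue and red channels and appends a fully opaque alpha value. The function names here are illustrative only; in the real script, OpenCV performs this for the whole frame at once.

```python
def bgr_to_rgba_pixel(pixel):
    """Convert one 8-bit (B, G, R) pixel to (R, G, B, A) with opaque alpha."""
    b, g, r = pixel
    return (r, g, b, 255)


def bgr_to_rgba_image(image):
    """Apply the per-pixel conversion to a nested list of rows of pixels."""
    return [[bgr_to_rgba_pixel(p) for p in row] for row in image]


# A 1x2 "image": one pure-blue pixel and one pure-red pixel, in BGR order.
tiny_bgr = [[(255, 0, 0), (0, 0, 255)]]
tiny_rgba = bgr_to_rgba_image(tiny_bgr)
print(tiny_rgba)
```

The practical takeaway is that skipping the conversion would not crash anything by itself, but red and blue would be swapped from the model's point of view, which can quietly hurt accuracy.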
Curious about segmentation pipelines? Check out: Image segmentation with UNetR
Classifying each frame and overlaying confident predictions
Short description: We run inference, fetch the class description, draw it on frames when the confidence exceeds 40%, and display the result.
    ### Classify the current CUDA frame and get top class and its confidence
    class_id, confidence = net.Classify(cuda_frame)

    ### Translate the class ID into a human-readable label
    class_desc = net.GetClassDesc(class_id)

    ### If the model is more than 40% confident, overlay the label on the frame
    if confidence > 0.4:
        cv2.putText(img, class_desc, (30, 80), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 0, 0), 3)

    ### Show the annotated frame in a window named 'img'
    cv2.imshow('img', img)

    ### Position the window at the top-left of the screen for convenience
    cv2.moveWindow('img', 0, 0)

    ### Exit cleanly if the user presses the 'q' key
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

### Release the video capture device/file handle
cap.release()

### Close all OpenCV windows created during execution
cv2.destroyAllWindows()
Summary
We tied everything together: inference → label → overlay → display.
The threshold keeps labels clean and reduces flicker.
You can adjust the threshold, font, color, or position to match your use case.
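If you want the threshold to be easy to tune and the confidence to appear next to the label, one option is to pull that decision into a small helper. This is a sketch, not part of the original script: overlay_text is a hypothetical name, and "tiger" is just an example label.

```python
def overlay_text(class_desc, confidence, threshold=0.4):
    """Return the text to draw, or None when confidence is not above threshold."""
    if confidence > threshold:
        return f"{class_desc}: {confidence:.2f}"
    return None


print(overlay_text("tiger", 0.873))  # confident enough, so text is returned
print(overlay_text("tiger", 0.25))   # below the threshold, so nothing to draw
```

In the loop, you would call text = overlay_text(class_desc, confidence) and only call cv2.putText when text is not None. Raising the threshold argument makes the overlay appear less often but with fewer wrong labels.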
Why convert frames to RGBA on Jetson?
RGBA matches Jetson’s expected format for CUDA-accelerated processing, ensuring correct memory layout and speed.
Can I use a different classifier than GoogLeNet?
Yes, swap "googlenet" with another Jetson-supported model like "resnet-18" after verifying availability on your device.
What is a good confidence threshold?
A threshold around 0.4–0.6 balances responsiveness and accuracy; adjust based on your footage and use case.
How do I display confidence with the label?
Concatenate the label with confidence, e.g., “class: 0.87”, before passing it to cv2.putText.
Can I save the annotated output video?
Yes. Use OpenCV’s VideoWriter with matching resolution and FPS to record annotated frames.
Does it run in real time on all Jetsons?
Throughput varies by model; lowering resolution or using lighter networks improves FPS on smaller devices.
How do I quit the playback window?
Press the ‘q’ key while the window is focused to exit cleanly.
Is this only for prerecorded video?
You can also use live streams or cameras by changing the VideoCapture source.
How can I filter to wildlife-only labels?
Whitelist known animal categories or fine-tune a model on your wildlife dataset for higher precision.
Can I run this headless without a display?
Yes. Skip imshow and write frames to disk, or use Jetson utilities suitable for headless environments.
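The whitelist and friendly-name ideas from the FAQ can be combined into a single lookup. Treat this as a sketch under assumptions: the label strings and the FRIENDLY_NAMES entries below are made-up examples, and you should check what net.GetClassDesc actually returns on your device before relying on exact matches.

```python
# Hypothetical mapping from model labels to display names.
# Any label missing from this dict is treated as "not wildlife" and hidden.
FRIENDLY_NAMES = {
    "tiger": "Tiger",
    "african elephant": "Elephant",
    "zebra": "Zebra",
}


def wildlife_name(class_desc, friendly_names=FRIENDLY_NAMES):
    """Return a display name for whitelisted labels, or None otherwise.

    Matching is case-insensitive and ignores surrounding whitespace.
    """
    return friendly_names.get(class_desc.strip().lower())


print(wildlife_name("Zebra"))       # whitelisted, so a display name comes back
print(wildlife_name("sports car"))  # not in the whitelist, so None
```

Using one dictionary for both jobs keeps the whitelist and the display names in sync: adding an animal means adding one entry.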
Conclusion
You’ve just implemented a real-time wildlife video classifier on Jetson with OpenCV.
The pipeline demonstrates a clean approach to frame handling, GPU conversion, and high-confidence overlays, all running smoothly on edge hardware.
From here, it’s simple to swap models (e.g., ResNet-18), log predictions, save annotated videos, or restrict outputs to an animal-only whitelist.
This pattern scales to interactive exhibits, conservation dashboards, camera traps, or any project where edge AI meets high-value video content.
Prefer live cameras? Here’s a Jetson + OpenCV webcam classification variant you can adapt: Real-Time Image Classification with Jetson and OpenCV
Connect :
☕ Buy me a coffee — https://ko-fi.com/eranfeit
🖥️ Email : feitgemel@gmail.com
🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb
Enjoy,
Eran