...

Real-Time Jetson Image Classification with OpenCV and GoogLeNet

Jetson Image Classification

Introduction

Building a real-time image classification pipeline on the Jetson with OpenCV is one of the fastest ways to ship edge AI to production.
This tutorial shows how to capture frames from a webcam, convert them to CUDA memory, and classify them on-device with GoogLeNet using jetson.inference.
Our goal is a minimal Python app that displays the live camera feed, runs inference on every frame, and overlays the predicted label when confidence is high.
You will learn how to open a camera stream with OpenCV, prepare RGBA frames for GPU processing, run imageNet.Classify, and draw readable results on screen.
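Before the full live-camera script below, here is a minimal sketch of the core API calls on a single still image, assuming a local file named test.jpg (a placeholder name for any image you have on disk).

# ### Minimal sketch: classify one still image with GoogLeNet (assumes a local file named "test.jpg").
import cv2
import jetson.inference
import jetson.utils

# ### Load the image from disk (BGR) and convert it to RGBA for the CUDA helpers.
img = cv2.imread("test.jpg")
rgba = cv2.cvtColor(img, cv2.COLOR_BGR2RGBA)

# ### Copy the NumPy array into GPU memory and run a single classification pass.
net = jetson.inference.imageNet("googlenet")
class_id, confidence = net.Classify(jetson.utils.cudaFromNumpy(rgba))

# ### Print the human-readable label and its confidence.
print(net.GetClassDesc(class_id), confidence)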

You can watch the video here : https://youtu.be/S3i7yhhw11E

You can find more Nvidia Jetson Nano tutorials here : https://eranfeit.net/how-to-classify-objects-using-jetson-nano-inference-and-opencv/

You can find more similar tutorials in my blog posts page here : https://eranfeit.net/blog/

You can find the code here : https://ko-fi.com/s/7a72f61abe

Live Image Classification on Jetson with OpenCV and GoogLeNet

This section contains a single, copy-paste-ready script.
It opens the default camera, converts frames to CUDA, runs GoogLeNet classification, and draws class names above the image when confidence is above 0.40.

# ### Import OpenCV for video capture and on-screen drawing.
import cv2
# ### Import the Jetson inference module that provides pretrained classification networks like imageNet.
import jetson.inference
# ### Import Jetson utilities for GPU-friendly image conversions (NumPy <-> CUDA).
import jetson.utils

# ### Open the default camera (index 0) for live capture.
cap = cv2.VideoCapture(0)
# ### Set the camera capture width to 1280 pixels for a 720p layout.
cap.set(3, 1280)
# ### Set the camera capture height to 720 pixels.
cap.set(4, 720)

# ### Load the pretrained GoogLeNet classifier via jetson.inference.imageNet.
net = jetson.inference.imageNet("googlenet")

# ### Process frames in a loop while the camera is available.
while cap.isOpened():

    # ### Read a frame from the camera as a NumPy array (BGR color order).
    re, img = cap.read()
    # ### Stop the loop if a frame could not be read.
    if not re:
        break

    # ### Convert the BGR frame to RGBA for Jetson utilities and CUDA compatibility.
    frame_rgba = cv2.cvtColor(img, cv2.COLOR_BGR2RGBA)
    # ### Move the RGBA NumPy array into GPU memory as a CUDA image for fast inference.
    cuda_frame = jetson.utils.cudaFromNumpy(frame_rgba)

    # ### Run classification on the CUDA frame and obtain the top class ID and confidence score.
    class_id, confidence = net.Classify(cuda_frame)

    # ### Translate the predicted class ID into a human-readable label (e.g., "coffee mug").
    class_desc = net.GetClassDesc(class_id)

    # ### If the prediction is confident enough (greater than 0.40), draw the label on the image.
    if confidence > 0.4:
        cv2.putText(img, class_desc, (30, 80), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 0, 0), 3)

    # ### Show the current frame in a window titled 'img'.
    cv2.imshow('img', img)
    # ### Position the window at the top-left corner of the screen for convenience.
    cv2.moveWindow('img', 0, 0)

    # ### Exit the loop when the user presses the 'q' key.
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

# ### Release the camera resource gracefully.
cap.release()
# ### Close any OpenCV display windows that may be open.
cv2.destroyAllWindows()
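
Save the script under any name you like (for example classify_camera.py, an arbitrary name), run it with python3 on the Jetson, and press the q key to quit.
Swapping "googlenet" for another classification network name supported by jetson-inference, such as "resnet-18", should work the same way, provided that model was downloaded when jetson-inference was set up, and raising the 0.40 threshold reduces flickering labels on uncertain frames.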

You can find the code here : https://ko-fi.com/s/7a72f61abe

The script captures frames from the webcam, prepares an RGBA CUDA image, classifies it with GoogLeNet, and overlays a readable label when the confidence threshold is reached.
This minimal loop is an ideal jumping-off point for adding FPS counters, multi-label overlays, or custom thresholds and filters on Jetson devices.
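For example, a simple wall-clock FPS counter can be layered on top of the same loop. The helper below is only a sketch of that measurement pattern, not part of the original script; the overlay position and color are arbitrary choices.

# ### Sketch: a small FPS counter based on Python's time module.
import time

class FpsCounter:
    # ### Tracks frames-per-second from the wall-clock time between update() calls.
    def __init__(self):
        self.prev = time.time()
        self.fps = 0.0

    def update(self):
        now = time.time()
        self.fps = 1.0 / max(now - self.prev, 1e-6)  # guard against division by zero
        self.prev = now
        return self.fps

# ### Usage inside the camera loop, right after cap.read():
#     fps = counter.update()
#     cv2.putText(img, "FPS: {:.1f}".format(fps), (30, 140),
#                 cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 2)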

Connect :

☕ Buy me a coffee — https://ko-fi.com/eranfeit

🖥️ Email : feitgemel@gmail.com

🌐 https://eranfeit.net

🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb

Enjoy,

Eran
