Easy Detectron2 Object Detection Tutorial for Beginners


Last Updated on 13/11/2025 by Eran Feit

Getting comfortable with a Detectron2 object detection tutorial

A Detectron2 object detection tutorial is all about turning raw images into meaningful, labeled scenes using one of Facebook AI’s most powerful computer vision libraries.
Instead of hand-writing complex neural networks from scratch, you stand on the shoulders of pre-trained models like Faster R-CNN and ResNet, and focus on how to prepare your environment, configure the model, and run fast, reliable inference on your own images.

At a high level, the goal is simple: load an image, pass it through a Detectron2 model, and get back bounding boxes and class labels that tell you what is in the scene and where it is.
Behind the scenes, though, there’s a lot going on.
You need to install PyTorch, compile Detectron2, choose a model from the Model Zoo, and adjust configuration options such as the confidence threshold, device selection (CPU vs GPU), and visualization settings.

A good tutorial guides you through each of these steps with clear, reproducible commands.
You start by creating a clean Conda environment, installing PyTorch and Detectron2 on Windows or WSL, and opening the project in VS Code so your Python code runs in the right context.
Then you import the core libraries (PyTorch, OpenCV, Detectron2 modules), load an image, resize it for faster processing, and build a configuration object that points to a specific Model Zoo configuration and its pre-trained weights.

Once everything is wired together, a single call to DefaultPredictor runs the entire detection pipeline for you.
The model reads the image, extracts multi-scale features, proposes regions of interest, refines bounding boxes, and outputs both the classes and coordinates of the detected objects.
Using Detectron2’s Visualizer, you can overlay those predictions directly on top of the original image, creating an intuitive, color-coded result window that makes it easy to see if the model is working well or needs further tuning or training on custom data.

Figure: structure of Detectron2

If you enjoy this Detectron2 object detection tutorial and want to go beyond bounding boxes, check out my guide on Easy Detectron2 Instance Segmentation for pixel-perfect masks on your own images.

Custom diagram concept for this tutorial

[Input Image]
      │
      ▼
[Resize with OpenCV]
      │
      ▼
[Detectron2 Config]  (COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml)
      │
      ▼
[DefaultPredictor]  (PyTorch + Model Zoo Weights)
      │
      ▼
[Detections]  (Classes + Bounding Boxes)
      │
      ▼
[Visualizer Overlay]
      └─> [Output Window: "predict"]

Let’s walk through this Detectron2 object detection tutorial code

In this Detectron2 object detection tutorial, the code is built to take you from raw setup all the way to seeing real object detections drawn on top of an image.
The main target of the code is to show how to install Detectron2 in a clean environment, load a pre-trained model from the Model Zoo, and run object detection on a single image with clear visual results.
Instead of getting lost in theory, the script focuses on a practical, end-to-end workflow that you can reuse for your own projects, whether you run on native Windows with a CPU or inside WSL.

The first part of the tutorial code is all about preparation.
You create a dedicated Conda environment, install the right PyTorch and Detectron2 versions, and make sure everything is wired correctly in Visual Studio Code.
There is also a separate setup for WSL, which is helpful if you want to run Detectron2 in a more Linux-like environment or take advantage of CUDA on a supported GPU.
By isolating your environment this way, you reduce package conflicts and make it easier to debug or upgrade later.

The Python section of the code starts by importing the essential libraries: PyTorch, Detectron2 modules, NumPy, and OpenCV.
You then point to a sample image, read it with OpenCV, and resize it to a smaller percentage to speed up inference and reduce memory usage.
This resizing step is especially practical when you are testing on a CPU-only setup or on high-resolution photos, because it keeps the tutorial fast and responsive while you experiment.

Next, the script builds a configuration object using get_cfg() and merges it with a pre-defined Faster R-CNN model from the COCO-Detection section of the Detectron2 Model Zoo.
The code sets the confidence threshold for predictions, loads the correct pre-trained weights, and explicitly chooses the device as "cpu" so it works even without a GPU.
With this configuration in place, a DefaultPredictor object is created, and a single line runs the model on the resized image to produce class labels and bounding boxes.
Finally, the tutorial uses the Visualizer utility to draw these predictions on top of the image and opens two windows: one for the original resized image and another for the detected objects, giving you an immediate, visual understanding of how well Detectron2 is performing.

You can watch the tutorial here: https://youtu.be/VKiYGmkmQMY

The code for the tutorial is available here: https://eranfeit.lemonsqueezy.com/buy/97e13b97-0d84-427d-a74e-94e301c8a90b

or here: https://ko-fi.com/s/cdf53ba991

Link for Medium users: https://medium.com/object-detection-tutorials/easy-detectron2-object-detection-tutorial-for-beginners-a7271485a54b

You can follow my blog here: https://eranfeit.net/blog/

Want to get started with Computer Vision or take your skills to the next level?

If you’re just beginning, I recommend this step-by-step course designed to introduce you to the foundations of Computer Vision – Complete Computer Vision Bootcamp With PyTorch & TensorFlow

If you’re already experienced and looking for more advanced techniques, check out this deep-dive course – Modern Computer Vision GPT, PyTorch, Keras, OpenCV4

Getting comfortable with this Detectron2 object detection tutorial

In this Detectron2 object detection tutorial, we are going to build a complete, practical pipeline that runs object detection on a single image using a pre-trained Faster R-CNN model from the Detectron2 Model Zoo.
Instead of starting from the theory of convolutional networks, we focus on the steps that actually get you from “empty environment” to “bounding boxes drawn on a real image” on your own machine.

The main target of the code is to show how to install Detectron2 in a clean Conda environment, configure it to run on CPU or through WSL, and then apply a ready-made COCO model to detect objects.
You will see how to load an image with OpenCV, resize it for faster inference, build a configuration object, and use DefaultPredictor to run detection in one line of Python.

Along the way, we keep everything beginner friendly while still using the same building blocks professionals rely on.
You will work with the Detectron2 Model Zoo, confidence thresholds, pre-trained weights, and the Visualizer utility that overlays predictions on the image.
By the end, you will not only have working code, but also a clear mental map of what each part does and how to adapt it to your own images or datasets.

This tutorial is especially useful if you are working on Windows, want to experiment with WSL and CUDA later, or simply want an easy Detectron2 object detection tutorial you can copy, paste, and tweak.
The code is intentionally structured in clear parts so you can return later, grab only what you need, and plug it into other projects.

Let’s walk through the Detectron2 object detection tutorial code

The code for this Detectron2 object detection tutorial is organized as a simple end-to-end script.
First, you set up your environment on Windows and optionally WSL, install PyTorch and Detectron2, and then open the folder in Visual Studio Code.

Then you jump into the Python section and import your core libraries.
You load an image from disk with OpenCV, which returns it as a NumPy array, and scale it down so inference is snappy even on CPU.
This resized image is what you send into the model.

Next comes configuration.
Using get_cfg(), you start from Detectron2's default config and merge in a specific Faster R-CNN model from the COCO-Detection section of the Model Zoo.
You set the confidence threshold, load the pre-trained weights, and tell Detectron2 to run on CPU.

Finally, you create a DefaultPredictor with this configuration and pass the image into it.
The predictor returns instances with predicted classes and bounding boxes, which you print for inspection.
Using the Visualizer, you draw these predictions on top of the image and display both the original and annotated versions in OpenCV windows.

Setting up your Detectron2 environment on Windows

This first part focuses on creating a clean Conda environment on Windows, installing PyTorch in CPU-only mode, and preparing all dependencies you need for Detectron2.
This keeps your Detectron2 object detection tutorial reproducible and avoids conflicts with other projects.

### Create a new Conda environment named "detectorn2" with Python 3.9 so Detectron2 has its own isolated setup.
conda create -n detectorn2 python=3.9

### Activate the "detectorn2" environment so all upcoming installs go into this dedicated space.
conda activate detectorn2

### Install PyTorch 1.13.0, torchvision, and torchaudio with CPU-only support from the official PyTorch channel.
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 cpuonly -c pytorch

### Optionally, install an older PyTorch 1.10.1 stack for compatibility with some setups if needed.
# conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cpuonly -c pytorch

### Install the Ninja build system to speed up compilation steps Detectron2 may require.
pip install ninja

### Install a compatible version of PyYAML used by Detectron2 configuration files and other utilities.
pip install pyyaml==5.1

### Install pywin32 so Python can interface smoothly with Windows-specific APIs where needed.
pip install pywin32

### Install OpenCV for Python so you can load, resize, and display images in your object detection pipeline.
pip install opencv-python

### Install fvcore, a core utility library developed by Facebook Research that Detectron2 depends on.
pip install git+https://github.com/facebookresearch/fvcore

### Install Cython, which helps compile some of the optimized extensions required by COCO and Detectron2.
pip install cython

### Install the COCO API for Python from a maintained Git repository to support COCO-style datasets and evaluation.
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

This Windows setup gives you a ready-to-use environment for running Detectron2 on CPU.
It is ideal for following this Detectron2 object detection tutorial step by step before switching to GPU later.

Cloning and building Detectron2 on Windows and WSL

Now we clone the Detectron2 repository, build it from source, and set things up both for classic Windows and for a WSL environment.
This part ensures that the same Detectron2 object detection tutorial code can run in either place with minimal changes.

### Choose or create a folder on disk where you want to keep the Detectron2 source code.
# For example: cd C:/temp or any workspace directory you prefer.

### Clone the official Detectron2 repository so you have the latest source locally.
git clone https://github.com/facebookresearch/detectron2.git

### Move into the newly cloned Detectron2 folder so build commands run in the correct path.
cd detectron2

### Build and install Detectron2 in "editable" mode so your environment can import it as a Python package.
python setup.py build develop

### Important reminder to open Visual Studio Code with this Detectron2 folder as the working directory.
# Later, you will open VS Code and select this directory as your project root.


### -----------------------------
### WSL (Linux) specific workflow
### -----------------------------

### Create a new Conda environment inside WSL specifically for running Detectron2 on Ubuntu.
conda create -n detectorn2Wsl python=3.9

### Activate the WSL environment so PyTorch and Detectron2 installations are contained here.
conda activate detectorn2Wsl

### Install a GPU-enabled PyTorch build with CUDA 11.6 support using the PyTorch and NVIDIA channels.
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.6 -c pytorch -c nvidia

### Create a fresh folder inside WSL to hold your Detectron2 source checkout.
# mkdir wsl
# cd wsl

### Clone the same Detectron2 repository inside your WSL environment to build it on Linux.
git clone https://github.com/facebookresearch/detectron2.git

### Enter the Detectron2 folder so you can run the build script.
cd detectron2

### Build and install Detectron2 in develop mode under WSL so Python inside WSL can import it.
python setup.py build develop

### Launch Visual Studio Code from this folder so your editor is attached directly to the WSL Detectron2 workspace.
code .

With this step, you have Detectron2 installed and ready both on native Windows (CPU) and on WSL (GPU capable).
That means the rest of this Detectron2 object detection tutorial can run smoothly wherever you prefer to work.
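Before moving on, it can help to confirm that the packages installed above are visible from the active environment. The snippet below is a minimal sketch and not part of the original tutorial code. The helper name `check_packages` is hypothetical, and it only uses the standard library to look each package up without importing it.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each package name to True if it can be found."""
    status = {}
    for name in names:
        # find_spec returns None when a top-level package is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


# Import names (not pip names) of the packages this tutorial relies on.
required = ["torch", "torchvision", "cv2", "numpy", "detectron2", "pycocotools", "fvcore"]

for name, found in check_packages(required).items():
    print(f"{name:12s} {'OK' if found else 'MISSING'}")
```

If any line reports MISSING, re-run the matching install command inside the activated Conda environment before continuing.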

Loading and resizing the image for Detectron2

In this section, you import core libraries and prepare the input image.
Resizing the image makes your Detectron2 object detection tutorial more responsive, especially when running on CPU.

Here is the test image:

Test image for object detection

### Import PyTorch to ensure the deep learning backend is available and working correctly.
import torch

### Import the main Detectron2 package so you can access models, config utilities, and tools.
import detectron2

### Import NumPy to handle arrays and numerical operations that back your image data.
import numpy as np

### Import OpenCV to read images from disk, resize them, and display visual results.
import cv2

### Import the Detectron2 model zoo helper to easily load predefined configurations and checkpoints.
from detectron2 import model_zoo

### Import the DefaultPredictor class, which wraps preprocessing, model loading, and inference into one object.
from detectron2.engine import DefaultPredictor

### Import the configuration helper so you can build and customize Detectron2 config objects.
from detectron2.config import get_cfg

### Import the Visualizer utility to draw bounding boxes and labels on top of your images.
from detectron2.utils.visualizer import Visualizer

### Import MetadataCatalog to access dataset-specific class names and colors for visualization.
from detectron2.data import MetadataCatalog

### Define the path to the image you want to run object detection on.
imagePath = "Best-Object-Detection-models/Detectron2/Simple-Object_detection/pexels-brett-sayles-1115171.jpg"

### Read the image from disk using OpenCV so it becomes a NumPy array in BGR format.
img = cv2.imread(imagePath)

### Stop early with a clear error if the file could not be read (cv2.imread returns None on failure).
if img is None:
    raise FileNotFoundError(imagePath)

### Choose a scaling percentage to reduce the image size and speed up model inference.
scale_percent = 30

### Compute the new width as a percentage of the original width based on the chosen scale.
width = int(img.shape[1] * scale_percent / 100)

### Compute the new height as a percentage of the original height based on the chosen scale.
height = int(img.shape[0] * scale_percent / 100)

### Store the new dimensions in the (width, height) tuple that OpenCV expects for resizing.
dim = (width, height)

### Resize the original image to the new dimensions using an interpolation method suitable for shrinking.
myNewImage = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)

After this step, you have a clean, resized image ready for your Detectron2 object detection tutorial.
The smaller size makes inference faster while still preserving enough detail for meaningful detections.
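A fixed percentage works well for the sample photo, but your own images may vary a lot in size. As one possible alternative, the sketch below caps the longest side at a chosen number of pixels while keeping the aspect ratio. The helper `fit_within` and its `max_side` default are illustrative assumptions, not part of the original tutorial code.

```python
def fit_within(width, height, max_side=800):
    """Return (new_width, new_height) with the longest side at most max_side.

    Images that are already small enough are returned unchanged, and the
    aspect ratio is preserved for the rest.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Keep each side at least 1 pixel so a resize never gets a zero dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000))  # (800, 600)
print(fit_within(640, 480))    # (640, 480)
```

In the tutorial script you could compute the target size with `fit_within(img.shape[1], img.shape[0])` and pass the result to `cv2.resize` in place of the percentage-based tuple.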

Once you are comfortable loading pretrained models from the Detectron2 model zoo, the next step is often training on your own dataset. You can follow my step-by-step post Detectron2 Custom Dataset Training Made Easy to register COCO-style data and fine-tune Mask R-CNN on your specific objects.

Configuring the Detectron2 model and predictor

Now we configure Detectron2 by starting from a base configuration and merging in a specific Faster R-CNN model from the COCO-Detection section of the Model Zoo.
This is the heart of the Detectron2 object detection tutorial because it defines which pre-trained model you use and how predictions behave.

### Create a fresh Detectron2 configuration object that you will customize for object detection.
cfg_keypoint = get_cfg()

### Merge a predefined Faster R-CNN configuration from the COCO Detection section of the model zoo into your config.
cfg_keypoint.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml"))

### Set the minimum score threshold for predicted bounding boxes so only confident detections are kept.
cfg_keypoint.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # higher values keep only more confident detections

### Load the pre-trained model weights that correspond to the same Faster R-CNN configuration from the model zoo.
cfg_keypoint.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml")

### Explicitly set the device to CPU so the model runs even without a GPU, which is great for quick tests.
cfg_keypoint.MODEL.DEVICE = "cpu"  # if you run on Linux and have CUDA, you can delete this line

### Create a DefaultPredictor object that bundles preprocessing, model loading, and inference using your configuration.
predictor = DefaultPredictor(cfg_keypoint)

With just a few lines, you now have a fully configured Detectron2 predictor running a strong Faster R-CNN model.
This keeps your Detectron2 object detection tutorial simple while still leveraging a powerful architecture.
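If you move between the CPU-only Windows setup and the CUDA-enabled WSL setup, you may prefer not to hard-code the device. The following is a small sketch of one way to choose it at runtime. The function name `pick_device` is hypothetical; it relies only on `torch.cuda.is_available()` and falls back to "cpu" when PyTorch cannot be imported.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch nothing will run anyway; fall back to the safe default.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Detectron2 device:", device)
```

With this helper, the device line in the configuration could become `cfg_keypoint.MODEL.DEVICE = pick_device()`, so the same script runs in both environments.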

Running inference and visualizing Detectron2 predictions

The final part of the tutorial runs object detection, prints raw predictions, and uses the Visualizer to draw bounding boxes and labels on the image.
This is where your Detectron2 object detection tutorial becomes visual and intuitive.

### Run the predictor on the resized image to obtain object detection predictions.
outputs = predictor(myNewImage)

### Print the predicted class indices for each detected instance so you can inspect what the model found.
print(outputs["instances"].pred_classes)

### Print the predicted bounding boxes for each detected instance so you can see the coordinates of each detection.
print(outputs["instances"].pred_boxes)

### Create a Visualizer object with the RGB version of the image and the training dataset metadata for class names and colors.
v = Visualizer(myNewImage[:, :, ::-1], MetadataCatalog.get(cfg_keypoint.DATASETS.TRAIN[0]), scale=1.2)

### Move the predictions to the CPU and draw them on top of the image.
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))

### Convert the visualized image back from RGB to BGR format so OpenCV can display it correctly.
img = out.get_image()[:, :, ::-1]

### Show the resized original image in an OpenCV window so you can compare before and after.
cv2.imshow("img", myNewImage)

### Show the image with Detectron2 predictions overlaid so you can see bounding boxes and labels.
cv2.imshow("predict", img)

### Wait indefinitely for a key press so the windows stay open until you decide to close them.
cv2.waitKey(0)

### Destroy all OpenCV windows to cleanly close the visual display when you are done.
cv2.destroyAllWindows()

Once you run this code, you will see two windows.
One shows the resized original image.
The other shows the same image with bounding boxes and labels from Detectron2 drawn on top.
This completes the core Detectron2 object detection tutorial pipeline.
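The printed `pred_classes` values are integer class indices, which are hard to read on their own. Here is a rough sketch of how they could be turned into readable lines. The helper `describe_detections` is hypothetical, and the sample values are stand-ins so the snippet runs without a model; the comments show where the real values would likely come from.

```python
def describe_detections(class_ids, scores, class_names):
    """Pair each detection's class index with its name and score."""
    lines = []
    for class_id, score in zip(class_ids, scores):
        lines.append(f"{class_names[class_id]}: {score:.2f}")
    return lines


# Stand-in values for illustration only (not real model output).
# With the tutorial's variables you would pass something like:
#   instances = outputs["instances"].to("cpu")
#   class_ids = instances.pred_classes.tolist()
#   scores = instances.scores.tolist()
#   class_names = MetadataCatalog.get(cfg_keypoint.DATASETS.TRAIN[0]).thing_classes
sample_names = ["person", "bicycle", "car"]
for line in describe_detections([0, 2, 0], [0.98, 0.87, 0.61], sample_names):
    print(line)
```

This prints one "name: score" line per detection, which is often easier to scan than raw tensors while you tune the threshold.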

If you are curious about richer scene understanding beyond standard detection, explore my tutorials on Detectron2 Panoptic Segmentation Made Easy For Beginners and YOLOv8 Segmentation Tutorial for Multi-Class Football to compare different segmentation workflows.

FAQ

What is Detectron2 and why use it for object detection?

Detectron2 is a PyTorch-based library from Meta AI that implements modern detection and segmentation models. It lets you run powerful object detectors with concise, readable Python code.

Do I need a GPU for this Detectron2 object detection tutorial?

No, this tutorial is designed to run on CPU by setting the device to “cpu” in the configuration. A GPU makes inference faster, but it is optional for the basic example.

Which Detectron2 model are we using in this tutorial?

We use a Faster R-CNN model with a ResNet-101 FPN backbone from the COCO-Detection section of the Detectron2 Model Zoo. The deeper ResNet-101 backbone typically gives better accuracy than the ResNet-50 variant, at the cost of slower inference.

Can I switch to a different Detectron2 model from the Model Zoo?

Yes, you can change the configuration string to any other model from the Model Zoo, such as a ResNet-50 backbone. The rest of the pipeline stays almost the same.
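As a rough sketch of what switching models could look like, the configuration steps can be wrapped in one function that takes the Model Zoo config name as a parameter. The function `build_predictor` and the `MODEL_CHOICES` dictionary are hypothetical names introduced here for illustration; the Detectron2 calls are the same ones used in the tutorial script, and they are imported inside the function so the snippet can be loaded even where Detectron2 is not installed.

```python
# Short aliases for two Faster R-CNN configs from the COCO-Detection section.
MODEL_CHOICES = {
    "r50": "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml",
    "r101": "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml",
}


def build_predictor(config_name, score_threshold=0.5, device="cpu"):
    """Create a DefaultPredictor for the given Model Zoo config name."""
    # Imported here so this module loads without Detectron2 present.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    # Use the weights that match the chosen config.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    cfg.MODEL.DEVICE = device
    return DefaultPredictor(cfg)


print("Available aliases:", ", ".join(MODEL_CHOICES))
```

Calling `build_predictor(MODEL_CHOICES["r50"])` would then give you a lighter model, and the inference and visualization code can stay as it is.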

What does the SCORE_THRESH_TEST parameter do?

SCORE_THRESH_TEST (set under cfg.MODEL.ROI_HEADS) is the minimum confidence score a detection needs in order to be kept. Increase it to see fewer, more confident boxes, or decrease it to see more, including uncertain ones. Take care with the spelling: the key is THRESH, not TRESH.
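To get a feel for the effect, here is a tiny standalone illustration of score filtering. It is only a sketch with made-up scores: the helper `keep_confident` is hypothetical and simply mimics the idea of keeping detections whose score is above the threshold. It does not call Detectron2.

```python
def keep_confident(detections, threshold):
    """Keep only (label, score) pairs whose score is above the threshold."""
    return [(label, score) for label, score in detections if score > threshold]


# Made-up scores for illustration only.
detections = [("person", 0.97), ("car", 0.72), ("dog", 0.41), ("bench", 0.08)]

for threshold in (0.05, 0.5, 0.9):
    kept = keep_confident(detections, threshold)
    print(threshold, [label for label, _ in kept])
```

Raising the threshold shortens the list, which is the same trade-off you make when you change the value in the configuration.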

How do I run this tutorial on my own image files?

Update the imagePath variable to point to your image and ensure the path is valid. The predictor and visualization code will work the same for any compatible image.

Why do we resize the image before running Detectron2?

Resizing reduces the resolution so inference is faster and uses less memory. It is especially helpful when working on CPU or with very large source images.

What should I do if Detectron2 fails to build?

Check that your PyTorch version matches Detectron2 requirements and that build tools like compilers are installed. Using a clean Conda environment often resolves conflicting dependency issues.

Can I save the annotated output image to disk?

Yes, after drawing predictions you can call cv2.imwrite with a file name and the annotated image. This lets you keep a permanent copy of your detection results.
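A possible way to do this is sketched below. The helper names `annotated_path` and `save_annotated` and the "_detected" suffix are assumptions made for illustration; only `cv2.imwrite` comes from OpenCV, and it is imported inside the function so the path helper works on its own.

```python
from pathlib import Path


def annotated_path(image_path, suffix="_detected"):
    """Build an output file name next to the input, e.g. photo.jpg -> photo_detected.jpg."""
    source = Path(image_path)
    return source.with_name(source.stem + suffix + source.suffix)


def save_annotated(image_path, annotated_image):
    """Write the annotated BGR image to disk and return the path used."""
    import cv2  # imported here so the path helper works without OpenCV

    target = annotated_path(image_path)
    # cv2.imwrite returns False when the file could not be written.
    if not cv2.imwrite(str(target), annotated_image):
        raise IOError(f"Could not write {target}")
    return target


print(annotated_path("images/street.jpg"))
```

In the tutorial script, calling `save_annotated(imagePath, img)` after the RGB-to-BGR conversion would store the annotated result next to the source image.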

What are good next steps after this Detectron2 object detection tutorial?

Next steps include training on custom datasets, exploring instance or panoptic segmentation, and integrating Detectron2 into larger applications or APIs for real projects.

Conclusion

In this easy Detectron2 object detection tutorial for beginners, you built a complete pipeline from scratch.
You created clean Conda environments on Windows and WSL, installed PyTorch and Detectron2 correctly, and cloned the repository so everything stays organized and reproducible.

On the Python side, you learned how to import the key Detectron2 modules, load and resize an image with OpenCV, and prepare that image as input for a powerful Faster R-CNN model from the COCO-Detection Model Zoo.
The configuration step showed you how to adjust confidence thresholds, select devices, and load pretrained weights with just a few lines of code.

You then used DefaultPredictor to perform inference in a single call, extracted predicted classes and bounding boxes, and visualized the results with the built-in Visualizer.
Seeing bounding boxes and labels drawn directly on the image makes it clear that the entire pipeline is working as intended and ready for experimentation.

From here, you can extend this tutorial in many directions.
You can batch-process folders of images, save predictions to disk, connect this code to a web API, or move on to instance and panoptic segmentation with Detectron2 or YOLO-based models.
Whatever path you choose, you now have a solid, working Detectron2 object detection tutorial as a foundation for your future computer vision projects.
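As a starting point for the batch-processing idea, the sketch below collects image files from a folder and runs a predictor on each one. The names `list_images`, `run_on_folder`, and `IMAGE_EXTENSIONS` are hypothetical, and the `predictor` argument is assumed to be the DefaultPredictor created earlier in the tutorial.

```python
from pathlib import Path

# File extensions treated as images in this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return image files in a folder, sorted by path, matched by extension."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def run_on_folder(folder, predictor):
    """Yield (path, outputs) for each readable image in the folder."""
    import cv2  # imported lazily so list_images works without OpenCV

    for path in list_images(folder):
        image = cv2.imread(str(path))
        if image is None:  # cv2.imread returns None for unreadable files
            continue
        yield path, predictor(image)
```

You could then loop with `for path, outputs in run_on_folder("my_images", predictor):` and reuse the printing or visualization code from the tutorial inside the loop; the folder name here is only an example.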

Connect:

☕ Buy me a coffee: https://ko-fi.com/eranfeit

🖥️ Email: feitgemel@gmail.com

🌐 https://eranfeit.net

🤝 Fiverr: https://www.fiverr.com/s/mB3Pbb

Enjoy,

Eran
