
Getting Started with YOLOX for Object Detection


Last Updated on 17/11/2025 by Eran Feit


Getting to know YOLOX for modern object detection

YOLOX is a high-performance object detection model that builds on the YOLO family while rethinking some of its core design choices. Instead of relying on predefined anchor boxes, YOLOX uses an anchor-free approach, which simplifies the detection head and makes training and deployment easier in real-world projects. This design helps YOLOX strike a strong balance between speed and accuracy, making it a solid choice when you need real-time object detection that still delivers competitive results on benchmarks like COCO.

At a high level, the goal of YOLOX is to bridge the gap between cutting-edge research and industrial applications. The model incorporates several modern techniques, such as a decoupled detection head (separating classification and regression branches) and an advanced label-assignment strategy called SimOTA. Together, these updates improve how the network learns which predictions correspond to which ground-truth objects, leading to more stable training, cleaner bounding boxes, and higher average precision across different model sizes.
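If you like to see ideas as code, here is a minimal PyTorch sketch of the decoupled-head pattern. This is an illustration of the concept, not the actual YOLOX module: classification and box regression each get their own small branch instead of sharing one output convolution.

import torch
import torch.nn as nn

# Minimal, illustrative decoupled head (not the exact YOLOX module):
# classification and box regression get separate branches instead of one shared output.
class DecoupledHead(nn.Module):
    def __init__(self, in_channels=256, num_classes=80):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, in_channels, 1)
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_channels, num_classes, 1),  # per-cell class scores
        )
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_channels, 4 + 1, 1),        # box (x, y, w, h) plus objectness
        )

    def forward(self, feat):
        feat = self.stem(feat)
        return self.cls_branch(feat), self.reg_branch(feat)

head = DecoupledHead()
cls_out, reg_out = head(torch.randn(1, 256, 20, 20))
print(cls_out.shape, reg_out.shape)  # (1, 80, 20, 20) and (1, 5, 20, 20)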

The YOLOX family includes multiple variants so you can choose the right trade-off for your project. Lightweight models like YOLOX-Nano and YOLOX-Tiny are designed for edge devices and resource-constrained deployments, while larger versions such as YOLOX-s, YOLOX-m, YOLOX-l, and YOLOX-x push accuracy higher for GPU-powered servers. Despite their different sizes, all of these models share the same anchor-free design and modern training tricks, which is why YOLOX can exceed earlier YOLO versions like YOLOv3–v5 on the COCO benchmark in both speed–accuracy trade-off and overall performance.

Another key strength of YOLOX is its flexible deployment story. The official implementation is written in PyTorch, but the ecosystem goes far beyond a single framework. YOLOX models can be exported and accelerated using ONNX Runtime, TensorRT, ncnn, OpenVINO, and other runtimes, making it practical to run the same detector on servers, desktops, and embedded hardware. This makes YOLOX a practical option if your target is a full object detection pipeline—from experimentation in Python to production inference on optimized backends.

In short, YOLOX is designed for people who want modern object detection without getting lost in overly complex architectures. By combining an anchor-free design, a decoupled head, smart label assignment, and a rich model zoo, YOLOX gives you a flexible toolkit for detecting objects in images and videos with strong, reliable performance. Whether you are running a quick demo on a single image or building a large-scale application, YOLOX provides a solid foundation to get started with object detection in a practical, production-ready way.

If you enjoy working with modern object detectors like YOLOX, you might also like my step-by-step guide on SSD MobileNet v3 object detection explained for beginners, where we build another COCO-based detector using OpenCV.


Walking through our YOLOX tutorial step by step

This tutorial code is all about getting a practical, working YOLOX object detector running as quickly as possible.
Instead of starting from theory or complex configuration files, the code walks you through creating a fresh Conda environment, installing the YOLOX library, and downloading ready-made pretrained weights.
The goal is simple and hands-on: by the end of the script, you run one command, pass in an example image, and see YOLOX draw bounding boxes around the objects it finds.

The first part of the code focuses on a clean setup, especially on Windows using WSL.
You create a dedicated YoloX Conda environment with Python 3.7 so all required libraries sit in one place and do not interfere with other projects.
Then you clone the official YOLOX repository from GitHub, move into the project folder, and install the dependencies from requirements.txt, including an editable install of the YOLOX package itself.
This section makes sure that when you later run the demo command, everything from PyTorch to the YOLOX helpers is already configured and ready.

Next, the code prepares the model side of the tutorial.
You create a weights folder and download the yolox_x.pth file, which contains the pretrained YOLOX-X model trained on the COCO dataset.
YOLOX-X is the largest of the standard YOLOX models, offering the best accuracy, at the cost of more parameters and longer inference time.
By using these COCO weights, the code can immediately recognize 80 common object categories such as person, car, boat, and many others, without any additional training.

The final part of the code is where the actual object detection happens.
You copy an example image, like boat.jpg, into the assets folder and then call tools/demo.py in image mode, telling YOLOX which model (yolox-x), which weights file, and which input image to use.
Command-line arguments such as --conf 0.3 and --nms 0.65 control the confidence threshold and non-maximum suppression, so the detector only keeps reliable boxes and removes heavy duplicates around the same object.
The --tsize 640 parameter sets the input resolution, and --device gpu lets you run YOLOX on your GPU for faster results.
Once the script finishes, it saves the detection output under YOLOX_outputs/yolox_x, where you can open the image and immediately see how YOLOX has labeled the boat and people in the scene.
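To make the role of --conf and --nms concrete, here is a tiny self-contained sketch of the same post-processing idea, using toy boxes and torchvision's NMS operator rather than YOLOX's actual decoding code.

import torch
from torchvision.ops import nms

# Toy boxes in (x1, y1, x2, y2) format; the second box heavily overlaps the first.
boxes = torch.tensor([[100., 100., 200., 200.],
                      [105.,  98., 205., 198.],
                      [400., 300., 480., 380.]])
scores = torch.tensor([0.90, 0.85, 0.20])

conf_thr, nms_thr = 0.3, 0.65

keep = scores > conf_thr                          # --conf 0.3 drops low-confidence boxes
boxes, scores = boxes[keep], scores[keep]

kept = nms(boxes, scores, iou_threshold=nms_thr)  # --nms 0.65 removes heavy duplicates
print(boxes[kept], scores[kept])                  # only the best box per object survives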

Overall, this tutorial code is designed as a gentle yet powerful introduction to YOLOX.
It shows you how to move from a blank environment to a fully working object detection pipeline using pretrained models, step by step.
After you understand this flow, it becomes much easier to extend the same pattern to new images, video streams, or even custom datasets while still relying on the strong foundation that YOLOX provides.


Link for the video tutorial : https://youtu.be/GOpiP05f26k

Link for the tutorial files and code : https://eranfeit.lemonsqueezy.com/buy/eacb0e32-bff4-4b18-9112-e806fa55ce7e or here : https://ko-fi.com/s/9f5b5625bd

Link for Medium users : https://medium.com/@feitgemel/getting-started-with-yolox-for-object-detection-0a3aaa2b083f

You can follow my blog here : https://eranfeit.net/blog/

Want to get started with Computer Vision or take your skills to the next level?

If you’re just beginning, I recommend this step-by-step course designed to introduce you to the foundations of Computer Vision – Complete Computer Vision Bootcamp With PyTorch & TensorFlow

If you’re already experienced and looking for more advanced techniques, check out this deep-dive course – Modern Computer Vision GPT, PyTorch, Keras, OpenCV4


Getting started with YOLOX for object detection

YOLOX is a modern object detection framework that takes the classic YOLO idea and gives it a fresh, anchor-free design.
Instead of relying on predefined anchor boxes and a lot of hand-crafted rules, YOLOX simplifies the detection head and focuses on clean, end-to-end learning.
This makes the model easier to train, easier to debug, and often more accurate than earlier YOLO versions on benchmarks like COCO.
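As a toy illustration of what anchor-free means in practice: each feature-map cell predicts a center offset and a box size directly, scaled by the cell's stride, instead of adjusting a predefined anchor box. A minimal sketch of that decoding step:

import torch

# One feature-map cell at stride 32 predicts a center offset and a log-scale size directly.
stride = 32
grid_x, grid_y = 5, 3                       # the cell's indices on the feature map
pred = torch.tensor([0.4, 0.6, 1.2, 0.8])   # raw (dx, dy, log_w, log_h) for that cell

cx = (grid_x + pred[0]) * stride            # box center x in image pixels
cy = (grid_y + pred[1]) * stride            # box center y in image pixels
w = torch.exp(pred[2]) * stride             # box width in image pixels
h = torch.exp(pred[3]) * stride             # box height in image pixels
print(cx.item(), cy.item(), w.item(), h.item())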

The main goal of YOLOX is to bridge the gap between research and real-world projects.
It includes a family of models from very small to very large, so you can choose the right balance between speed and accuracy for your hardware.
Whether you are prototyping on a laptop or deploying on a GPU server, there is a YOLOX variant that fits your constraints.

In this tutorial we focus on using YOLOX-X, the largest and most accurate of the standard YOLOX models.
YOLOX-X is trained on the COCO dataset, which means it already knows how to detect 80 everyday object categories like person, car, boat, and dog.
By downloading these pretrained weights you can jump straight into detection without touching any training code.

The code below walks you from a blank environment to a working YOLOX object detector in a few clear steps.
You will create a dedicated Conda environment, clone the official YOLOX repository, install dependencies, download COCO weights, and finally run detection on a test image.
By the end, you will have a simple but powerful YOLOX object detection pipeline that you can adapt to your own images and projects.

Once your YOLOX environment is ready, it’s a great time to explore alternative frameworks too. In my easy Detectron2 object detection tutorial you’ll see how to train and run object detectors using Facebook’s Detectron2 library with PyTorch.

Setting up your YOLOX environment on WSL

In this first part we prepare a clean Conda environment dedicated to YOLOX.
Using a separate environment keeps your dependencies organized and avoids version conflicts with other projects.
Because YOLOX is typically run on Linux, this tutorial assumes you are using WSL on Windows and running the following commands inside a WSL terminal.

Open PowerShell or Windows Command Prompt in administrator mode by right-clicking and selecting “Run as administrator”.

### Create a new Conda environment named YoloX with Python 3.7 installed.
conda create -n YoloX python=3.7

### Activate the YoloX environment so that all future installations go into this isolated setup.
conda activate YoloX

This short setup ensures you have a reproducible base for all YOLOX experiments.
Any libraries you install next will stay inside this YoloX environment.

Cloning the YOLOX repository into your projects folder

Now that the environment is ready, the next step is to bring the YOLOX source code onto your machine.
You can choose any parent folder for your projects, for example a CoolPython folder under your home directory.
Inside that folder you will clone the official YOLOX repository and move into it so you can run all provided tools and scripts.

### Choose a folder on your machine where you keep Python and computer vision projects (for example a CoolPython folder).

### Clone the official YOLOX repository from GitHub into the current directory.
git clone https://github.com/Megvii-BaseDetection/YOLOX/

### Change into the newly cloned YOLOX project directory so you can work with its tools and scripts.
cd YOLOX

At this point your local folder contains the YOLOX codebase, example tools, documentation, and configuration files.
All remaining commands in this tutorial assume you are inside this YOLOX directory.

Installing YOLOX Python dependencies

With the code in place, YOLOX now needs its Python dependencies such as PyTorch, OpenCV, and other utility libraries.
The project provides a requirements.txt file so you can install everything in one go.
You also install YOLOX itself in editable mode so that local changes are immediately available when you import or run it.

### Upgrade pip to the latest version and then install all required Python packages from requirements.txt.
pip3 install -U pip && pip3 install -r requirements.txt

### Install YOLOX in editable mode so changes in the source code are reflected immediately.
pip3 install -v -e .
# or
python3 setup.py develop

### Install Weights & Biases (wandb) for optional experiment tracking and logging.
pip3 install wandb

After this step, your YoloX Conda environment has all the libraries needed to run the YOLOX demos.
If you open Python and import YOLOX modules, they should resolve without errors.
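A quick sanity check like the following, run inside the activated environment, confirms the editable install worked (assuming the package exposes a version string):

# Quick sanity check, run inside the activated YoloX environment.
import torch
import yolox

print("YOLOX version:", yolox.__version__)          # assumes the package exposes __version__
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())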

Downloading YOLOX-X COCO model weights

YOLOX offers a model zoo with several sizes, from tiny models for edge devices to large models for maximum accuracy.
In this tutorial we use YOLOX-X, which is the heaviest and most accurate standard model, trained on the COCO dataset with 80 object classes.
To use it, you first create a local weights folder and then download the pretrained weights file into it.

### Create a new folder called weights to store YOLOX pretrained model files.
mkdir weights

### Download the YOLOX-X pretrained weights file (trained on the COCO dataset) into the weights folder.
wget -P weights https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_x.pth

Once the download completes, you have a ready-to-use YOLOX-X checkpoint.
This file contains all the learned parameters needed for high-quality object detection on COCO classes.
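If you are curious what is inside the file, you can inspect it from Python. The "model" key below is an assumption based on the official release format:

import torch

# Peek inside the downloaded checkpoint; the "model" key is assumed from the release format.
ckpt = torch.load("weights/yolox_x.pth", map_location="cpu")
print(list(ckpt.keys()))                 # typically includes a "model" state dict
state_dict = ckpt.get("model", ckpt)
print(len(state_dict), "parameter tensors in the checkpoint")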

YOLOX relies on COCO-style datasets and labels, just like many other detectors. For another practical example that mixes classification and detection on COCO classes, check out Image Classification & Object Detection with TensorFlow and PixelLib .

More info :

The YOLOX model is already trained on the COCO dataset (80 object classes).
Here are the class names of COCO : https://gist.github.com/AruniRC/7b3dadd004da04c80198557db5da4bda
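You can also print these class names straight from the installed YOLOX package, which ships them as a tuple:

# The installed YOLOX package ships the 80 COCO class names as a tuple.
from yolox.data.datasets import COCO_CLASSES

print(len(COCO_CLASSES))   # 80
print(COCO_CLASSES[:10])   # ('person', 'bicycle', 'car', ...)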

Standard models :

List of YOLOX models :

YOLOX-s -> fastest of the standard models
YOLOX-m
YOLOX-l
YOLOX-x -> best accuracy, but with a lot of parameters (longer inference time)
YOLOX-Darknet53

Light models :
YOLOX-Nano
YOLOX-Tiny


Running a simple YOLOX object detection demo

Now it is time to see YOLOX in action.
You will run a simple inference script on a single test image and ask the model to detect objects like boats and people.
Make sure you have an image such as boat.jpg copied into the assets folder inside the YOLOX project before running the command.

Here is a sample image :

Boat test image
### Run the YOLOX demo script in image mode using the YOLOX-X model and the downloaded COCO weights on the boat.jpg test image.
### --conf 0.3 keeps only detections with confidence above 30 percent.
### --nms 0.65 applies non-maximum suppression so overlapping boxes for the same object are merged.
### --tsize 640 sets the testing image size to 640x640, balancing speed and accuracy.
### --save_result tells YOLOX to save the output image with drawn bounding boxes.
### --device gpu runs the model on your GPU for faster inference if available.
python tools/demo.py image -n yolox-x -c weights/yolox_x.pth --path assets/boat.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu

After the script finishes, look inside the YOLOX_outputs/yolox_x folder.
You should see a new image where YOLOX has drawn bounding boxes and labels around the detected boat and people.
This confirms that your YOLOX installation, COCO weights, and demo pipeline are all working correctly.
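Because the demo writes results into a timestamped subfolder, a small helper like this can open the newest output for you. The vis_res path is an assumption based on how demo.py stores its visualizations:

import glob
import os

import cv2

# Open the newest annotated image saved by the demo.
# The vis_res subfolder layout is an assumption based on how demo.py stores results.
outputs = glob.glob("YOLOX_outputs/yolox_x/vis_res/*/*.jpg")
latest = max(outputs, key=os.path.getmtime)   # most recently written result
print("Opening:", latest)
cv2.imshow("YOLOX detections", cv2.imread(latest))
cv2.waitKey(0)
cv2.destroyAllWindows()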


FAQ

What is YOLOX in object detection?

YOLOX is an anchor-free version of the YOLO object detector that uses a decoupled head and improved label assignment to achieve strong accuracy with a simpler design, especially on COCO-style datasets.

Why does this tutorial use the YOLOX-X model?

YOLOX-X is the largest standard model in the YOLOX family and provides the best accuracy on COCO, making it ideal for demonstrating high-quality detections in a single-image demo.

Do I need a GPU to run the YOLOX demo?

A GPU is recommended for YOLOX-X because it speeds up inference significantly, but you can still run the demo on CPU if you are willing to accept slower processing times.

What does the confidence threshold in YOLOX control?

The confidence threshold defines the minimum score a detection must have to be kept, so increasing it reduces false positives while lowering it reveals more objects at the cost of extra noise.

Why is non-maximum suppression important in YOLOX?

Non-maximum suppression merges highly overlapping boxes so each object is represented by a single, best bounding box, which makes YOLOX outputs cleaner and easier to interpret.

What dataset are the YOLOX pretrained weights based on?

The pretrained YOLOX weights used in this tutorial are trained on the COCO dataset, which includes 80 everyday object categories such as person, car, dog, and boat.

Can I replace boat.jpg with my own image in this YOLOX demo?

Yes, you can change the --path argument to point to any image you like, and YOLOX will run detection on that file and save the annotated result in the YOLOX_outputs folder.

Is YOLOX suitable for production object detection systems?

YOLOX is designed for real-world use and supports deployment through ONNX, TensorRT, and other runtimes, making it a strong candidate for production object detection pipelines.
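As a minimal sketch of that deployment path: after exporting the model with the repository's tools/export_onnx.py script, loading it in ONNX Runtime looks roughly like this (the yolox_x.onnx filename is whatever you chose at export time):

import onnxruntime as ort

# Load an exported model; "yolox_x.onnx" is whatever filename you chose during export.
sess = ort.InferenceSession("yolox_x.onnx", providers=["CPUExecutionProvider"])
print([inp.name for inp in sess.get_inputs()])    # input tensor names
print([out.name for out in sess.get_outputs()])   # output tensor names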

Can I train YOLOX on my own custom dataset later?

You can train or fine-tune YOLOX on your own dataset by preparing COCO-style annotations and following the official training scripts provided in the repository.
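For reference, COCO-style annotations are plain JSON with three main lists. Here is a minimal, made-up example of the structure the training tools expect:

# Minimal, made-up COCO-style annotation structure (field names follow the COCO format).
coco_gt = {
    "images": [{"id": 1, "file_name": "boat.jpg", "width": 640, "height": 480}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 9,                     # 9 = boat in the COCO category list
        "bbox": [120.0, 200.0, 150.0, 80.0],  # [x, y, width, height] in pixels
        "area": 12000.0,
        "iscrowd": 0,
    }],
    "categories": [{"id": 9, "name": "boat"}],
}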

How does this YOLOX tutorial fit into a bigger learning path?

This YOLOX tutorial gives you a first working detector using pretrained weights, and you can build on it by exploring other posts on SSD MobileNet v3, Detectron2, and transformer-based detectors for more advanced projects.


Conclusion

In this post you built a complete YOLOX object detection pipeline starting from a fresh Conda environment.
You prepared a dedicated YoloX environment, cloned the official YOLOX repository, installed all required dependencies, and downloaded the YOLOX-X pretrained weights trained on the COCO dataset.
With just a single demo command you were able to run YOLOX on a test image and visually inspect the bounding boxes and labels it produced.

Along the way you saw how the different command-line arguments fit together.
The confidence threshold and non-maximum suppression settings helped clean up noisy detections, while the test image size and GPU flag gave you control over the speed–accuracy trade-off.
These same ideas apply when you move from a single static image to video streams, webcams, or more advanced real-time applications.

Most importantly, you now have a clear mental model of how a YOLOX project is structured.
There is an environment, a repository, installed dependencies, pretrained weights, and a small number of scripts that tie everything together.
Once this foundation feels comfortable, you can start customizing labels, training on your own datasets, or integrating YOLOX with other tools like tracking, analytics, or segmentation models.

From here, you can explore the other object detection tutorials on this blog to compare YOLOX with SSD MobileNet v3, Detectron2, or DETR and choose the right tool for each project.
Whichever path you take, this simple YOLOX setup gives you a reliable starting point for building, testing, and deploying powerful object detection systems in Python.

If you’re interested in transformer-based detectors after YOLOX, you can read How to use DETR for smart bone fracture detection to see how detection transformers compare to classic CNN-based models in a real medical use case.

Connect :

☕ Buy me a coffee — https://ko-fi.com/eranfeit

🖥️ Email : feitgemel@gmail.com

🌐 https://eranfeit.net

🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb

Enjoy,

Eran
