In this tutorial, we will show you how to use LightlyTrain to train a model on your own dataset for image classification.
Self-Supervised Learning (SSL) is reshaping computer vision, just like LLMs reshaped text. The newly launched LightlyTrain framework lets AI teams (no PhD required) pre-train robust models on their own datasets.
Let’s dive into how SSL with LightlyTrain can give you a head start over training from scratch. Imagine pre-training computer vision models without labeling a single image.
That’s exactly what LightlyTrain offers. It brings self-supervised pretraining to your real-world pipelines, using your unlabeled image or video data to kickstart model training. Labels are only needed later, in the fine-tuning step.
We will walk through how to load the model, modify it for your dataset, preprocess the images, load the trained weights, and run predictions, including drawing the predicted label on the image with PIL and displaying it with matplotlib.
Check out our tutorial here : https://youtu.be/MHXx2HY29uc
LightlyTrain page: https://www.lightly.ai/lightlytrain?utm_source=youtube&utm_medium=description&utm_campaign=eran
LightlyTrain Github : https://github.com/lightly-ai/lightly-train
LightlyTrain Docs: https://docs.lightly.ai/train/stable/index.html
Lightly Discord: https://discord.gg/xvNJW94
What You’ll Learn :
Part 1: Download and prepare the dataset
Part 2: How to pre-train on your custom dataset
Part 3: How to fine-tune your model with a new dataset / categories
Part 4: Test the model
Link for the code : https://ko-fi.com/s/9be1ebf845
You can find more tutorials, and join my newsletter here : https://eranfeit.net/
Here is the LightlyTrain tutorial code :
Step 1 — Download the datasets :
Follow the video tutorial to prepare the datasets for pre-training and fine-tuning.
In this tutorial we will use 2 datasets :
- Dataset for pre-training and feature extraction : 9 categories dataset : https://www.kaggle.com/datasets/muhammadhananasghar/9-dogs-breeds-identification-classification
- Dataset for fine-tuning : 70 dog breeds dataset (the fine-tuning code below reads a folder with 10 breeds) : https://www.kaggle.com/datasets/gpiosenka/70-dog-breedsimage-data-set
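Before training, it can help to check that each dataset is laid out as one sub-folder per breed, which is the layout the fine-tuning code expects. Here is a minimal sketch that counts the images in each class folder. The function name and the list of extensions are my own assumptions, not part of LightlyTrain or the video.

```python
from pathlib import Path

# File extensions treated as images (assumption; adjust to match your data).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def count_images_per_class(dataset_root):
    """Return {class_folder_name: image_count} for a one-folder-per-class dataset."""
    root = Path(dataset_root)
    counts = {}
    # Every direct sub-folder of the root is treated as one class.
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


if __name__ == "__main__":
    # Same folder the fine-tuning step reads; change it to your own location.
    dataset_root = Path("/mnt/d/Data-Sets-Image-Classification/10 dogs Breeds")
    if dataset_root.is_dir():
        for name, n in count_images_per_class(dataset_root).items():
            print(f"{name}: {n} images")
    else:
        print(f"Dataset folder not found: {dataset_root}")
```

A class with zero or very few images usually means the archive was extracted into the wrong folder level.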
Step 2 — Pre-train on the first dataset (9 categories) :
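import lightly_train

if __name__ == "__main__":
    # Self-supervised pre-training on the images of the 9-breed dataset (labels are not used)
    lightly_train.train(
        data="/mnt/d/Data-Sets-Image-Classification/9 dogs Breeds",  # folder with the training images
        out="/mnt/d/temp/models/lightly-train/Object-Classification/out/9-dogs-Breed",  # output folder for checkpoints and exported weights
        model="torchvision/resnet50",  # backbone to pre-train
        epochs=100,
        batch_size=32,
    )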
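The fine-tuning step reads the pre-trained weights from exported_models/exported_last.pt inside the out folder, so it is worth confirming that file exists before moving on. The sketch below is only an illustration of that check. Both helper names are hypothetical and are not part of the LightlyTrain API.

```python
from pathlib import Path


def exported_weights_path(out_dir):
    """Build the path where the fine-tuning step expects the pre-trained weights."""
    return Path(out_dir) / "exported_models" / "exported_last.pt"


def require_exported_weights(out_dir):
    """Return the weights path, or raise a clear error if the file is missing."""
    path = exported_weights_path(out_dir)
    if not path.is_file():
        raise FileNotFoundError(
            f"No exported weights at {path}. Did the pre-training run finish?"
        )
    return path


if __name__ == "__main__":
    # Same 'out' folder as in the pre-training call; change it to your own.
    out_dir = "/mnt/d/temp/models/lightly-train/Object-Classification/out/9-dogs-Breed"
    try:
        print("Pre-trained weights found:", require_exported_weights(out_dir))
    except FileNotFoundError as err:
        print(err)
```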
Step 3 — Fine-tune the model (using the second dataset, 10 categories) :
# Import necessary libraries for deep learning and utilities
import os

import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models
from tqdm import tqdm

# Select device: GPU if available, otherwise CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define image preprocessing pipeline: resize and convert to tensor
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Path to the dataset of 10 dog breeds
dataset_path = "/mnt/d/Data-Sets-Image-Classification/10 dogs Breeds"

# Path to the weights exported by the pre-training step
last_model = "/mnt/d/temp/models/lightly-train/Object-Classification/out/9-dogs-Breed/exported_models/exported_last.pt"

# Path to save the best model during fine-tuning
best_model_path = "/mnt/d/Temp/Models/lightly-train/Object-Classification/out/10-dogs-breed/fine-tune/best_fine_tuned_resnet50.pth"

# Path to save the final model after training
final_model_path = "/mnt/d/Temp/Models/lightly-train/Object-Classification/out/10-dogs-breed/fine-tune/final_fine_tuned_resnet50.pth"

# Ensure the output directories exist
os.makedirs(os.path.dirname(best_model_path), exist_ok=True)
os.makedirs(os.path.dirname(final_model_path), exist_ok=True)

# Load dataset from directory using ImageFolder and apply transforms
dataset = datasets.ImageFolder(root=dataset_path, transform=transform)

# Create a dataloader to iterate over the dataset
dataloader = DataLoader(dataset, batch_size=32, shuffle=True, drop_last=True)

# Create the ResNet50 architecture (no weights loaded yet)
model = models.resnet50()

# Load only the pre-trained weights (not a full checkpoint) and map them to CPU/GPU
model.load_state_dict(torch.load(last_model, weights_only=True, map_location=device))

# Replace the final classification layer to match the number of dog breed classes
model.fc = nn.Linear(model.fc.in_features, len(dataset.classes))

# Move the model to the selected device (GPU or CPU)
model = model.to(device)

# Define loss function (cross-entropy for classification)
criterion = nn.CrossEntropyLoss()

# Define optimizer (Adam with learning rate 0.001)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Print message to indicate training is starting
print("Starting fine-tuning...")

# Set training parameters
num_epochs = 200  # Maximum number of training epochs
patience = 10  # Early stopping patience
best_loss = float("inf")  # Initialize best loss to infinity
no_improvement_count = 0  # Counter for early stopping

# Begin training loop
for epoch in range(num_epochs):
    epoch_loss = 0.0  # Accumulate loss for the current epoch

    # Wrap dataloader with tqdm for a progress bar display
    with tqdm(dataloader, desc=f"Epoch {epoch+1}/{num_epochs}") as pbar:
        for inputs, labels in pbar:
            # Move data to device
            inputs = inputs.to(device)
            labels = labels.to(device)

            # Reset gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = model(inputs)

            # Compute loss
            loss = criterion(outputs, labels)

            # Backward pass and optimization
            loss.backward()
            optimizer.step()

            # Accumulate loss
            epoch_loss += loss.item()

            # Update tqdm progress bar with current loss
            pbar.set_postfix(loss=loss.item())

    # Compute average training loss for the epoch
    epoch_loss /= len(dataloader)
    print(f"Epoch [{epoch+1}/{num_epochs}] completed, Average Loss: {epoch_loss:.4f}")

    # Check if the current model is better (lower loss) than previous best
    if epoch_loss < best_loss:
        best_loss = epoch_loss
        torch.save(model.state_dict(), best_model_path)  # Save best model
        print(f"Best model updated at epoch {epoch+1} with loss {best_loss:.4f}")
        no_improvement_count = 0  # Reset counter if improvement found
    else:
        # If no improvement, increment counter
        no_improvement_count += 1
        print(f"No improvement in epoch {epoch+1}. Best loss so far: {best_loss:.4f}")
        print(f"Epochs without improvement: {no_improvement_count}/{patience}")

    # Trigger early stopping if no improvement for 'patience' epochs
    if no_improvement_count >= patience:
        print(f"Early stopping triggered after {epoch+1} epochs due to no improvement for {patience} consecutive epochs")
        break

# Save the final model at the end of training
torch.save(model.state_dict(), final_model_path)
print(f"Final model saved to {final_model_path}")
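The early stopping logic in the training loop can be easier to follow when it is pulled out on its own. This is a minimal sketch of the same patience rule as a small helper class. The class name and the loss values are made up for illustration, and it runs without PyTorch.

```python
class EarlyStopping:
    """Track the best loss and signal when it has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def update(self, loss):
        """Record one epoch's loss and return (improved, should_stop)."""
        if loss < self.best_loss:
            # New best: remember it and reset the counter.
            self.best_loss = loss
            self.epochs_without_improvement = 0
            return True, False
        # No improvement: count the epoch and stop once patience is used up.
        self.epochs_without_improvement += 1
        return False, self.epochs_without_improvement >= self.patience


if __name__ == "__main__":
    # Made-up average losses, one per epoch, purely for illustration.
    losses = [0.90, 0.70, 0.71, 0.72, 0.69, 0.70, 0.70, 0.70]
    stopper = EarlyStopping(patience=3)
    for epoch, loss in enumerate(losses, start=1):
        improved, should_stop = stopper.update(loss)
        status = "improved" if improved else "no improvement"
        print(f"Epoch {epoch}: loss={loss:.2f} ({status})")
        if should_stop:
            print(f"Early stopping at epoch {epoch}, best loss {stopper.best_loss:.2f}")
            break
```

In the training loop, the point where improved is True is where the best weights would be saved. The tutorial monitors the average training loss. The same rule works with a validation loss if you hold out part of the dataset.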
Step 4 — Test the model :
# Import necessary libraries for deep learning, image handling, and visualization
import torch
from torch import nn
from torchvision import datasets, transforms, models
from PIL import Image, ImageDraw, ImageFont
import matplotlib.pyplot as plt

# Set the device: use GPU if available, otherwise CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define image preprocessing pipeline: resize and convert to tensor (same as in fine-tuning)
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Path to the best fine-tuned model weights
best_model_path = "/mnt/d/Temp/Models/lightly-train/Object-Classification/out/10-dogs-breed/fine-tune/best_fine_tuned_resnet50.pth"

# Path to the original dataset to retrieve class names (folder names)
dataset_path = "/mnt/d/Data-Sets-Image-Classification/10 dogs Breeds"

# Load dataset structure to extract class names
dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
class_names = dataset.classes
print("Class names:", class_names)

# Initialize the ResNet50 model architecture
model = models.resnet50()

# Replace the final fully-connected layer to match number of dog breeds
model.fc = nn.Linear(model.fc.in_features, len(class_names))

# Load trained weights into the model
model.load_state_dict(torch.load(best_model_path, weights_only=True, map_location=device))

# Move model to the device (GPU or CPU) and set it to evaluation mode
model = model.to(device)
model.eval()

# Path to the test image to classify
# Uncomment different lines to test different images
# test_image_path = "Best-image-classification-models/lightly-train/Beagle.jpg"
# test_image_path = "Best-image-classification-models/lightly-train/Shih_Tzu.jpg"
test_image_path = "Best-image-classification-models/lightly-train/Collie.jpg"

# Load the test image using PIL and convert it to RGB
image = Image.open(test_image_path).convert("RGB")

# Resize and convert the image with the transform, then add a batch dimension and move to device
image_tensor = transform(image).unsqueeze(0).to(device)

# Perform inference with no gradient tracking
with torch.no_grad():
    output = model(image_tensor)
    predicted_class = torch.argmax(output, dim=1).item()  # Get index of highest score

# Get the class name from the predicted index
class_name = class_names[predicted_class]
print("===================================")
print(f"Predicted class: {class_name}")

# Draw predicted class name on the original image using PIL
draw = ImageDraw.Draw(image)
font = ImageFont.load_default()
draw.text((10, 10), class_name, fill="white", font=font)

# Display the annotated image using matplotlib
plt.imshow(image)
plt.axis("off")  # Hide axis for cleaner output
plt.title(f"Predicted class: {class_name}")
plt.show()
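The test script prints only the winning class. If you also want to see how confident the model is, you can turn the raw scores into probabilities with softmax and list the top few classes. The sketch below shows the idea in plain Python. The scores are made up and the helper names are my own. In the script you could get the scores with output[0].tolist(), or compute the same probabilities directly with torch.softmax(output, dim=1).

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 (numerically stable)."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, class_names, k=3):
    """Return the k most likely (class_name, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Made-up scores for three breeds, purely for illustration.
    class_names = ["Beagle", "Collie", "Shih_Tzu"]
    logits = [0.2, 3.1, 1.0]
    for name, prob in top_k(logits, class_names, k=3):
        print(f"{name}: {prob:.1%}")
```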
Connect :
☕ Buy me a coffee — https://ko-fi.com/eranfeit
🖥️ Email : feitgemel@gmail.com
🤝 Fiverr : https://www.fiverr.com/s/mB3Pbb
Enjoy,
Eran