Data Augmentation

See HOWTO for a guide on adding custom augmentation transforms.

Built-in Transforms

All transforms live in yolo/data/augmentation.py and follow the same interface: __call__(image, boxes) -> (image, boxes).

| Class | Default prob | Description |
|---|---|---|
| PadAndResize | - | Letterbox-pads and resizes image to a target size while adjusting boxes |
| RemoveOutliers | - | Drops bounding boxes whose area falls below a minimum threshold |
| HorizontalFlip | 0.5 | Randomly flips image and boxes horizontally |
| VerticalFlip | 0.5 | Randomly flips image and boxes vertically |
| Mosaic | 0.5 | Stitches four dataset images into one; boxes are adjusted accordingly |
| MixUp | 0.5 | Alpha-blends two images and merges their box lists |
| RandomCrop | 0.5 | Crops image to half its size; clips boxes to the new boundary |
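
To make "adjusting boxes" concrete, here is a minimal sketch of the coordinate update that a horizontal flip implies. It is not the library's HorizontalFlip implementation: it assumes boxes are plain (x_min, y_min, x_max, y_max) tuples normalized to [0, 1], and flip_boxes_horizontally is a hypothetical helper name used only for illustration.

```python
from typing import List, Tuple

# Assumed layout: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def flip_boxes_horizontally(boxes: List[Box]) -> List[Box]:
    """Mirror normalized boxes around the vertical center line of the image."""
    flipped = []
    for x_min, y_min, x_max, y_max in boxes:
        # The old right edge becomes the new left edge, and vice versa.
        flipped.append((1.0 - x_max, y_min, 1.0 - x_min, y_max))
    return flipped


boxes = [(0.0, 0.25, 0.5, 0.75)]
print(flip_boxes_horizontally(boxes))  # [(0.5, 0.25, 1.0, 0.75)]
```

Flipping twice returns the original boxes, which is a convenient sanity check for any flip-style transform.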

AugmentationComposer chains these transforms together and handles image-size updates for transforms like PadAndResize that need the target size at runtime:

from yolo.data.augmentation import AugmentationComposer, HorizontalFlip, Mosaic, PadAndResize

image_size = 640  # target size shared by PadAndResize and the composer

transforms = AugmentationComposer(
    [Mosaic(), HorizontalFlip(), PadAndResize(image_size)],
    image_size=image_size,
)
image, boxes = transforms(image, boxes)
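
The chaining behavior can be sketched without the library. The class below is a hypothetical stand-in, not AugmentationComposer itself: it only applies transforms in list order and omits the image-size handling described above. SimpleComposer, tag_a, and tag_b are illustrative names, and the "image" is a plain list used as a placeholder.

```python
from typing import Any, Callable, Sequence, Tuple

# Any callable with the (image, boxes) -> (image, boxes) interface.
Transform = Callable[[Any, Any], Tuple[Any, Any]]


class SimpleComposer:
    """Hypothetical stand-in that applies transforms in list order."""

    def __init__(self, transforms: Sequence[Transform]):
        self.transforms = list(transforms)

    def __call__(self, image: Any, boxes: Any) -> Tuple[Any, Any]:
        # Each transform receives the output of the previous one.
        for transform in self.transforms:
            image, boxes = transform(image, boxes)
        return image, boxes


# Two toy transforms that record the order in which they ran.
def tag_a(image, boxes):
    return image + ["a"], boxes


def tag_b(image, boxes):
    return image + ["b"], boxes


pipeline = SimpleComposer([tag_a, tag_b])
image, boxes = pipeline([], [(0.1, 0.1, 0.2, 0.2)])
print(image)  # ['a', 'b']
```

Because every transform shares one signature, the order of the list is the only thing that determines the order of application.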

Writing a Custom Transform

A transform is any callable that accepts (image: PIL.Image, boxes: Tensor) and returns (image, boxes):

import random


class MyTransform:
    def __init__(self, prob: float = 0.5):
        self.prob = prob  # probability of applying the transform

    def __call__(self, image, boxes):
        # Return the inputs unchanged with probability 1 - prob.
        if random.random() > self.prob:
            return image, boxes
        # ... apply transform ...
        return image, boxes

Pass it to AugmentationComposer alongside the built-in transforms:

transforms = AugmentationComposer([MyTransform(0.3), HorizontalFlip()], image_size=640)
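
For a more concrete picture, the sketch below fills in the transform body with a simple box filter. It is an illustration under stated assumptions rather than library code: DropTinyBoxes is a hypothetical class, boxes are plain (x_min, y_min, x_max, y_max) tuples normalized to [0, 1] instead of a Tensor, and the image is passed through untouched. A private random.Random instance is used so that a seed makes runs reproducible.

```python
import random
from typing import Any, List, Optional, Tuple

# Assumed layout: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]


class DropTinyBoxes:
    """Hypothetical transform: with probability `prob`, drop boxes below `min_area`."""

    def __init__(self, min_area: float = 0.01, prob: float = 0.5, seed: Optional[int] = None):
        self.min_area = min_area
        self.prob = prob
        self.rng = random.Random(seed)  # private RNG so runs can be reproduced

    def __call__(self, image: Any, boxes: List[Box]) -> Tuple[Any, List[Box]]:
        # rng.random() is in [0, 1), so prob=1.0 always applies and prob=0.0 never does.
        if self.rng.random() >= self.prob:
            return image, boxes
        kept = [b for b in boxes if (b[2] - b[0]) * (b[3] - b[1]) >= self.min_area]
        return image, kept


boxes = [(0.0, 0.0, 0.5, 0.5), (0.0, 0.0, 0.0625, 0.0625)]
always = DropTinyBoxes(min_area=0.01, prob=1.0)
never = DropTinyBoxes(min_area=0.01, prob=0.0)
print(always("img", boxes)[1])  # only the large box remains
print(never("img", boxes)[1])   # both boxes are kept
```

Setting prob to 1.0 or 0.0, as above, is a simple way to test a probabilistic transform deterministically.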

To override the augmentation pipeline for training, pass a custom AugmentationComposer to create_dataloader.