LetterBox

Targets:
image
mask
bboxes
keypoints
Image Types: uint8, float32

Scale the image to fit a target canvas while preserving aspect ratio, then pad to the exact canvas size. This is the YOLO-style letterbox, equivalent to LongestMaxSize followed by PadIfNeeded.

The image is downscaled or upscaled so its longest side fits the target, then constant-color padding fills the remaining area. All targets (masks, bboxes, keypoints) are adjusted accordingly.
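
The resize-then-pad geometry described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper name `letterbox_geometry`, the rounding rule, and the way odd padding is split for centered placement are all assumptions.

```python
def letterbox_geometry(src_hw, dst_hw):
    """Return (scale, resized_hw, pads) for a centered letterbox.

    Hypothetical helper for illustration. The rounding rule and the split
    of odd padding are assumptions and may differ from the library.
    pads is (top, bottom, left, right).
    """
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw

    # A single scale factor for both axes preserves the aspect ratio;
    # taking the smaller ratio guarantees the result fits the canvas.
    scale = min(dst_h / src_h, dst_w / src_w)
    new_h = min(dst_h, max(1, round(src_h * scale)))
    new_w = min(dst_w, max(1, round(src_w * scale)))

    # Centered placement: split the leftover space evenly; any odd
    # pixel goes to the bottom / right side (assumption).
    pad_top = (dst_h - new_h) // 2
    pad_left = (dst_w - new_w) // 2
    pads = (pad_top, dst_h - new_h - pad_top, pad_left, dst_w - new_w - pad_left)
    return scale, (new_h, new_w), pads


# A 480x640 image on a 640x640 canvas: no resize needed, 80 rows of
# padding above and below.
print(letterbox_geometry((480, 640), (640, 640)))
# (1.0, (480, 640), (80, 80, 0, 0))
```

Under these assumptions, the padded regions would be filled with `fill` for the image and `fill_mask` for the mask.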

Arguments
size
tuple[int, int]

Target (height, width) of the output canvas.

interpolation
0 | 6 | 1 | 2 | 3 | 4 | 5
1

Interpolation method used when resizing the image. Default: cv2.INTER_LINEAR.

mask_interpolation
0 | 6 | 1 | 2 | 3 | 4 | 5
0

Interpolation method used when resizing masks. Default: cv2.INTER_NEAREST.

fill
tuple[float, ...] | float
114

Constant pixel value for image padding. Default: 114.

fill_mask
tuple[float, ...] | float
0

Constant pixel value for mask padding. Default: 0.

position
center | top_left | top_right | bottom_left | bottom_right | random
center

Where to place the resized image on the canvas. Default: "center".
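
As a rough illustration of what the position values mean, the sketch below maps each option to the (top, left) offset of the resized image on the canvas. The helper `placement_offset` is hypothetical, and treating "random" as a uniformly sampled integer offset is an assumption rather than confirmed library behavior.

```python
import random


def placement_offset(position, free_h, free_w, rng=random):
    """Map a position name to the (top, left) offset of the resized image.

    free_h / free_w are the canvas rows / columns left over after resizing.
    Hypothetical helper; the handling of "random" is an assumption.
    """
    if position == "center":
        return free_h // 2, free_w // 2
    if position == "top_left":
        return 0, 0
    if position == "top_right":
        return 0, free_w
    if position == "bottom_left":
        return free_h, 0
    if position == "bottom_right":
        return free_h, free_w
    if position == "random":
        # Assumed: any integer offset that keeps the image on the canvas.
        return rng.randint(0, free_h), rng.randint(0, free_w)
    raise ValueError(f"unknown position: {position!r}")


# A 480x640 image on a 640x640 canvas leaves 160 free rows and 0 free columns.
for name in ("center", "top_left", "bottom_right"):
    print(name, placement_offset(name, 160, 0))
```

Whatever the placement, the padding on the opposite sides is simply the free space minus the chosen offset, so the output size stays fixed.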

p
float
1

Probability of applying the transform. Default: 1.0.

Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (480, 640), dtype=np.uint8)
>>> bboxes = np.array([[100, 80, 300, 200]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[200, 150]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
...     A.LetterBox(size=(640, 640), fill=114, fill_mask=0, p=1.0)
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels,
... )
>>> result['image'].shape
(640, 640, 3)

Notes
  • The output size is always exactly (height, width).
  • Images smaller than the target are upscaled; images larger are downscaled.
  • Bounding boxes and keypoints are adjusted for both the resize and padding steps.
  • fill=114 is the YOLO convention for letterbox padding.
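
The coordinate adjustment mentioned in the notes amounts to a scale followed by a shift. The sketch below applies it to the bounding box and keypoint from the example, assuming pascal_voc pixel coordinates and centered placement. The helper names `map_point` and `map_bbox` are hypothetical. For a 480x640 image on a 640x640 canvas, the scale is 1.0 and the content moves down by 80 rows.

```python
def map_point(x, y, scale, pad_left, pad_top):
    """Apply the resize (scale) and then the padding shift to one point."""
    return x * scale + pad_left, y * scale + pad_top


def map_bbox(bbox, scale, pad_left, pad_top):
    """Map an (x_min, y_min, x_max, y_max) box by mapping its two corners."""
    x_min, y_min, x_max, y_max = bbox
    x0, y0 = map_point(x_min, y_min, scale, pad_left, pad_top)
    x1, y1 = map_point(x_max, y_max, scale, pad_left, pad_top)
    return x0, y0, x1, y1


# Values for a 480x640 image centered on a 640x640 canvas (assumed geometry).
scale, pad_left, pad_top = 1.0, 0, 80

print(map_bbox((100, 80, 300, 200), scale, pad_left, pad_top))
# (100.0, 160.0, 300.0, 280.0)
print(map_point(200, 150, scale, pad_left, pad_top))
# (200.0, 230.0)
```

This sketch works in pixel units; normalized bbox formats would need the canvas size factored in as well.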