LetterBox

Targets:

image

mask

bboxes

keypoints

Image Types: uint8, float32

Scale the image to fit a target canvas while preserving its aspect ratio, then pad it to the exact canvas size. This is the YOLO-style letterbox, equivalent to LongestMaxSize followed by PadIfNeeded.

The image is downscaled or upscaled so its longest side fits the target, then constant-color padding fills the remaining area. All targets (masks, bboxes, keypoints) are adjusted accordingly.
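The resize-then-pad geometry described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the helper name `letterbox_geometry` is hypothetical, the scale is taken as the smaller of the two side ratios (which for a square target is the same as fitting the longest side), sizes are rounded to the nearest pixel, and centered placement is assumed to split the padding evenly.

```python
import numpy as np


def letterbox_geometry(image_hw, target_hw):
    """Return (scale, (new_h, new_w), (top, bottom, left, right)) for a centered letterbox.

    Hypothetical helper for illustration; not part of albumentations.
    """
    h, w = image_hw
    target_h, target_w = target_hw
    # One scale factor for both axes preserves the aspect ratio;
    # taking the smaller ratio guarantees the resized image fits the canvas.
    scale = min(target_h / h, target_w / w)
    # The rounding policy is an assumption; clamp so rounding never exceeds the canvas.
    new_h = min(target_h, max(1, round(h * scale)))
    new_w = min(target_w, max(1, round(w * scale)))
    pad_h, pad_w = target_h - new_h, target_w - new_w
    # Assumed centered placement: any odd leftover pixel goes to the bottom/right.
    top, left = pad_h // 2, pad_w // 2
    return scale, (new_h, new_w), (top, pad_h - top, left, pad_w - left)


scale, (new_h, new_w), (top, bottom, left, right) = letterbox_geometry((480, 640), (640, 640))

# Here scale == 1.0, so no resampling is needed and the padding step can be shown alone.
image = np.zeros((480, 640, 3), dtype=np.uint8)
canvas = np.full((640, 640, 3), 114, dtype=np.uint8)  # constant fill value
canvas[top:top + new_h, left:left + new_w] = image
```

For a 480x640 image and a 640x640 canvas, this sketch leaves the image at its original size and adds 80 rows of padding above and below it.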

Arguments

size

tuple[int, int]

Target (height, width) of the output canvas.

interpolation

0 | 6 | 1 | 2 | 3 | 4 | 5

Interpolation method used when resizing the image. Default: cv2.INTER_LINEAR.

mask_interpolation

0 | 6 | 1 | 2 | 3 | 4 | 5

Interpolation method used when resizing masks. Default: cv2.INTER_NEAREST.

fill

tuple[float, ...] | float

114

Constant pixel value for image padding. Default: 114.

fill_mask

tuple[float, ...] | float

Constant pixel value for mask padding. Default: 0.

position

center

Where to place the resized image on the canvas. Default: "center".

p

float

Probability of applying the transform. Default: 1.0.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (480, 640), dtype=np.uint8)
>>> bboxes = np.array([[100, 80, 300, 200]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[200, 150]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
...     A.LetterBox(size=(640, 640), fill=114, fill_mask=0, p=1.0)
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels,
... )
>>> result['image'].shape
(640, 640, 3)

Notes

The output size is always exactly (height, width).
Images smaller than the target are upscaled; images larger are downscaled.
Bounding boxes and keypoints are adjusted for both the resize and padding steps.
fill=114 is the YOLO convention for letterbox padding.
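
The adjustment of bounding boxes and keypoints noted above amounts to a scale followed by a shift. The sketch below assumes pixel coordinates (pascal_voc boxes as x_min, y_min, x_max, y_max and keypoints as x, y), a known scale factor, and known left and top padding. The helper names `map_bboxes` and `map_keypoints` are hypothetical, and the library's exact rounding or clipping may differ.

```python
import numpy as np


def map_bboxes(bboxes, scale, pad_left, pad_top):
    """Map pascal_voc boxes through a resize by `scale` and a pad offset (hypothetical helper)."""
    bboxes = np.asarray(bboxes, dtype=np.float32)
    # x coordinates shift by the left padding, y coordinates by the top padding.
    offset = np.array([pad_left, pad_top, pad_left, pad_top], dtype=np.float32)
    return bboxes * scale + offset


def map_keypoints(keypoints, scale, pad_left, pad_top):
    """Map (x, y) keypoints through the same resize and pad offset (hypothetical helper)."""
    keypoints = np.asarray(keypoints, dtype=np.float32)
    return keypoints * scale + np.array([pad_left, pad_top], dtype=np.float32)


# A 480x640 image centered on a 640x640 canvas: scale 1.0, 80 px of padding on top, none on the left.
new_bboxes = map_bboxes([[100, 80, 300, 200]], scale=1.0, pad_left=0, pad_top=80)
new_keypoints = map_keypoints([[200, 150]], scale=1.0, pad_left=0, pad_top=80)
```

Under these assumptions the box from the example moves to (100, 160, 300, 280) and the keypoint to (200, 230): only the y coordinates change, because all padding is vertical.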
