PadIfNeeded

Targets: image, mask, bboxes, keypoints, volume, mask3d

Image Types: uint8, float32

Pads the sides of an image if the image dimensions are less than the specified minimum dimensions. If the pad_height_divisor or pad_width_divisor is specified, the function additionally ensures that the image dimensions are divisible by these values.
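The sizing rule described above can be sketched in plain Python. This is only an illustration, not the library's implementation: `target_size` is a hypothetical helper, and it assumes that a minimum leaves larger sides untouched while a divisor rounds a side up to the next multiple.

```python
def target_size(size, min_size=None, divisor=None):
    """Return the padded length of one image side (hypothetical helper)."""
    # Mirror the documented constraint: exactly one of the two options is set.
    if (min_size is None) == (divisor is None):
        raise ValueError("Set exactly one of min_size or divisor")
    if min_size is not None:
        # Pad only when the side is shorter than the requested minimum.
        return max(size, min_size)
    # Round up to the next multiple of divisor (unchanged if already divisible).
    return -(-size // divisor) * divisor


print(target_size(100, min_size=150))  # side grows to the minimum
print(target_size(200, min_size=150))  # side already large enough
print(target_size(100, divisor=32))    # side rounded up to a multiple of 32
```

For a 100-pixel side, a minimum of 150 yields 150 and a divisor of 32 yields 128, which matches the shapes quoted in the examples on this page.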

Arguments

min_height (int | None, default: 1024)
    Minimum desired height of the image. Ensures image height is at least this value. If not specified, pad_height_divisor must be provided.

min_width (int | None, default: 1024)
    Minimum desired width of the image. Ensures image width is at least this value. If not specified, pad_width_divisor must be provided.

pad_height_divisor (int | None)
    If set, pads the image height to make it divisible by this value. If not specified, min_height must be provided.

pad_width_divisor (int | None)
    If set, pads the image width to make it divisible by this value. If not specified, min_width must be provided.

position (default: 'center')
    Position where the image is to be placed after padding. Default is 'center'.

border_mode (0 | 1 | 2 | 3 | 4)
    Specifies the border mode to use if padding is required. The default is cv2.BORDER_CONSTANT.

fill (tuple[float, ...] | float)
    Value to fill the border pixels if the border mode is cv2.BORDER_CONSTANT. Default is None.

fill_mask (tuple[float, ...] | float)
    Similar to fill but used for padding masks. Default is None.

p (float)
    Probability of applying the transform. Default is 1.0.
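To make the effect of position concrete, the sketch below splits a total amount of padding into per-side amounts. It is an assumption-laden illustration rather than the library's code: `pad_amounts` is a hypothetical helper, only 'center' and 'top_left' appear in the examples on this page, the other corner names are assumed by analogy, and the rounding used for odd totals under 'center' is a guess.

```python
def pad_amounts(pad_h, pad_w, position="center"):
    """Split total padding into (top, bottom, left, right) (hypothetical helper)."""
    if position == "center":
        # Assumption: an odd total puts the extra pixel on the bottom/right.
        top, left = pad_h // 2, pad_w // 2
    elif position == "top_left":
        # Image stays in the top-left corner, so padding goes bottom and right.
        top, left = 0, 0
    elif position == "top_right":
        top, left = 0, pad_w
    elif position == "bottom_left":
        top, left = pad_h, 0
    elif position == "bottom_right":
        top, left = pad_h, pad_w
    else:
        raise ValueError(f"Unsupported position: {position}")
    return top, pad_h - top, left, pad_w - left


# A 50x50 image padded to 100x100 needs 50 extra pixels per axis.
print(pad_amounts(50, 50, "center"))
print(pad_amounts(50, 50, "top_left"))
```

With 'center' the 50 extra pixels are split 25/25 on each axis; with 'top_left' all of them go to the bottom and right edges.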

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Example 1: Basic usage with min_height and min_width
>>> transform = A.Compose([
...     A.PadIfNeeded(min_height=150, min_width=200, border_mode=cv2.BORDER_CONSTANT, fill=0),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> padded = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the padded data
>>> padded_image = padded['image']  # Shape will be (150, 200, 3)
>>> padded_mask = padded['mask']    # Shape will be (150, 200)
>>> padded_bboxes = padded['bboxes']  # Bounding boxes adjusted for the padded image
>>> padded_bbox_labels = padded['bbox_labels']  # Labels remain unchanged
>>> padded_keypoints = padded['keypoints']  # Keypoints adjusted for the padded image
>>> padded_keypoint_labels = padded['keypoint_labels']  # Labels remain unchanged
>>>
>>> # Example 2: Using pad_height_divisor and pad_width_divisor
>>> # This ensures the output dimensions are divisible by the specified values
>>> transform_divisor = A.Compose([
...     A.PadIfNeeded(
...         pad_height_divisor=32,
...         pad_width_divisor=32,
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0
...     ),
... ])
>>>
>>> padded_divisor = transform_divisor(image=image)
>>> padded_divisor_image = padded_divisor['image']  # Shape will be (128, 128, 3) - divisible by 32
>>>
>>> # Example 3: Different position options
>>> # Create a small recognizable image for better visualization of positioning
>>> small_image = np.zeros((50, 50, 3), dtype=np.uint8)
>>> small_image[20:30, 20:30, :] = 255  # White square in the middle
>>>
>>> # Top-left positioning
>>> top_left_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         position="top_left",
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=128  # Gray padding
...     ),
... ])
>>> top_left_result = top_left_pad(image=small_image)
>>> top_left_image = top_left_result['image']  # Image will be at top-left of 100x100 canvas
>>>
>>> # Center positioning (default)
>>> center_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         position="center",
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=128
...     ),
... ])
>>> center_result = center_pad(image=small_image)
>>> center_image = center_result['image']  # Image will be centered in 100x100 canvas
>>>
>>> # Example 4: Different border_mode options
>>> # Reflection padding
>>> reflect_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         border_mode=cv2.BORDER_REFLECT_101
...     ),
... ])
>>> reflected = reflect_pad(image=small_image)
>>> reflected_image = reflected['image']  # Will use reflection for padding
>>>
>>> # Replication padding
>>> replicate_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         border_mode=cv2.BORDER_REPLICATE
...     ),
... ])
>>> replicated = replicate_pad(image=small_image)
>>> replicated_image = replicated['image']  # Will use edge replication for padding
>>>
>>> # Example 5: Working with masks and custom fill values
>>> binary_mask = np.zeros((50, 50), dtype=np.uint8)
>>> binary_mask[10:40, 10:40] = 1  # Set center region to 1
>>>
>>> mask_transform = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0,          # Black padding for image
...         fill_mask=0      # Use 0 for mask padding (background)
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']))
>>>
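>>> # The image must have the same height and width as the 50x50 mask,
>>> # and the boxes must lie inside it, so use small_image and smaller boxes
>>> small_bboxes = np.array([[5, 5, 25, 25], [20, 20, 45, 45]], dtype=np.float32)
>>> padded_mask_result = mask_transform(
...     image=small_image,
...     mask=binary_mask,
...     bboxes=small_bboxes,
...     bbox_labels=bbox_labels
... )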
>>> padded_binary_mask = padded_mask_result['mask']  # Shape will be (100, 100)
>>> padded_result_bboxes = padded_mask_result['bboxes']  # Adjusted for padding
>>> padded_result_bbox_labels = padded_mask_result['bbox_labels']  # Labels remain unchanged
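
The border_mode values 0 to 4 correspond to the OpenCV constants cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP and cv2.BORDER_REFLECT_101. The sketch below imitates each mode on a one-dimensional row in plain Python so the differences are easy to see. `pad_row` is a hypothetical helper written for this illustration; it does not call OpenCV or Albumentations.

```python
def pad_row(row, pad, mode, fill=0):
    """Pad a 1-D list on both sides, imitating one OpenCV border mode."""
    n = len(row)
    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(row[i])             # inside the original row
        elif mode == "constant":           # cv2.BORDER_CONSTANT (0)
            out.append(fill)
        elif mode == "replicate":          # cv2.BORDER_REPLICATE (1)
            out.append(row[min(max(i, 0), n - 1)])
        elif mode == "reflect":            # cv2.BORDER_REFLECT (2): edge value repeated
            j = i % (2 * n)
            out.append(row[j if j < n else 2 * n - 1 - j])
        elif mode == "wrap":               # cv2.BORDER_WRAP (3)
            out.append(row[i % n])
        elif mode == "reflect_101":        # cv2.BORDER_REFLECT_101 (4): edge value not repeated
            period = 2 * n - 2             # requires at least two elements
            j = i % period
            out.append(row[j if j < n else period - j])
        else:
            raise ValueError(f"Unknown mode: {mode}")
    return out


row = [1, 2, 3, 4]
for mode in ("constant", "replicate", "reflect", "wrap", "reflect_101"):
    print(f"{mode:12s}", pad_row(row, 2, mode))
```

For the row [1, 2, 3, 4] padded by two on each side, replicate gives [1, 1, 1, 2, 3, 4, 4, 4], while reflect_101 gives [3, 2, 1, 2, 3, 4, 3, 2].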

Notes

Either min_height or pad_height_divisor must be set, but not both.
Either min_width or pad_width_divisor must be set, but not both.
If border_mode is set to cv2.BORDER_CONSTANT, fill must be provided.
The transform pads all targets consistently (image, mask, bboxes, keypoints, volume, mask3d).
For bounding boxes, the coordinates will be adjusted to account for the padding.
For keypoints, their positions will be shifted according to the padding.
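
The coordinate adjustments mentioned in the notes above amount to adding the left and top padding to each coordinate. The sketch below shows this for absolute pixel coordinates, using 'pascal_voc' boxes (x_min, y_min, x_max, y_max) and 'xy' keypoints. `shift_bbox` and `shift_keypoint` are hypothetical helpers, and the offsets assume centered padding of a 100x100 image to 150x200 as in Example 1.

```python
def shift_bbox(bbox, pad_left, pad_top):
    """Shift a pascal_voc box (x_min, y_min, x_max, y_max) by the padding offsets."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min + pad_left, y_min + pad_top, x_max + pad_left, y_max + pad_top)


def shift_keypoint(keypoint, pad_left, pad_top):
    """Shift an (x, y) keypoint by the padding offsets."""
    x, y = keypoint
    return (x + pad_left, y + pad_top)


# Assumed offsets for centered padding: 100x100 -> 150 (height) x 200 (width).
pad_top = (150 - 100) // 2    # 25 rows added above the image
pad_left = (200 - 100) // 2   # 50 columns added to the left of the image

print(shift_bbox((10, 10, 50, 50), pad_left, pad_top))
print(shift_keypoint((20, 30), pad_left, pad_top))
```

Under these assumptions the box (10, 10, 50, 50) moves to (60, 35, 100, 75) and the keypoint (20, 30) moves to (70, 55). Box width and height are unchanged because padding only translates the content.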