PadIfNeeded

Targets:
image
mask
bboxes
keypoints
volume
mask3d
Image Types: uint8, float32

Pads the sides of an image if its dimensions are smaller than the specified minimum dimensions. Alternatively, if pad_height_divisor or pad_width_divisor is specified in place of the corresponding minimum, the image is padded so that this dimension is divisible by the given value.
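
The sizing rule described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation; `padded_size` is a hypothetical helper introduced only for this example.

```python
def padded_size(size, min_size=None, divisor=None):
    """Return the length of one image side after padding.

    Hypothetical helper for illustration; not part of Albumentations.
    """
    if divisor is not None:
        # Divisor rule: round up to the next multiple of `divisor`.
        remainder = size % divisor
        return size if remainder == 0 else size + divisor - remainder
    if min_size is None:
        raise ValueError("either min_size or divisor must be given")
    # Minimum-size rule: only grow, never shrink.
    return max(size, min_size)


print(padded_size(100, min_size=150))  # 150
print(padded_size(300, min_size=150))  # 300 (already large enough, no padding)
print(padded_size(100, divisor=32))    # 128
print(padded_size(128, divisor=32))    # 128 (already divisible, no padding)
```

An image that already satisfies the rule is returned unchanged, which is why the transform is called "pad if needed".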

Arguments
min_height
int | None
1024

Minimum desired height of the image. Ensures image height is at least this value. If not specified, pad_height_divisor must be provided.

min_width
int | None
1024

Minimum desired width of the image. Ensures image width is at least this value. If not specified, pad_width_divisor must be provided.

pad_height_divisor
int | None

If set, pads the image height to make it divisible by this value. If not specified, min_height must be provided.

pad_width_divisor
int | None

If set, pads the image width to make it divisible by this value. If not specified, min_width must be provided.
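
The four size arguments pair up per axis: each axis takes either a minimum or a divisor. A minimal sketch of that check follows, assuming the rule stated in the argument descriptions; `check_axis_args` is a hypothetical name and the error message is illustrative, not the library's.

```python
def check_axis_args(min_size, divisor, axis):
    """Raise ValueError unless exactly one of the two arguments is set.

    Hypothetical helper for illustration; not part of Albumentations.
    """
    if (min_size is None) == (divisor is None):
        raise ValueError(
            f"Set exactly one of min_{axis} and pad_{axis}_divisor"
        )


# Valid: a minimum for the height, a divisor for the width.
check_axis_args(min_size=150, divisor=None, axis="height")
check_axis_args(min_size=None, divisor=32, axis="width")

# Invalid: the default min_height (1024) is still set alongside a divisor.
try:
    check_axis_args(min_size=1024, divisor=32, axis="height")
except ValueError as err:
    print(err)
```

Because min_height and min_width default to 1024, a divisor-only configuration needs the corresponding minimum passed explicitly as None.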

position
center | top_left | top_right | bottom_left | bottom_right | random
center

Position where the image is to be placed after padding. Default is 'center'.
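
To make the position options concrete, the sketch below splits a total amount of padding between the four sides. It is an illustration under assumptions, not the library's code: `split_padding` is a hypothetical helper, and which side receives the extra pixel when the total is odd in 'center' mode is assumed here.

```python
import random


def split_padding(pad_h, pad_w, position="center", rng=random):
    """Return (top, bottom, left, right) padding for a given position.

    Hypothetical helper for illustration; not part of Albumentations.
    """
    if position == "center":
        # Assumption: an odd leftover pixel goes to the bottom/right side.
        top, left = pad_h // 2, pad_w // 2
    elif position == "top_left":
        top, left = 0, 0          # image stays in the top-left corner
    elif position == "top_right":
        top, left = 0, pad_w      # all horizontal padding on the left
    elif position == "bottom_left":
        top, left = pad_h, 0      # all vertical padding on top
    elif position == "bottom_right":
        top, left = pad_h, pad_w
    elif position == "random":
        top, left = rng.randint(0, pad_h), rng.randint(0, pad_w)
    else:
        raise ValueError(f"unknown position: {position}")
    return top, pad_h - top, left, pad_w - left


# A 100x100 image padded to 150x200 needs 50 rows and 100 columns in total.
print(split_padding(50, 100, "center"))        # (25, 25, 50, 50)
print(split_padding(50, 100, "top_left"))      # (0, 50, 0, 100)
print(split_padding(50, 100, "bottom_right"))  # (50, 0, 100, 0)
```

Whatever the position, the top and bottom values sum to the total vertical padding and the left and right values to the total horizontal padding, so the output size does not depend on position.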

border_mode
0 | 1 | 2 | 3 | 4
0

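Specifies the border mode to use if padding is required, given as an OpenCV border flag: 0 = cv2.BORDER_CONSTANT, 1 = cv2.BORDER_REPLICATE, 2 = cv2.BORDER_REFLECT, 3 = cv2.BORDER_WRAP, 4 = cv2.BORDER_REFLECT_101. The default is cv2.BORDER_CONSTANT.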
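
The difference between the border modes is easiest to see on a single row of pixels. The sketch below reimplements the five OpenCV extrapolation rules in plain Python on a string; `source_index` and `pad_row` are hypothetical helpers written for this illustration, and real padding is done by OpenCV on arrays.

```python
# Integer values of the OpenCV border flags accepted by border_mode.
BORDER_MODES = {0: "constant", 1: "replicate", 2: "reflect", 3: "wrap", 4: "reflect_101"}


def source_index(i, n, mode):
    """Map index i (possibly outside [0, n)) to a source index, or None for constant fill."""
    if 0 <= i < n:
        return i
    if mode == "constant":
        return None
    if mode == "replicate":
        return min(max(i, 0), n - 1)      # clamp to the nearest edge pixel
    if mode == "wrap":
        return i % n                      # continue from the opposite side
    if mode == "reflect":
        j = i % (2 * n)                   # mirror, edge pixel repeated
        return j if j < n else 2 * n - 1 - j
    if mode == "reflect_101":
        if n == 1:
            return 0
        period = 2 * n - 2                # mirror, edge pixel not repeated
        j = i % period
        return j if j < n else period - j
    raise ValueError(f"unknown mode: {mode}")


def pad_row(row, left, right, border_mode=0, fill="_"):
    """Pad a string on both sides using the given border flag."""
    mode = BORDER_MODES[border_mode]
    out = []
    for i in range(-left, len(row) + right):
        src = source_index(i, len(row), mode)
        out.append(fill if src is None else row[src])
    return "".join(out)


for code in sorted(BORDER_MODES):
    print(code, BORDER_MODES[code], pad_row("abcde", 2, 2, border_mode=code))
# 0 constant __abcde__
# 1 replicate aaabcdeee
# 2 reflect baabcdeed
# 3 wrap deabcdeab
# 4 reflect_101 cbabcdedc
```

Only the constant mode uses the fill value; the other four derive the border from existing pixels.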

fill
tuple[float, ...] | float
0

Value to fill the border pixels if the border mode is cv2.BORDER_CONSTANT. Default is 0.

fill_mask
tuple[float, ...] | float
0

Similar to fill but used for padding masks. Default is 0.

p
float
1

Probability of applying the transform. Default is 1.0.

Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Example 1: Basic usage with min_height and min_width
>>> transform = A.Compose([
...     A.PadIfNeeded(min_height=150, min_width=200, border_mode=cv2.BORDER_CONSTANT, fill=0),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> padded = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the padded data
>>> padded_image = padded['image']  # Shape will be (150, 200, 3)
>>> padded_mask = padded['mask']    # Shape will be (150, 200)
>>> padded_bboxes = padded['bboxes']  # Bounding boxes adjusted for the padded image
>>> padded_bbox_labels = padded['bbox_labels']  # Labels remain unchanged
>>> padded_keypoints = padded['keypoints']  # Keypoints adjusted for the padded image
>>> padded_keypoint_labels = padded['keypoint_labels']  # Labels remain unchanged
>>>
>>> # Example 2: Using pad_height_divisor and pad_width_divisor
>>> # This ensures the output dimensions are divisible by the specified values
>>> transform_divisor = A.Compose([
...     A.PadIfNeeded(
...         min_height=None,  # Unset the default minimum so only the divisor applies
...         min_width=None,
...         pad_height_divisor=32,
...         pad_width_divisor=32,
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0
...     ),
... ])
>>>
>>> padded_divisor = transform_divisor(image=image)
>>> padded_divisor_image = padded_divisor['image']  # Shape will be (128, 128, 3) - divisible by 32
>>>
>>> # Example 3: Different position options
>>> # Create a small recognizable image for better visualization of positioning
>>> small_image = np.zeros((50, 50, 3), dtype=np.uint8)
>>> small_image[20:30, 20:30, :] = 255  # White square in the middle
>>>
>>> # Top-left positioning
>>> top_left_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         position="top_left",
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=128  # Gray padding
...     ),
... ])
>>> top_left_result = top_left_pad(image=small_image)
>>> top_left_image = top_left_result['image']  # Image will be at top-left of 100x100 canvas
>>>
>>> # Center positioning (default)
>>> center_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         position="center",
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=128
...     ),
... ])
>>> center_result = center_pad(image=small_image)
>>> center_image = center_result['image']  # Image will be centered in 100x100 canvas
>>>
>>> # Example 4: Different border_mode options
>>> # Reflection padding
>>> reflect_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         border_mode=cv2.BORDER_REFLECT_101
...     ),
... ])
>>> reflected = reflect_pad(image=small_image)
>>> reflected_image = reflected['image']  # Will use reflection for padding
>>>
>>> # Replication padding
>>> replicate_pad = A.Compose([
...     A.PadIfNeeded(
...         min_height=100,
...         min_width=100,
...         border_mode=cv2.BORDER_REPLICATE
...     ),
... ])
>>> replicated = replicate_pad(image=small_image)
>>> replicated_image = replicated['image']  # Will use edge replication for padding
>>>
>>> # Example 5: Working with masks and custom fill values
>>> # The mask must have the same height and width as the image (100x100 here)
>>> binary_mask = np.zeros((100, 100), dtype=np.uint8)
>>> binary_mask[25:75, 25:75] = 1  # Set center region to 1
>>>
>>> mask_transform = A.Compose([
...     A.PadIfNeeded(
...         min_height=150,
...         min_width=150,
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0,          # Black padding for image
...         fill_mask=0      # Use 0 for mask padding (background)
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']))
>>>
>>> padded_mask_result = mask_transform(
...     image=image,
...     mask=binary_mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels
... )
>>> padded_binary_mask = padded_mask_result['mask']  # Shape will be (150, 150)
>>> padded_result_bboxes = padded_mask_result['bboxes']  # Adjusted for padding
>>> padded_result_bbox_labels = padded_mask_result['bbox_labels']  # Labels remain unchanged

Notes
  • Either min_height or pad_height_divisor must be set, but not both.
  • Either min_width or pad_width_divisor must be set, but not both.
  • If border_mode is set to cv2.BORDER_CONSTANT, the padded pixels are set to fill (fill_mask for masks).
  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume, mask3d).
  • For bounding boxes, the coordinates will be adjusted to account for the padding.
  • For keypoints, their positions will be shifted according to the padding.
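
The adjustment of bounding boxes and keypoints amounts to a translation by the left and top padding. The sketch below shows this for pixel coordinates ('pascal_voc' boxes and 'xy' keypoints), assuming constant-border padding; `shift_bbox` and `shift_keypoint` are hypothetical helpers, not library functions.

```python
def shift_bbox(bbox, pad_left, pad_top):
    """Translate an (x_min, y_min, x_max, y_max) box by the left/top padding."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min + pad_left, y_min + pad_top, x_max + pad_left, y_max + pad_top)


def shift_keypoint(keypoint, pad_left, pad_top):
    """Translate an (x, y) keypoint by the left/top padding."""
    x, y = keypoint
    return (x + pad_left, y + pad_top)


# A 100x100 image centered on a 150x200 canvas: 25 rows on top, 50 columns on the left.
pad_top, pad_left = 25, 50
print(shift_bbox((10, 10, 50, 50), pad_left, pad_top))  # (60, 35, 100, 75)
print(shift_keypoint((20, 30), pad_left, pad_top))      # (70, 55)
```

Padding added on the right or bottom does not move existing annotations, since the origin stays at the top-left corner of the output.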