CropAndPad

Targets: image, mask, bboxes, keypoints, volume, mask3d

Image Types: uint8, float32

Crop and pad images by pixel amounts or fractions of image sizes.

This transform allows for simultaneous cropping and padding of images. Cropping removes pixels from the sides (i.e., extracts a subimage), while padding adds pixels to the sides (e.g., black pixels). The amount of cropping/padding can be specified either in absolute pixels or as a fraction of the image size.
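To make the two operations concrete, here is a minimal NumPy sketch that crops by slicing and pads with np.pad. It is not the library's implementation, and the per-side amounts are illustrative values chosen for this example.

```python
import numpy as np

image = np.arange(36, dtype=np.uint8).reshape(6, 6)

# Cropping: drop 1 row from the top and 2 columns from the left.
cropped = image[1:, 2:]

# Padding: add 2 rows at the bottom and 1 column on the right, filled with 0.
padded = np.pad(cropped, ((0, 2), (0, 1)), mode="constant", constant_values=0)

print(cropped.shape)  # (5, 4)
print(padded.shape)   # (7, 5)
```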

Arguments

px

The number of pixels to crop (negative values) or pad (positive values) on each side of the image. Either this or the parameter percent may be set, not both at the same time.

If int: crop/pad all sides by this value.
If tuple of 2 ints: crop/pad by (top/bottom, left/right).
If tuple of 4 ints: crop/pad by (top, right, bottom, left).
Each int can also be a tuple of 2 ints for a range. Default: None.
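The accepted forms listed above can be pictured as all expanding to one (top, right, bottom, left) tuple. The sketch below uses a hypothetical helper, expand_sides, that follows only the list above; it is not part of the library and it does not handle the range form.

```python
def expand_sides(spec):
    """Hypothetical helper: expand a px/percent spec to (top, right, bottom, left)."""
    if isinstance(spec, (int, float)):
        # Single value: the same amount on every side.
        return (spec, spec, spec, spec)
    if len(spec) == 2:
        # Two values: (top/bottom, left/right).
        vertical, horizontal = spec
        return (vertical, horizontal, vertical, horizontal)
    if len(spec) == 4:
        # Four values: already (top, right, bottom, left).
        return tuple(spec)
    raise ValueError("expected a scalar or a sequence of length 2 or 4")

print(expand_sides(-10))                 # (-10, -10, -10, -10)
print(expand_sides((5, -20)))            # (5, -20, 5, -20)
print(expand_sides((-10, 20, 30, -40)))  # (-10, 20, 30, -40)
```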

percent

The fraction of the image size to crop (negative values) or pad (positive values) on each side. Either this or the parameter px may be set, not both at the same time.

If float: crop/pad all sides by this fraction.
If tuple of 2 floats: crop/pad by (top/bottom, left/right) fractions.
If tuple of 4 floats: crop/pad by (top, right, bottom, left) fractions.
Each float can also be a tuple of 2 floats for a range. Default: None.
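As a rough illustration of how fractions relate to pixels, the sketch below converts per-side fractions using a hypothetical helper, percent_to_px. It assumes top/bottom fractions are taken relative to the image height and left/right fractions relative to the width, and that results are truncated toward zero; the library's exact rounding may differ.

```python
def percent_to_px(percent_sides, height, width):
    """Hypothetical helper: convert (top, right, bottom, left) fractions to pixels."""
    top, right, bottom, left = percent_sides
    return (
        int(top * height),    # vertical sides scale with the height (assumption)
        int(right * width),   # horizontal sides scale with the width (assumption)
        int(bottom * height),
        int(left * width),
    )

# Pad top/bottom by 10% and crop left/right by 25% of a 100x200 (HxW) image.
print(percent_to_px((0.1, -0.25, 0.1, -0.25), height=100, width=200))
# (10, -50, 10, -50)
```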

border_mode

0 | 1 | 2 | 3 | 4

OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

fill

tuple[float, ...] | float

The constant value to use for padding if border_mode is cv2.BORDER_CONSTANT. Default: 0.
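The integers 0 to 4 are the values of the OpenCV border-mode constants. The sketch below lists them and reproduces each border pattern on a 1-D row with np.pad, so it runs without OpenCV. The np.pad mode names are NumPy equivalents chosen for illustration, not what the transform calls internally.

```python
import numpy as np

# Integer values of the OpenCV border-mode constants.
BORDER_MODES = {
    0: "cv2.BORDER_CONSTANT",
    1: "cv2.BORDER_REPLICATE",
    2: "cv2.BORDER_REFLECT",
    3: "cv2.BORDER_WRAP",
    4: "cv2.BORDER_REFLECT_101",
}

row = np.array([1, 2, 3, 4])

# np.pad modes that give the same border pattern, padding 2 values per side.
constant = np.pad(row, 2, mode="constant", constant_values=9)  # like fill=9
replicate = np.pad(row, 2, mode="edge")
reflect = np.pad(row, 2, mode="symmetric")    # like BORDER_REFLECT
wrap = np.pad(row, 2, mode="wrap")
reflect_101 = np.pad(row, 2, mode="reflect")  # like BORDER_REFLECT_101

print(constant.tolist())     # [9, 9, 1, 2, 3, 4, 9, 9]
print(replicate.tolist())    # [1, 1, 1, 2, 3, 4, 4, 4]
print(reflect.tolist())      # [2, 1, 1, 2, 3, 4, 4, 3]
print(wrap.tolist())         # [3, 4, 1, 2, 3, 4, 1, 2]
print(reflect_101.tolist())  # [3, 2, 1, 2, 3, 4, 3, 2]
```

The fill value only matters in the constant case; the other modes derive border pixels from the image itself.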

fill_mask

tuple[float, ...] | float

Same as fill but used for mask padding. Default: 0.

keep_size

bool

true

If True, the output image will be resized to the input image size after cropping/padding. Default: True.
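The effect of keep_size on the output shape can be sketched with a hypothetical helper, output_shape. The clamp to a minimum of 1 mirrors the note that the transform never crops below a height or width of 1; how the library enforces that limit internally is an assumption here.

```python
def output_shape(height, width, sides_px, keep_size):
    """Hypothetical helper: predict (height, width) after crop/pad.

    sides_px is (top, right, bottom, left); negative crops, positive pads.
    """
    if keep_size:
        # The result is resized back to the input size.
        return (height, width)
    top, right, bottom, left = sides_px
    # Simplified clamp so neither dimension drops below 1 (assumption).
    return (max(height + top + bottom, 1), max(width + left + right, 1))

print(output_shape(100, 100, (-10, 20, 30, -40), keep_size=False))  # (120, 80)
print(output_shape(100, 100, (-10, 20, 30, -40), keep_size=True))   # (100, 100)
```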

sample_independently

bool

true

If True and ranges are used for px/percent, sample a value for each side independently. If False, sample one value and use it for all sides. Default: True.
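The difference between the two sampling modes can be sketched as follows. sample_sides is a hypothetical helper built on Python's random module; the library's own random number handling is not shown here, and using the first range for the shared value is an assumption.

```python
import random

def sample_sides(ranges, sample_independently, rng):
    """Hypothetical helper: draw (top, right, bottom, left) from per-side ranges."""
    if sample_independently:
        # One draw per side, each from its own range.
        return tuple(rng.uniform(low, high) for low, high in ranges)
    # One shared draw, reused for every side (first range used; assumption).
    low, high = ranges[0]
    value = rng.uniform(low, high)
    return (value, value, value, value)

rng = random.Random(0)
ranges = [(0.05, 0.15)] * 4

independent = sample_sides(ranges, True, rng)
shared = sample_sides(ranges, False, rng)

print(len(set(shared)))  # 1: all four sides share the same value
```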

interpolation

0 | 6 | 1 | 2 | 3 | 4 | 5

OpenCV interpolation flag used for resizing the image if keep_size is True. Default: cv2.INTER_LINEAR.

mask_interpolation

0 | 6 | 1 | 2 | 3 | 4 | 5

OpenCV interpolation flag used for resizing the mask if keep_size is True. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
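The integers in the type line are the values of the OpenCV interpolation constants. The sketch below lists them and shows, on a tiny 1-D example in plain NumPy, why nearest-neighbour is the usual choice for label masks: linear interpolation can produce a class ID that was never in the mask. The arrays are illustrative and this is not the resize code the transform uses.

```python
import numpy as np

# Integer values of the OpenCV interpolation constants.
INTERPOLATION_FLAGS = {
    0: "cv2.INTER_NEAREST",
    1: "cv2.INTER_LINEAR",
    2: "cv2.INTER_CUBIC",
    3: "cv2.INTER_AREA",
    4: "cv2.INTER_LANCZOS4",
    5: "cv2.INTER_LINEAR_EXACT",
    6: "cv2.INTER_NEAREST_EXACT",
}

labels = np.array([0.0, 2.0])        # two class IDs; class 1 is absent
src_pos = np.array([0.0, 1.0])       # positions of the source pixels
dst_pos = np.array([0.0, 0.5, 1.0])  # positions after upsampling to 3 pixels

# Linear interpolation blends neighbouring labels.
linear = np.interp(dst_pos, src_pos, labels)

# Nearest-neighbour picks an existing label for every output pixel.
nearest = labels[np.floor(dst_pos + 0.5).astype(int).clip(0, 1)]

print(linear.tolist())   # [0.0, 1.0, 2.0]  (1.0 is not a real class here)
print(nearest.tolist())  # [0.0, 2.0, 2.0]
```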

p

float

Probability of applying the transform. Default: 1.0.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Example 1: Using px parameter with specific values for each side
>>> # Crop 10px from top, pad 20px on right, pad 30px on bottom, crop 40px from left
>>> transform_px = A.Compose([
...     A.CropAndPad(
...         px=(-10, 20, 30, -40),  # (top, right, bottom, left)
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=128,  # Gray padding color
...         fill_mask=0,
...         keep_size=False,  # Don't resize back to original dimensions
...         p=1.0
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> result_px = transform_px(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data with px parameters
>>> transformed_image_px = result_px['image']  # Height: 100 - 10 + 30 = 120, width: 100 + 20 - 40 = 80
>>> transformed_mask_px = result_px['mask']
>>> transformed_bboxes_px = result_px['bboxes']  # Adjusted to new dimensions
>>> transformed_bbox_labels_px = result_px['bbox_labels']  # Bounding box labels after crop
>>> transformed_keypoints_px = result_px['keypoints']  # Adjusted to new dimensions
>>> transformed_keypoint_labels_px = result_px['keypoint_labels']  # Keypoint labels after crop
>>>
>>> # Example 2: Using percent parameter as a single value
>>> # This will pad all sides by 10% of image dimensions
>>> transform_percent = A.Compose([
...     A.CropAndPad(
...         percent=0.1,  # Pad all sides by 10%
...         border_mode=cv2.BORDER_REFLECT,  # Use reflection padding
...         keep_size=True,  # Resize back to original dimensions
...         p=1.0
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> result_percent = transform_percent(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data with percent parameters
>>> # Since keep_size=True, image dimensions remain the same (100x100)
>>> transformed_image_pct = result_percent['image']
>>> transformed_mask_pct = result_percent['mask']
>>> transformed_bboxes_pct = result_percent['bboxes']
>>> transformed_bbox_labels_pct = result_percent['bbox_labels']
>>> transformed_keypoints_pct = result_percent['keypoints']
>>> transformed_keypoint_labels_pct = result_percent['keypoint_labels']
>>>
>>> # Example 3: Random padding within a range
>>> # Pad top and bottom by 5-15%, left and right by 10-20%
>>> transform_random = A.Compose([
...     A.CropAndPad(
...         percent=[(0.05, 0.15), (0.1, 0.2), (0.05, 0.15), (0.1, 0.2)],  # (top, right, bottom, left)
...         sample_independently=True,  # Sample each side independently
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0,  # Black padding
...         keep_size=False,
...         p=1.0
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
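>>> # Apply the transform
>>> result_random = transform_random(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Result dimensions will vary based on the random padding values chosen
>>> transformed_image_random = result_random['image']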

Notes

This transform will never crop images below a height or width of 1.
When using pixel values (px), the image will be cropped/padded by exactly that many pixels.
When using percentages (percent), the amount of crop/pad will be calculated based on the image size.
Bounding boxes that end up fully outside the image after cropping will be removed.
Keypoints that end up outside the image after cropping will be removed.
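How keypoints move and get dropped can be sketched with a hypothetical helper, shift_keypoints. It assumes keep_size=False (no final resize) and that a point is kept only if it lies inside the new image bounds; the library's exact boundary rules may differ.

```python
def shift_keypoints(keypoints, sides_px, height, width):
    """Hypothetical helper: shift (x, y) points by the left/top change and
    drop those outside the new image. Assumes keep_size=False."""
    top, right, bottom, left = sides_px
    new_h, new_w = height + top + bottom, width + left + right
    # Cropping on the left/top moves points toward the origin; padding moves them away.
    shifted = [(x + left, y + top) for x, y in keypoints]
    return [(x, y) for x, y in shifted if 0 <= x < new_w and 0 <= y < new_h]

# Same values as Example 1: crop 10 px top, pad 20 px right, pad 30 px bottom, crop 40 px left.
print(shift_keypoints([(20, 30), (60, 70)], (-10, 20, 30, -40), 100, 100))
# [(20, 60)]
```

In this sketch the first point lands at x = -20, left of the cropped image, so only the second one survives.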