← Back to all transforms

CropAndPad

Description

Crop and pad images by pixel amounts or fractions of image sizes.

    This transform allows for simultaneous cropping and padding of images. Cropping removes pixels from the sides
    (i.e., extracts a subimage), while padding adds pixels to the sides (e.g., black pixels). The amount of
    cropping/padding can be specified either in absolute pixels or as a fraction of the image size.

    Args:
        px (int, tuple of int, tuple of tuples of int, or None):
            The number of pixels to crop (negative values) or pad (positive values) on each side of the image.
            Either this or the parameter `percent` may be set, not both at the same time.
            - If int: crop/pad all sides by this value.
            - If tuple of 2 ints: crop/pad by (top/bottom, left/right).
            - If tuple of 4 ints: crop/pad by (top, right, bottom, left).
            - Each int can also be a tuple of 2 ints for a range, or a list of ints for discrete choices.
            Default: None.

        percent (float, tuple of float, tuple of tuples of float, or None):
            The fraction of the image size to crop (negative values) or pad (positive values) on each side.
            Either this or the parameter `px` may be set, not both at the same time.
            - If float: crop/pad all sides by this fraction.
            - If tuple of 2 floats: crop/pad by (top/bottom, left/right) fractions.
            - If tuple of 4 floats: crop/pad by (top, right, bottom, left) fractions.
            - Each float can also be a tuple of 2 floats for a range, or a list of floats for discrete choices.
            Default: None.

        pad_mode (int):
            OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

        pad_cval (number, tuple of number, or list of number):
            The constant value to use for padding if pad_mode is cv2.BORDER_CONSTANT.
            - If number: use this value for all channels.
            - If tuple of 2 numbers: use uniform random value between these numbers.
            - If list of numbers: use random choice from this list.
            Default: 0.

        pad_cval_mask (number, tuple of number, or list of number):
            Same as pad_cval but used for mask padding. Default: 0.

        keep_size (bool):
            If True, the output image will be resized to the input image size after cropping/padding.
            Default: True.

        sample_independently (bool):
            If True and ranges are used for px/percent, sample a value for each side independently.
            If False, sample one value and use it for all sides. Default: True.

        interpolation (int):
            OpenCV interpolation flag used for resizing if keep_size is True.
            Default: cv2.INTER_LINEAR.

        mask_interpolation (int):
            OpenCV interpolation flag used for resizing if keep_size is True.
            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_NEAREST.

        p (float):
            Probability of applying the transform. Default: 1.0.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    Note:
        - This transform will never crop images below a height or width of 1.
        - When using pixel values (px), the image will be cropped/padded by exactly that many pixels.
        - When using percentages (percent), the amount of crop/pad will be calculated based on the image size.
        - Bounding boxes that end up fully outside the image after cropping will be removed.
        - Keypoints that end up outside the image after cropping will be removed.

    Example:
        >>> import albumentations as A
        >>> transform = A.Compose([
        ...     A.CropAndPad(px=(-10, 20, 30, -40), pad_mode=cv2.BORDER_REFLECT, p=1.0),
        ... ])
        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
        >>> transformed_image = transformed['image']
        >>> transformed_mask = transformed['mask']
        >>> transformed_bboxes = transformed['bboxes']
        >>> transformed_keypoints = transformed['keypoints']
    

Parameters

  • px: int | tuple[int, int] | tuple[int, int, int, int] | tuple[int | tuple[int, int] | list[int], int | tuple[int, int] | list[int], int | tuple[int, int] | list[int], int | tuple[int, int] | list[int]] | None (default: null)
  • percent: float | tuple[float, float] | tuple[float, float, float, float] | tuple[float | tuple[float, float] | list[float], float | tuple[float, float] | list[float], float | tuple[float, float] | list[float], float | tuple[float, float] | list[float]] | None (default: null)
  • pad_mode: Literal['cv2.BORDER_CONSTANT', 'cv2.BORDER_REPLICATE', 'cv2.BORDER_REFLECT', 'cv2.BORDER_WRAP', 'cv2.BORDER_DEFAULT', 'cv2.BORDER_TRANSPARENT'] (default: 0)
  • pad_cval: int | float | tuple[int | float, int | float] | list[int | float] (default: 0)
  • pad_cval_mask: int | float | tuple[int | float, int | float] | list[int | float] (default: 0)
  • keep_size: bool (default: true)
  • sample_independently: bool (default: true)
  • interpolation: Literal['cv2.INTER_NEAREST', 'cv2.INTER_LINEAR', 'cv2.INTER_CUBIC', 'cv2.INTER_AREA', 'cv2.INTER_LANCZOS4', 'cv2.INTER_BITS', 'cv2.INTER_NEAREST_EXACT', 'cv2.INTER_MAX'] (default: 1)
  • mask_interpolation: Literal['cv2.INTER_NEAREST', 'cv2.INTER_LINEAR', 'cv2.INTER_CUBIC', 'cv2.INTER_AREA', 'cv2.INTER_LANCZOS4', 'cv2.INTER_BITS', 'cv2.INTER_NEAREST_EXACT', 'cv2.INTER_MAX'] (default: 0)
  • p: float (default: 1)

Targets

  • Image
  • Mask
  • BBoxes
  • Keypoints

Try it out

Original Image:

Original Image: (733, 484, 3)

Original Image

Bbox Params

Keypoint Params

Mask: (733, 484, 3)

Mask

BBoxes: (733, 484, 3)

BBoxes

Keypoints: (733, 484, 3)

Keypoints