Crop a random portion of the input and resize it to a given size (torchvision-style RandomResizedCrop). The scale and ratio parameters control the crop's relative area and aspect ratio.
This transform first crops a random portion of the input image (or mask, bounding boxes, keypoints) and then resizes the crop to a specified size. It's particularly useful for training neural networks on images of varying sizes and aspect ratios.
size: Target size of the output image as (height, width) after crop and resize.
scale: Range of the crop area relative to the input area. For example, (0.08, 1.0) means the crop will cover between 8% and 100% of the input area. Default: (0.08, 1.0)
ratio: Range of aspect ratios of the random crop. For example, (0.75, 1.3333) allows crop aspect ratios from 3:4 to 4:3. Default: (0.75, 1.3333333333333333)
interpolation: Flag specifying the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR
mask_interpolation: Flag specifying the interpolation algorithm for the mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST
area_for_downscale: Controls automatic use of cv2.INTER_AREA interpolation when downscaling. Options: None (never substitute INTER_AREA), "image" (use INTER_AREA when downscaling the image), "image_mask" (use INTER_AREA when downscaling both image and mask). Default: None
p: Probability of applying the transform. Default: 1.0
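To make the interplay of scale and ratio concrete, here is a minimal sketch of the torchvision-style sampling procedure: draw a target area from the scale range, an aspect ratio log-uniformly from the ratio range, and retry until the resulting crop fits inside the input. This is a simplified illustration, not albumentations' actual implementation, and `sample_crop_params` is a hypothetical helper name.

```python
import math
import random

def sample_crop_params(height, width, scale=(0.08, 1.0), ratio=(0.75, 4 / 3),
                       max_attempts=10):
    """Sketch of torchvision-style RandomResizedCrop sampling.

    Returns (top, left, crop_h, crop_w) for a crop that is later
    resized to the target output size.
    """
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(max_attempts):
        # Crop area is a fraction of the input area, drawn from `scale`
        target_area = area * random.uniform(*scale)
        # Aspect ratio is drawn log-uniformly from `ratio`
        aspect = math.exp(random.uniform(*log_ratio))  # width / height
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            top = random.randint(0, height - h)
            left = random.randint(0, width - w)
            return top, left, h, w
    # Fallback: largest crop at the closest valid aspect ratio, centered
    in_ratio = width / height
    if in_ratio < ratio[0]:
        w, h = width, int(round(width / ratio[0]))
    elif in_ratio > ratio[1]:
        h, w = height, int(round(height * ratio[1]))
    else:
        w, h = width, height
    return (height - h) // 2, (width - w) // 2, h, w
```

The log-uniform draw makes an aspect ratio of 3:4 as likely as 4:3, which a plain uniform draw over (0.75, 1.3333) would not.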
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Define transform with parameters as tuples
>>> transform = A.Compose([
... A.RandomResizedCrop(
... size=(64, 64),
... scale=(0.5, 0.9), # Crop area will be 50-90% of the original image
... ratio=(0.75, 1.33), # Aspect ratio will vary from 3:4 to 4:3
... interpolation=cv2.INTER_LINEAR,
... mask_interpolation=cv2.INTER_NEAREST,
... area_for_downscale="image", # Use INTER_AREA for image downscaling
... p=1.0
... ),
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
... keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> transformed = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data
>>> transformed_image = transformed['image'] # Shape: (64, 64, 3)
>>> transformed_mask = transformed['mask'] # Shape: (64, 64)
>>> transformed_bboxes = transformed['bboxes'] # Bounding boxes adjusted to new crop and size
>>> transformed_bbox_labels = transformed['bbox_labels'] # Labels for the preserved bboxes
>>> transformed_keypoints = transformed['keypoints'] # Keypoints adjusted to new crop and size
>>> transformed_keypoint_labels = transformed['keypoint_labels'] # Labels for the preserved keypoints
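The adjustment of bounding boxes under crop-and-resize can be sketched as follows: shift boxes into crop coordinates, clip them to the crop window, drop boxes with no remaining area, and scale the survivors to the output size. This is an illustrative sketch in pascal_voc (x_min, y_min, x_max, y_max) coordinates, not albumentations' internal code, and `remap_bboxes` is a hypothetical helper name.

```python
import numpy as np

def remap_bboxes(bboxes, top, left, crop_h, crop_w, out_h, out_w):
    """Remap pascal_voc bboxes under a crop at (top, left) of size
    (crop_h, crop_w), followed by a resize to (out_h, out_w).

    Returns the remapped boxes and a boolean mask of which input
    boxes survived the crop.
    """
    b = np.asarray(bboxes, dtype=np.float32).copy()
    # Shift into crop coordinates
    b[:, [0, 2]] -= left
    b[:, [1, 3]] -= top
    # Clip to the crop window
    b[:, [0, 2]] = b[:, [0, 2]].clip(0, crop_w)
    b[:, [1, 3]] = b[:, [1, 3]].clip(0, crop_h)
    # Drop boxes that were cropped away entirely
    keep = (b[:, 2] > b[:, 0]) & (b[:, 3] > b[:, 1])
    b = b[keep]
    # Scale to the output size
    b[:, [0, 2]] *= out_w / crop_w
    b[:, [1, 3]] *= out_h / crop_h
    return b, keep
```

The same shift-and-scale applies to keypoints, except that a keypoint outside the crop window is dropped rather than clipped. The `keep` mask is what keeps `bbox_labels` aligned with the surviving boxes.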