RandomResizedCrop
Crop a random part of the input and rescale it to a specified size.
This transform first crops a random portion of the input image (or mask, bounding boxes, keypoints) and then resizes the crop to a specified size. It's particularly useful for training neural networks on images of varying sizes and aspect ratios.
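To make the crop-then-resize behaviour concrete for coordinate targets, here is a minimal sketch of how keypoints could be remapped. `remap_keypoints` is a hypothetical helper written for illustration, not a library function. It assumes the crop box is given as (x_min, y_min, x_max, y_max) in pixels and that a point lying exactly on the right or bottom edge counts as outside.

```python
def remap_keypoints(keypoints, crop_box, target_hw):
    """Hypothetical helper: map (x, y) keypoints through a crop followed by a resize."""
    x_min, y_min, x_max, y_max = crop_box
    target_h, target_w = target_hw
    # Resizing stretches the crop to the target size independently along each axis.
    scale_x = target_w / (x_max - x_min)
    scale_y = target_h / (y_max - y_min)
    kept = []
    for x, y in keypoints:
        # Assumption: points outside the half-open crop box are dropped.
        if x_min <= x < x_max and y_min <= y < y_max:
            kept.append(((x - x_min) * scale_x, (y - y_min) * scale_y))
    return kept


# A 32x32 crop resized to 64x64 doubles the coordinates measured from the crop origin.
print(remap_keypoints([(20, 30), (60, 70)], (10, 10, 42, 42), (64, 64)))
# prints [(20.0, 40.0)]
```

The second keypoint falls outside the crop box and is dropped, which is why label fields have to be filtered along with the coordinates.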
size: Target size for the output image, i.e. (height, width) after crop and resize.
scale: Range of the random size of the crop relative to the input size. For example, (0.08, 1.0) means the crop size will be between 8% and 100% of the input size. Default: (0.08, 1.0)
ratio: Range of aspect ratios of the random crop. For example, (0.75, 1.3333) allows crop aspect ratios from 3:4 to 4:3. Default: (0.75, 1.3333333333333333)
interpolation: Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR
mask_interpolation: Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST
area_for_downscale: Controls automatic use of INTER_AREA interpolation for downscaling. Default: None. Options:
- None: No automatic interpolation selection, always use the specified interpolation method
- "image": Use INTER_AREA when downscaling images, retain specified interpolation for upscaling and masks
- "image_mask": Use INTER_AREA when downscaling both images and masks
p: Probability of applying the transform. Default: 1.0
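The three area_for_downscale options can be summarised as a small decision function. This is a sketch, not the library's code: `pick_interpolation` is a hypothetical name, the OpenCV flag values are written out as integers so the snippet runs without cv2, and "downscaling" is assumed to mean that the crop exceeds the target size in at least one dimension.

```python
# Integer values of the OpenCV interpolation flags (cv2.INTER_NEAREST, ...),
# spelled out so the sketch does not need cv2 installed.
INTER_NEAREST, INTER_LINEAR, INTER_CUBIC, INTER_AREA, INTER_LANCZOS4 = 0, 1, 2, 3, 4


def pick_interpolation(crop_hw, target_hw, requested, area_for_downscale, is_mask=False):
    """Hypothetical helper: choose the resize flag for an image or a mask."""
    if area_for_downscale not in (None, "image", "image_mask"):
        raise ValueError(f"unsupported area_for_downscale: {area_for_downscale!r}")
    crop_h, crop_w = crop_hw
    target_h, target_w = target_hw
    # Assumption: a resize counts as downscaling if either dimension shrinks.
    is_downscale = crop_h > target_h or crop_w > target_w
    if area_for_downscale is None or not is_downscale:
        return requested
    if area_for_downscale == "image" and is_mask:
        # "image" leaves masks on their own interpolation setting.
        return requested
    return INTER_AREA


# An 80x80 crop resized to 64x64 is a downscale: the image switches to INTER_AREA,
# while the mask keeps INTER_NEAREST under the "image" option.
image_flag = pick_interpolation((80, 80), (64, 64), INTER_LINEAR, "image")
mask_flag = pick_interpolation((80, 80), (64, 64), INTER_NEAREST, "image", is_mask=True)
```

Keeping masks on nearest-neighbour interpolation by default avoids blending label values, which is presumably why "image_mask" is a separate opt-in.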
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Define transform with parameters as tuples
>>> transform = A.Compose([
...     A.RandomResizedCrop(
...         size=(64, 64),
...         scale=(0.5, 0.9),  # Crop size will be 50-90% of original image
...         ratio=(0.75, 1.33),  # Aspect ratio will vary from 3:4 to 4:3
...         interpolation=cv2.INTER_LINEAR,
...         mask_interpolation=cv2.INTER_NEAREST,
...         area_for_downscale="image",  # Use INTER_AREA for image downscaling
...         p=1.0
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> transformed = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data
>>> transformed_image = transformed['image'] # Shape: (64, 64, 3)
>>> transformed_mask = transformed['mask'] # Shape: (64, 64)
>>> transformed_bboxes = transformed['bboxes'] # Bounding boxes adjusted to new crop and size
>>> transformed_bbox_labels = transformed['bbox_labels'] # Labels for the preserved bboxes
>>> transformed_keypoints = transformed['keypoints'] # Keypoints adjusted to new crop and size
>>> transformed_keypoint_labels = transformed['keypoint_labels'] # Labels for the preserved keypoints

- This transform attempts to crop a random area with an aspect ratio and relative size specified by the 'ratio' and 'scale' parameters. If it fails to find a suitable crop after 10 attempts, it will return a crop from the center of the image.
- The crop's aspect ratio is defined as width / height.
- Bounding boxes that end up fully outside the cropped area will be removed.
- Keypoints that end up outside the cropped area will be removed.
- After cropping, the result is resized to the specified size.
- When area_for_downscale is set, INTER_AREA interpolation is used automatically for downscaling (when the crop is larger than the target size). INTER_AREA averages over source pixels, which generally reduces aliasing when shrinking an image.
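The sampling behaviour described in the notes above (up to 10 attempts, then a center crop) can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's implementation: `sample_crop` is a hypothetical name, 'scale' is treated as a fraction of the input area, the aspect ratio is drawn log-uniformly as in the common torchvision-style scheme, and the fallback clamps the center crop to the allowed ratio range.

```python
import math
import random


def sample_crop(height, width, scale=(0.08, 1.0), ratio=(0.75, 4 / 3), attempts=10, rng=None):
    """Hypothetical helper: return (x_min, y_min, x_max, y_max) of a crop inside the image."""
    rng = rng or random.Random()
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(attempts):
        # Assumption: scale is a fraction of the input area.
        target_area = rng.uniform(*scale) * area
        # Aspect ratio is width / height, drawn log-uniformly from the ratio range.
        aspect = math.exp(rng.uniform(*log_ratio))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            x = rng.randint(0, width - crop_w)
            y = rng.randint(0, height - crop_h)
            return x, y, x + crop_w, y + crop_h
    # Fallback after all attempts fail: a center crop, clamped to the ratio range.
    in_ratio = width / height
    if in_ratio < ratio[0]:
        crop_w, crop_h = width, int(round(width / ratio[0]))
    elif in_ratio > ratio[1]:
        crop_w, crop_h = int(round(height * ratio[1])), height
    else:
        crop_w, crop_h = width, height
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, x + crop_w, y + crop_h


# Forcing the fallback on a very wide image gives a centered 4:3 crop.
print(sample_crop(100, 400, attempts=0))  # prints (133, 0, 266, 100)
```

The fallback matters mostly for extreme inputs, such as very elongated images combined with a narrow ratio range, where a random draw rarely fits inside the image.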