RandomSizedCrop
Crop a random part of the input and rescale it to a specific size.
This transform first crops a random region of the input and then resizes it to a fixed output size. The crop height is sampled from the 'min_max_height' range, and the crop width follows from the 'w2h_ratio' parameter.
min_max_height: Minimum and maximum height of the crop in pixels.
size: Target size for the output image, i.e. (height, width) after crop and resize.
w2h_ratio: Aspect ratio (width/height) of the crop. Default: 1.0
interpolation: Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
mask_interpolation: Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
area_for_downscale: Controls automatic use of INTER_AREA interpolation for downscaling. Options:
- None: No automatic interpolation selection, always use the specified interpolation method
- "image": Use INTER_AREA when downscaling images, retain specified interpolation for upscaling and masks
- "image_mask": Use INTER_AREA when downscaling both images and masks
Default: None.
p: Probability of applying the transform. Default: 1.0
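The effect of 'area_for_downscale' on the interpolation choice can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the helper name `pick_interpolation` is hypothetical, interpolation flags are represented as strings instead of cv2 constants so the snippet has no dependencies, and it assumes that "downscaling" means the crop is larger than the target size in at least one dimension.

```python
from typing import Optional, Tuple


def pick_interpolation(
    crop_hw: Tuple[int, int],
    target_hw: Tuple[int, int],
    interpolation: str = "INTER_LINEAR",
    mask_interpolation: str = "INTER_NEAREST",
    area_for_downscale: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (image_interpolation, mask_interpolation) for one resize.

    Hypothetical helper: flag names are plain strings standing in for
    the corresponding cv2 constants.
    """
    crop_h, crop_w = crop_hw
    target_h, target_w = target_hw

    # Assumption: the resize counts as downscaling when the crop is
    # larger than the target in at least one dimension.
    downscaling = crop_h > target_h or crop_w > target_w

    image_interp = interpolation
    mask_interp = mask_interpolation

    if downscaling and area_for_downscale in ("image", "image_mask"):
        image_interp = "INTER_AREA"
    if downscaling and area_for_downscale == "image_mask":
        mask_interp = "INTER_AREA"

    return image_interp, mask_interp


# An 80x80 crop resized to 64x64 is a downscale; a 50x50 crop is an upscale.
downscale_image_only = pick_interpolation((80, 80), (64, 64), area_for_downscale="image")
downscale_both = pick_interpolation((80, 80), (64, 64), area_for_downscale="image_mask")
upscale = pick_interpolation((50, 50), (64, 64), area_for_downscale="image_mask")
```

With None, the specified methods are always returned unchanged, which matches the first option in the list above.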
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Define transform with parameters as tuples
>>> transform = A.Compose([
...     A.RandomSizedCrop(
...         min_max_height=(50, 80),
...         size=(64, 64),
...         w2h_ratio=1.0,
...         interpolation=cv2.INTER_LINEAR,
...         mask_interpolation=cv2.INTER_NEAREST,
...         area_for_downscale="image",  # Use INTER_AREA for image downscaling
...         p=1.0
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> transformed = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data
>>> transformed_image = transformed['image'] # Shape: (64, 64, 3)
>>> transformed_mask = transformed['mask'] # Shape: (64, 64)
>>> transformed_bboxes = transformed['bboxes'] # Bounding boxes adjusted to new crop and size
>>> transformed_bbox_labels = transformed['bbox_labels'] # Labels for the preserved bboxes
>>> transformed_keypoints = transformed['keypoints'] # Keypoints adjusted to new crop and size
>>> transformed_keypoint_labels = transformed['keypoint_labels'] # Labels for the preserved keypoints

- The crop size is randomly selected for each execution within the range specified by 'min_max_height'.
- The aspect ratio of the crop is determined by the 'w2h_ratio' parameter.
- After cropping, the result is resized to the specified 'size'.
- Bounding boxes that end up fully outside the cropped area will be removed.
- Keypoints that end up outside the cropped area will be removed.
- This transform differs from RandomResizedCrop in that it allows more control over the crop size through the 'min_max_height' parameter, rather than using a scale parameter.
- When area_for_downscale is set, INTER_AREA interpolation will be used automatically for downscaling (when the crop is larger than the target size), which provides better quality for size reduction.
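
The geometry described in the notes above can be sketched without the library. This is a simplified illustration under stated assumptions, not the actual implementation: the helper names (`sample_crop`, `map_bboxes`, `map_keypoints`) are hypothetical, the crop width is assumed to be `int(crop_height * w2h_ratio)`, the crop position is assumed to be uniform, boxes are in pascal_voc pixel format and are clipped to the crop, and any additional filtering configured through the bbox or keypoint parameters is ignored.

```python
import random


def sample_crop(image_hw, min_max_height, w2h_ratio=1.0, rng=None):
    """Sample a crop window (x0, y0, x1, y1) in pixel coordinates."""
    rng = rng or random.Random()
    img_h, img_w = image_hw
    # Crop height is drawn from the inclusive min_max_height range.
    crop_h = rng.randint(min_max_height[0], min_max_height[1])
    # Assumption: width follows from the aspect ratio, truncated to int.
    crop_w = int(crop_h * w2h_ratio)
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("crop does not fit inside the image")
    # Assumption: the crop position is uniform over all valid offsets.
    y0 = rng.randint(0, img_h - crop_h)
    x0 = rng.randint(0, img_w - crop_w)
    return x0, y0, x0 + crop_w, y0 + crop_h


def map_bboxes(bboxes, crop, size):
    """Clip pascal_voc boxes to the crop, drop empty ones, rescale to size."""
    x0, y0, x1, y1 = crop
    out_h, out_w = size
    sx, sy = out_w / (x1 - x0), out_h / (y1 - y0)
    kept = []
    for bx0, by0, bx1, by1 in bboxes:
        cx0, cy0 = max(bx0, x0), max(by0, y0)
        cx1, cy1 = min(bx1, x1), min(by1, y1)
        if cx0 >= cx1 or cy0 >= cy1:
            continue  # box lies fully outside the crop
        kept.append(((cx0 - x0) * sx, (cy0 - y0) * sy,
                     (cx1 - x0) * sx, (cy1 - y0) * sy))
    return kept


def map_keypoints(keypoints, crop, size):
    """Drop keypoints outside the crop and rescale the rest to size."""
    x0, y0, x1, y1 = crop
    out_h, out_w = size
    sx, sy = out_w / (x1 - x0), out_h / (y1 - y0)
    return [((kx - x0) * sx, (ky - y0) * sy)
            for kx, ky in keypoints
            if x0 <= kx < x1 and y0 <= ky < y1]


# A randomly sampled window for a 100x100 image, as in the example above.
random_crop = sample_crop((100, 100), (50, 80), w2h_ratio=1.0, rng=random.Random(0))

# A fixed window makes the coordinate mapping easy to follow.
fixed_crop = (20, 20, 100, 100)  # 80x80 crop, resized to 64x64
boxes_out = map_bboxes([(10, 10, 50, 50), (40, 40, 80, 80), (0, 0, 15, 15)],
                       fixed_crop, (64, 64))
points_out = map_keypoints([(20, 30), (60, 70), (5, 5)], fixed_crop, (64, 64))
```

In the fixed-window case the third box and the third keypoint fall outside the crop and are dropped, while the remaining coordinates are shifted by the crop origin and scaled by 64/80.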