ShiftScaleRotate

Targets:

image

mask

bboxes

keypoints

volume

mask3d

Image Types: uint8, float32

Randomly apply affine transforms: translate, scale and rotate the input.
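One way to picture the combined transform is as a single 2x3 affine matrix that rotates and scales about the image center and then translates. The sketch below is illustrative only. The helper names `build_affine_matrix` and `apply_to_point` are hypothetical, the matrix layout follows the convention of OpenCV's `cv2.getRotationMatrix2D`, and treating the shift factor as a fraction of the image size is an assumption rather than a statement about the library's internals.

```python
import math


def build_affine_matrix(width, height, angle_deg, scale, dx, dy):
    """Return a 2x3 matrix (nested lists): rotate + scale about the center, then shift.

    Hypothetical helper for illustration. dx and dy are fractions of the
    image width and height (an assumption about how the shift factor is used).
    """
    cx, cy = width / 2.0, height / 2.0
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy + dx * width],
        [-beta, alpha, beta * cx + (1 - alpha) * cy + dy * height],
    ]


def apply_to_point(matrix, x, y):
    """Map a single (x, y) point through the 2x3 affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    return a * x + b * y + tx, c * x + d * y + ty


# No rotation, unit scale, no shift: the matrix reduces to the identity.
identity = build_affine_matrix(100, 100, angle_deg=0, scale=1.0, dx=0.0, dy=0.0)

# A shift factor of 0.05 on a 100-pixel-wide image moves points by 5 pixels in x.
shifted = build_affine_matrix(100, 100, angle_deg=0, scale=1.0, dx=0.05, dy=0.0)
corner_after_shift = apply_to_point(shifted, 0, 0)

# Rotation about the center leaves the center point where it was.
rotated = build_affine_matrix(100, 100, angle_deg=90, scale=1.0, dx=0.0, dy=0.0)
center_after_rotation = apply_to_point(rotated, 50, 50)
```

Expressing all three operations as one matrix means each pixel is resampled once, so interpolation error is not compounded across separate shift, scale and rotate passes.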

Arguments

shift_limit

tuple[float, float] | float

[-0.0625,0.0625]

shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: (-0.0625, 0.0625).

scale_limit

tuple[float, float] | float

[-0.1,0.1]

scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

rotate_limit

tuple[float, float] | float

[-45,45]

rotation range in degrees. If rotate_limit is a single float value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45).

interpolation

0 | 1 | 2 | 3 | 4

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode

0 | 1 | 2 | 3 | 4

flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_CONSTANT.

fill

tuple[float, ...] | float

padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask

tuple[float, ...] | float

padding value for masks if border_mode is cv2.BORDER_CONSTANT.

shift_limit_x

tuple[float, float] | float | None

shift factor range for width. If set, this value is used instead of shift_limit for shifting along the width. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

shift_limit_y

tuple[float, float] | float | None

shift factor range for height. If set, this value is used instead of shift_limit for shifting along the height. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

rotate_method

largest_box | ellipse

largest_box

rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse". Default: "largest_box".

mask_interpolation

0 | 1 | 2 | 3 | 4

flag that is used to specify the interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p

float

0.5

probability of applying the transform. Default: 0.5.
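The range rules described for shift_limit, scale_limit, rotate_limit, shift_limit_x and shift_limit_y can be summarized in a few lines. This is a minimal sketch of the documented behavior, not the library's code: `to_range` and `resolve_shift_ranges` are hypothetical helper names introduced only for illustration.

```python
def to_range(limit, bias=0.0):
    """Expand a scalar limit to (-limit, limit), then add `bias` to both ends.

    Hypothetical helper: tuples are passed through unchanged apart from the bias.
    """
    if isinstance(limit, (int, float)):
        low, high = -limit, limit
    else:
        low, high = limit
    return (low + bias, high + bias)


def resolve_shift_ranges(shift_limit, shift_limit_x=None, shift_limit_y=None):
    """Pick the per-axis shift ranges; the per-axis limits win when they are set."""
    x_limit = shift_limit_x if shift_limit_x is not None else shift_limit
    y_limit = shift_limit_y if shift_limit_y is not None else shift_limit
    return to_range(x_limit), to_range(y_limit)


# scale_limit is biased by 1, so (-0.1, 0.1) means scale factors near 0.9 to 1.1.
scale_range = to_range((-0.1, 0.1), bias=1.0)

# A single value for rotate_limit becomes a symmetric range.
rotate_range = to_range(45)

# shift_limit_x overrides shift_limit for the width only; height keeps shift_limit.
shift_x_range, shift_y_range = resolve_shift_ranges(0.0625, shift_limit_x=(0.0, 0.2))
```

Passing explicit tuples, as the example in the next section does, avoids relying on the scalar expansion and makes asymmetric ranges such as a rightward-only shift easy to express.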

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Define transform with parameters as tuples when possible
>>> transform = A.Compose([
...     A.ShiftScaleRotate(
...         shift_limit=(-0.0625, 0.0625),
...         scale_limit=(-0.1, 0.1),
...         rotate_limit=(-45, 45),
...         interpolation=cv2.INTER_LINEAR,
...         border_mode=cv2.BORDER_CONSTANT,
...         rotate_method="largest_box",
...         p=1.0
...     ),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> transformed = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data
>>> transformed_image = transformed['image']      # Shifted, scaled and rotated image
>>> transformed_mask = transformed['mask']        # Shifted, scaled and rotated mask
>>> transformed_bboxes = transformed['bboxes']    # Shifted, scaled and rotated bounding boxes
>>> transformed_bbox_labels = transformed['bbox_labels']  # Labels for transformed bboxes
>>> transformed_keypoints = transformed['keypoints']  # Shifted, scaled and rotated keypoints
>>> transformed_keypoint_labels = transformed['keypoint_labels']  # Labels for transformed keypoints
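The example above passes rotate_method="largest_box". As a rough intuition for how the two options can differ, the sketch below assumes that "largest_box" rotates the four corners of the box and "ellipse" rotates the ellipse inscribed in the box, taking the enclosing axis-aligned box in both cases. The helper `rotate_box` is hypothetical and simplified: it rotates each box about its own center rather than the image center, and it approximates the ellipse with a fixed number of sample points, so it should not be read as the library's implementation.

```python
import math


def rotate_box(box, angle_deg, method="largest_box", n_points=32):
    """Rotate an (x_min, y_min, x_max, y_max) box about its own center and
    return the axis-aligned box that encloses the rotated shape."""
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    half_w, half_h = (x_max - x_min) / 2.0, (y_max - y_min) / 2.0

    if method == "largest_box":
        # Use the four corners of the original box.
        points = [(-half_w, -half_h), (half_w, -half_h),
                  (half_w, half_h), (-half_w, half_h)]
    elif method == "ellipse":
        # Use points sampled on the ellipse inscribed in the original box.
        points = []
        for i in range(n_points):
            t = 2 * math.pi * i / n_points
            points.append((half_w * math.cos(t), half_h * math.sin(t)))
    else:
        raise ValueError(f"unknown method: {method!r}")

    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    xs = [cx + px * cos_t - py * sin_t for px, py in points]
    ys = [cy + px * sin_t + py * cos_t for px, py in points]
    return (min(xs), min(ys), max(xs), max(ys))


box = (10.0, 10.0, 50.0, 50.0)
largest = rotate_box(box, 45, method="largest_box")
ellipse = rotate_box(box, 45, method="ellipse")

largest_width = largest[2] - largest[0]
ellipse_width = ellipse[2] - ellipse[0]
```

Under these assumptions, rotating a square box by 45 degrees widens it by a factor of about 1.41 with the corner-based method, while the ellipse-based result stays at the original width. That is why an ellipse-style method tends to give tighter boxes for roughly round objects.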