Apply affine transformations: translation, rotation, scale, shear. Params: scale, translate, rotate, shear, interpolation, fill.
Affine transformations involve translation, rotation, scaling, and shearing.

All such transformations can create "new" pixels in the image without defined content, e.g.
if the image is translated to the left, pixels are created on the right.
A method has to be defined to deal with these pixel values;
the parameters fill and fill_mask of this class handle this.

Some transformations involve interpolation between several pixels
of the input image to generate output pixel values. The parameters interpolation and
mask_interpolation select the interpolation method used for this.
scale: Scaling factor, where 1.0 denotes "no change".
    * If a tuple (a, b), a value is uniformly sampled per image from [a, b]
      and used identically for the x- and y-axis.
    * If a dict with keys "x" and/or "y", each entry must itself be an (a, b)
      tuple. When keep_ratio=True, the x and y ranges must be identical.
translate_percent: Translation as a fraction of the image size, where 0 denotes
    "no change" and 0.5 denotes "half the axis size".
    * If None, equivalent to 0.0 unless translate_px is set.
    * If a tuple (a, b), the sampled value applies to both the x- and y-axis.
    * If a dict with keys "x" and/or "y", each entry must be an (a, b) tuple.
      Sampling happens independently per axis.
translate_px: Translation in pixels. Same shape rules as translate_percent.
    If None, equivalent to 0 unless translate_percent is set.
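The sampling rules above can be sketched in plain Python (illustrative only; the helper name sample_translate is hypothetical, not albumentations' internal code): a tuple is sampled once and shared by both axes, while a dict is sampled independently per axis.

```python
import random

def sample_translate(translate, rng=None):
    """Illustrative sampling for translate_percent-style parameters
    (hypothetical helper, not albumentations' internal code).

    * (a, b) tuple: one value is sampled and applied to both axes.
    * dict with "x"/"y" keys: each axis is sampled independently.
    """
    rng = rng or random.Random()
    if isinstance(translate, tuple):
        value = rng.uniform(*translate)
        return {"x": value, "y": value}
    return {axis: rng.uniform(a, b) for axis, (a, b) in translate.items()}

# A degenerate tuple yields the same offset for both axes:
print(sample_translate((0.2, 0.2)))  # {'x': 0.2, 'y': 0.2}
# A dict samples each axis from its own range:
print(sample_translate({"x": (-0.2, 0.2), "y": (-0.1, 0.1)}))
```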
rotate: Rotation in degrees (NOT radians) around the image center.
    An (a, b) tuple from which the rotation angle is uniformly sampled.
shear: Shear in degrees (NOT radians).
    * If a tuple (a, b), used as the range [a, b] for both x- and y-shear.
    * If a dict with keys "x" and/or "y", each entry must be an (a, b) tuple.
      Sampling happens independently per axis.
interpolation: OpenCV interpolation flag.
mask_interpolation: OpenCV interpolation flag.
fill: The constant value used to fill newly created pixels.
    (E.g. translating by 1 px to the right creates a new 1 px-wide column of
    pixels on the left of the image.)
    The value is only used when border_mode=cv2.BORDER_CONSTANT.
    The expected value range is [0, 255] for uint8 images.
fill_mask: Same as fill, but only for masks.
border_mode: OpenCV border flag.
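The effect of constant filling can be demonstrated with a plain NumPy shift (a sketch of the concept, not the library's implementation):

```python
import numpy as np

# Translating 1 px to the right creates a new 1 px-wide column on the
# left; with a constant border mode it is filled with the fill value.
fill = 0
img = np.arange(1, 10, dtype=np.uint8).reshape(3, 3)

shifted = np.full_like(img, fill)   # start from an all-fill canvas
shifted[:, 1:] = img[:, :-1]        # shift content 1 px to the right

print(shifted[:, 0])  # newly created column, filled: [0 0 0]
```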
fit_output: If True, the image plane size and position will be adjusted to
    tightly capture the whole image after the affine transformation
    (translate_percent and translate_px are then ignored). Otherwise (False),
    parts of the transformed image may end up outside the image plane.
    Fitting the output shape is useful to avoid corners of the image ending
    up outside the image plane after applying rotations. Default: False.
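The idea behind fit_output can be sketched as computing the tight bounding size of the rotated image plane (an illustrative formula with a hypothetical helper name, not albumentations' internal code):

```python
import math

def fit_output_size(width, height, angle_deg):
    """Tight output size after rotating a width x height image plane by
    angle_deg about its center (illustrative sketch, hypothetical helper)."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    # Project both image edges onto the output axes.
    return (round(width * cos_t + height * sin_t),
            round(width * sin_t + height * cos_t))

print(fit_output_size(100, 50, 0))    # (100, 50)
print(fit_output_size(100, 50, 90))   # (50, 100)
print(fit_output_size(100, 100, 45))  # larger plane: no corners cut off
```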
keep_ratio: When True, the original aspect ratio is kept when the random
    scale is applied. Default: True.
rotate_method: Rotation method used for the bounding boxes. Should be one of
    "largest_box" or "ellipse" [1]. Default: "largest_box".
balanced_scale: When True, scaling factors are chosen to be either entirely
    below or entirely above 1, ensuring balanced scaling. Default: False.
    This matters because without it sampling tends to lean towards upscaling.
    For example, if we want the image to zoom in and out by up to 2x, we may
    pick the interval [0.5, 2]. Since the interval [0.5, 1] is half the
    length of [1, 2], values above 1 are picked twice as often when sampled
    directly from [0.5, 2]. With balanced_scale, half the time the scaling
    factor is picked from below 1 (zooming out), and the other half from
    above 1 (zooming in), making the zooming in and out process balanced.
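The balanced sampling described above can be sketched as follows (an illustrative sketch; the helper name and exact branching are assumptions, not albumentations' implementation):

```python
import random

def sample_balanced_scale(low, high, rng=None):
    """Pick a scale from [low, high] (with low < 1 < high) such that
    zoom-out (< 1) and zoom-in (> 1) are equally likely.
    Illustrative sketch, not albumentations' internal code."""
    rng = rng or random.Random()
    if rng.random() < 0.5:
        return rng.uniform(low, 1.0)   # zoom out
    return rng.uniform(1.0, high)      # zoom in

rng = random.Random(0)
samples = [sample_balanced_scale(0.5, 2.0, rng) for _ in range(10_000)]
zoom_out_fraction = sum(s < 1.0 for s in samples) / len(samples)
# Close to 0.5; direct uniform sampling from [0.5, 2] would give ~1/3.
print(zoom_out_fraction)
```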
p: Probability of applying the transform. Default: 0.5.
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Define transform with different parameter types
>>> transform = A.Compose([
... A.Affine(
... # Tuple for scale (will be used for both x and y)
... scale=(0.8, 1.2),
... # Dictionary with tuples for different x/y translations
... translate_percent={"x": (-0.2, 0.2), "y": (-0.1, 0.1)},
... # Tuple for rotation range
... rotate=(-30, 30),
... # Dictionary with tuples for different x/y shearing
... shear={"x": (-10, 10), "y": (-5, 5)},
... # Interpolation methods
... interpolation=cv2.INTER_LINEAR,
... mask_interpolation=cv2.INTER_NEAREST,
... # Other parameters
... fit_output=False,
... keep_ratio=True,
... rotate_method="largest_box",
... balanced_scale=True,
... border_mode=cv2.BORDER_CONSTANT,
... fill=0,
... fill_mask=0,
... p=1.0
... ),
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
... keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> transformed = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data
>>> transformed_image = transformed['image'] # Image with affine transforms applied
>>> transformed_mask = transformed['mask'] # Mask with affine transforms applied
>>> transformed_bboxes = transformed['bboxes'] # Bounding boxes with affine transforms applied
>>> transformed_bbox_labels = transformed['bbox_labels'] # Labels for transformed bboxes
>>> transformed_keypoints = transformed['keypoints'] # Keypoints with affine transforms applied
>>> transformed_keypoint_labels = transformed['keypoint_labels'] # Labels for transformed keypoints
>>>
>>> # Simpler example with only essential parameters
>>> simple_transform = A.Compose([
... A.Affine(
... scale=(1.1, 1.1),
... rotate=(15, 15),
... translate_px=(30, 30),
... p=1.0
... ),
... ])
>>> simple_result = simple_transform(image=image)
>>> simple_transformed = simple_result['image']