PhotoMetricDistort

Targets:

image

Image Types: uint8, float32

Applies SSD-style photometric distortion (brightness, contrast, saturation, hue, and channel shuffle), each with probability distort_p. Intended for object detection training.

Applies brightness, contrast, saturation, and hue adjustments independently with probability distort_p each. Contrast is applied either before or after the HSV-space adjustments (randomly chosen). Optionally permutes channels with probability distort_p.

This mirrors the RandomPhotometricDistort transform from torchvision but uses our existing adjust_*_torchvision functional primitives.
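The selection logic described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name sample_distortion_plan is hypothetical, the 50/50 split for the contrast position is an assumption (the description only says "randomly chosen"), and placing brightness first and channel shuffle last is assumed to follow torchvision's ordering.

```python
import random


def sample_distortion_plan(
    brightness_range=(0.875, 1.125),
    contrast_range=(0.5, 1.5),
    saturation_range=(0.5, 1.5),
    hue_range=(-0.05, 0.05),
    distort_p=0.5,
    num_channels=3,
    rng=None,
):
    """Return an ordered list of (operation, parameter) pairs for one call.

    Hypothetical helper: it only decides what would be applied and in which
    order, it does not touch any image data.
    """
    rng = rng or random.Random()

    def maybe(name, value_range):
        # Each distortion is gated by its own independent coin flip.
        if rng.random() < distort_p:
            return [(name, rng.uniform(*value_range))]
        return []

    brightness = maybe("brightness", brightness_range)
    contrast = maybe("contrast", contrast_range)
    saturation = maybe("saturation", saturation_range)
    hue = maybe("hue", hue_range)

    # Assumption: contrast goes before or after the HSV-space steps
    # with equal probability.
    contrast_first = rng.random() < 0.5

    plan = list(brightness)
    if contrast_first:
        plan += contrast
    plan += saturation + hue
    if not contrast_first:
        plan += contrast

    # Channel permutation is gated by the same distort_p.
    if rng.random() < distort_p:
        permutation = list(range(num_channels))
        rng.shuffle(permutation)
        plan.append(("channel_shuffle", permutation))
    return plan
```

Calling sample_distortion_plan(distort_p=1.0) always yields all five steps, with contrast either right after brightness or right after hue; distort_p=0.0 yields an empty plan.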

Arguments

brightness_range

tuple[float, float]

[0.875,1.125]

Multiplicative factor range for brightness. Factor is drawn uniformly from this range. Must be non-negative. Default: (0.875, 1.125).

contrast_range

tuple[float, float]

[0.5,1.5]

Multiplicative factor range for contrast. Factor is drawn uniformly from this range. Must be non-negative. Default: (0.5, 1.5).

saturation_range

tuple[float, float]

[0.5,1.5]

Multiplicative factor range for saturation. Factor is drawn uniformly from this range. Must be non-negative. Default: (0.5, 1.5).

hue_range

tuple[float, float]

[-0.05,0.05]

Additive factor range for hue. Factor is drawn uniformly from this range. Must be in [-0.5, 0.5]. Default: (-0.05, 0.05).

distort_p

float

0.5

Probability of applying each individual distortion (brightness, contrast, saturation, hue, channel permutation). Default: 0.5.

p

float

0.5

Probability of applying the overall transform. Default: 0.5.
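To see how p and distort_p combine, the short calculation below may help. It is a sketch that assumes the overall gate p and the five per-distortion gates are independent, which is how the argument descriptions read; the helper name distortion_probabilities is hypothetical.

```python
def distortion_probabilities(p=0.5, distort_p=0.5, num_distortions=5):
    """Combine the overall gate p with the per-distortion gate distort_p.

    Assumes all gates are independent coin flips.
    """
    # Chance that one specific distortion (e.g. hue) is selected for a sample.
    per_distortion = p * distort_p
    # Chance that no distortion is selected at all: either the whole
    # transform is skipped, or it runs and every per-distortion flip fails.
    none_selected = (1.0 - p) + p * (1.0 - distort_p) ** num_distortions
    return per_distortion, none_selected


per_distortion, none_selected = distortion_probabilities()
print(per_distortion)  # 0.25
print(none_selected)   # 0.515625
```

With the defaults, roughly half of all samples pass through with no distortion selected. A selected distortion can still leave the image unchanged, for example when the sampled channel permutation happens to be the identity.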

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[20, 30]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
...     A.PhotoMetricDistort(
...         brightness_range=(0.875, 1.125),
...         contrast_range=(0.5, 1.5),
...         saturation_range=(0.5, 1.5),
...         hue_range=(-0.05, 0.05),
...         distort_p=0.5,
...         p=1.0,
...     )
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels,
... )
>>> transformed_image = result['image']

Notes

Each of the five distortions (brightness, contrast, saturation, hue, channel shuffle) is applied independently with probability distort_p.
Contrast is randomly applied either before or after saturation/hue adjustment.
For single-channel images, saturation and hue adjustments have no effect.
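The hue and saturation notes can be illustrated on a single pixel with the standard-library colorsys module. This is a minimal sketch, not the library's code path: shift_hue is a hypothetical helper, and it assumes the additive hue factor is measured in fractions of a full turn around the hue circle, as in torchvision's adjust_hue.

```python
import colorsys


def shift_hue(rgb, hue_factor):
    """Shift the hue of one RGB pixel (floats in [0, 1]) by hue_factor.

    Assumption: hue_factor is a fraction of the full hue circle, so 0.5
    moves a colour to its opposite hue.
    """
    r, g, b = rgb
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_factor) % 1.0  # hue wraps around
    return colorsys.hsv_to_rgb(h, s, v)


red_shifted = shift_hue((1.0, 0.0, 0.0), 0.5)    # pure red moves to cyan
gray_shifted = shift_hue((0.5, 0.5, 0.5), 0.25)  # gray has zero saturation
```

A gray pixel has zero saturation, so shifting its hue returns it unchanged. The same reasoning is why hue and saturation adjustments have no visible effect on single-channel input.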

References

SSD: https://arxiv.org/abs/1512.02325
torchvision RandomPhotometricDistort: https://pytorch.org/vision/stable/generated/torchvision.transforms.v2.RandomPhotometricDistort.html
