PhotoMetricDistort

Targets:
image
volume
Image Types: uint8, float32

Randomly distorts an image's photometric properties, as used in SSD object detection training.

Applies brightness, contrast, saturation, and hue adjustments, each independently with probability distort_p. Contrast is applied either before or after the saturation and hue adjustments; the position is chosen at random. The channels may also be permuted, again with probability distort_p.

This mirrors the RandomPhotometricDistort transform from torchvision but uses our existing adjust_*_torchvision functional primitives.
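
The selection and ordering logic described above can be sketched in plain Python. This is an illustration only, not the library's implementation. The helper name `sample_distortion_plan` is hypothetical. The relative order of brightness, saturation, and hue, and the placement of the channel shuffle last, are assumptions taken from torchvision's RandomPhotometricDistort.

```python
import random


def sample_distortion_plan(distort_p=0.5, rng=None):
    """Return the ordered names of the distortions selected for one image.

    Hypothetical helper: it models only which steps run and in what order,
    not the pixel operations themselves.
    """
    rng = rng or random.Random()

    # One independent draw per distortion, each succeeding with probability distort_p.
    names = ("brightness", "contrast", "saturation", "hue", "channel_shuffle")
    chosen = {name: rng.random() < distort_p for name in names}

    # A separate coin flip decides whether contrast runs before or after
    # the saturation/hue adjustments.
    contrast_first = rng.random() < 0.5

    # Assumed ordering (as in torchvision): brightness, [contrast],
    # saturation, hue, [contrast], channel shuffle.
    order = ["brightness"]
    if contrast_first:
        order.append("contrast")
    order += ["saturation", "hue"]
    if not contrast_first:
        order.append("contrast")
    order.append("channel_shuffle")

    return [name for name in order if chosen[name]]


plan = sample_distortion_plan(distort_p=0.5, rng=random.Random(0))
print(plan)
```

With distort_p=1.0 every step appears in the plan and only the position of contrast varies; with distort_p=0.0 the plan is empty and the image would pass through unchanged.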

Arguments
brightness_range
tuple[float, float]
[0.875,1.125]

Multiplicative factor range for brightness. Factor is drawn uniformly from this range. Must be non-negative. Default: (0.875, 1.125).

contrast_range
tuple[float, float]
[0.5,1.5]

Multiplicative factor range for contrast. Factor is drawn uniformly from this range. Must be non-negative. Default: (0.5, 1.5).

saturation_range
tuple[float, float]
[0.5,1.5]

Multiplicative factor range for saturation. Factor is drawn uniformly from this range. Must be non-negative. Default: (0.5, 1.5).

hue_range
tuple[float, float]
[-0.05,0.05]

Additive factor range for hue. Factor is drawn uniformly from this range. Must be in [-0.5, 0.5]. Default: (-0.05, 0.05).

distort_p
float
0.5

Probability of applying each individual distortion (brightness, contrast, saturation, hue, channel permutation). Default: 0.5.

p
float
0.5

Probability of applying the overall transform. Default: 0.5.
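
To make the difference between a multiplicative and an additive factor concrete, here is a single-pixel sketch using only the standard library. It assumes RGB values scaled to [0, 1], brightness as a clipped per-channel multiply, and hue as a wrap-around shift in HSV space where a factor of 0.5 is half the colour circle. The helper names are hypothetical, and the library's own primitives may differ in detail.

```python
import colorsys


def adjust_brightness_pixel(rgb, factor):
    """Multiply each channel by `factor` and clip to [0, 1] (multiplicative)."""
    return tuple(min(1.0, max(0.0, c * factor)) for c in rgb)


def adjust_hue_pixel(rgb, factor):
    """Shift hue by `factor` of a full turn, wrapping around (additive)."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + factor) % 1.0, s, v)


pixel = (0.5, 0.25, 0.125)

# A brightness factor of 1.0 is the identity; the upper default bound 1.125
# brightens each channel by 12.5%.
brighter = adjust_brightness_pixel(pixel, 1.125)

# A hue factor of 0.0 is the identity; a factor of 0.5 maps pure red
# to its opposite on the colour circle (cyan).
shifted = adjust_hue_pixel((1.0, 0.0, 0.0), 0.5)

print(brighter)
print(shifted)
```

This is why the brightness, contrast, and saturation ranges are centred on 1.0 while the hue range is centred on 0.0.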

Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[20, 30]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
...     A.PhotoMetricDistort(
...         brightness_range=(0.875, 1.125),
...         contrast_range=(0.5, 1.5),
...         saturation_range=(0.5, 1.5),
...         hue_range=(-0.05, 0.05),
...         distort_p=0.5,
...         p=1.0,
...     )
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels,
... )
>>> # Only the image is distorted; mask, bboxes, and keypoints are not targets of this transform.
>>> transformed_image = result['image']

Notes
  • Each of the five distortions (brightness, contrast, saturation, hue, channel shuffle) is applied independently with probability distort_p.
  • Contrast is randomly applied either before or after saturation/hue adjustment.
  • For single-channel images, saturation and hue adjustments have no effect.
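
Because the five draws are independent, the chance that none of the distortions is selected once the transform runs is (1 - distort_p) raised to the fifth power. The short calculation below works this out for the default values. It is plain arithmetic on the documented parameters and assumes nothing about the implementation beyond that independence.

```python
p = 0.5          # probability of running the transform at all
distort_p = 0.5  # probability of each individual distortion
n_distortions = 5

# Probability that none of the five distortions is selected,
# given that the transform runs.
p_none = (1 - distort_p) ** n_distortions

# Probability that a given call selects at least one distortion.
p_any = p * (1 - p_none)

# Probability that one particular distortion (for example hue)
# is selected in a given call.
p_single = p * distort_p

print(p_none, p_any, p_single)  # 0.03125 0.484375 0.25
```

With the defaults, slightly under half of all calls select at least one distortion; raise p or distort_p for stronger augmentation.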