Replace each pixel with the most frequent value (mode) in its local neighborhood, computed per channel. Useful for quantised, palette-like, or cartoon imagery.
Unlike median blur (order-statistic) or box blur (averaging), mode filtering is frequency-based: it picks the value that appears most often in the window. This preserves and expands dominant flat regions while suppressing isolated outliers, making it well-suited for palette-like, synthetic, or segmentation-style imagery.
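To make the contrast concrete, here is a minimal sketch (illustrative only, with assumed values) comparing what box, median, and mode filtering each produce for one flattened 3x3 window of a palette-like image:

```python
import numpy as np

# One 3x3 window from a palette-like image, flattened.
# Values are assumed, chosen so the three statistics diverge.
window = np.array([10, 10, 10, 10, 255, 255, 255, 200, 90], dtype=np.uint8)

mean = float(window.mean())               # box blur: ~121.7, a value not in the palette
median = int(np.median(window))           # median blur: 90, keeps the lone outlier
mode = int(np.bincount(window).argmax())  # mode filter: 10, the dominant flat-region value

print(mean, median, mode)
```

The box blur invents an in-between value, the median can still land on an isolated pixel, while the mode snaps to the most frequent palette entry.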
For float32 images the operation is performed in uint8 space (via @uint8_io); quantisation is intentional — mode is meaningless for continuous-valued signals.
Tie-breaking: when multiple values share the highest frequency, the smallest is chosen (deterministic, scipy.stats.mode default). Border pixels use reflect padding.
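The per-channel behavior described above (reflect padding, smallest-value tie-break) can be sketched in pure NumPy. This is a naive reference implementation for illustration, not the library's actual code; the helper name `mode_filter` is hypothetical:

```python
import numpy as np

def mode_filter(channel: np.ndarray, k: int = 3) -> np.ndarray:
    """Naive single-channel mode filter (illustrative sketch, O(H*W*k^2))."""
    pad = k // 2
    # Reflect padding at the borders, mirroring the documented behavior.
    padded = np.pad(channel, pad, mode="reflect")
    out = np.empty_like(channel)
    for y in range(channel.shape[0]):
        for x in range(channel.shape[1]):
            window = padded[y:y + k, x:x + k].ravel()
            # bincount + argmax picks the most frequent value; on ties,
            # argmax returns the first (smallest) index, matching the
            # deterministic smallest-value tie-break described above.
            out[y, x] = np.bincount(window, minlength=256).argmax()
    return out

# Toy uint8 image: a flat dark region with a noisy diagonal boundary.
img = np.array([[10, 10, 10, 200],
                [10, 10, 200, 200],
                [10, 200, 200, 200],
                [200, 200, 200, 200]], dtype=np.uint8)
print(mode_filter(img, 3))
```

Note the output stays in the input's value set: no new intermediate intensities are introduced, which is why the filter suits quantised and segmentation-style imagery.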
kernel_range: Range of square kernel sizes to sample from. Both bounds must be odd integers ≥ 3. Even values raise a UserWarning and are automatically bumped to the next odd number. Default: (3, 7).
p: Probability of applying the transform. Default: 0.5.
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[20, 30]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
... A.ModeFilter(kernel_range=(3, 7), p=1.0)
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
... keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels,
... )