RandomGridShuffle

Targets: image, mask, bboxes, keypoints, volume, mask3d

Image Types: uint8, float32

Randomly shuffles the cells of a grid laid over the image, effectively rearranging patches within it. The transform divides the image into a grid and then permutes the grid cells according to a random mapping. The same permutation is applied to any mask, bounding boxes, or keypoints passed along with the image.
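The divide-and-permute idea described above can be sketched in plain NumPy. This is only an illustration, not the library's implementation: `grid_shuffle` and its `rng` argument are hypothetical names, and the sketch assumes the image height and width are exact multiples of the grid, whereas the real transform may treat uneven cells differently.

```python
import numpy as np


def grid_shuffle(image, grid=(2, 2), rng=None):
    """Illustrative helper (not part of albumentations): permute grid cells."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = grid
    h, w = image.shape[:2]
    if h % rows or w % cols:
        raise ValueError("this sketch needs dimensions divisible by the grid")
    ch, cw = h // rows, w // cols  # cell height and width

    # Collect the cells in row-major order.
    cells = [
        image[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
        for r in range(rows)
        for c in range(cols)
    ]

    # order[dst] = src: destination slot dst receives source cell src.
    order = rng.permutation(len(cells))
    out = np.empty_like(image)
    for dst, src in enumerate(order):
        r, c = divmod(dst, cols)
        out[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw] = cells[src]
    return out, order


# Usage: shuffle a 6x6 single-channel image on a 2x2 grid.
image = np.arange(36, dtype=np.uint8).reshape(6, 6)
shuffled, order = grid_shuffle(image, grid=(2, 2), rng=np.random.default_rng(0))
```

Returning `order` alongside the image is a design choice of the sketch: reusing that one permutation for a mask is what keeps the targets aligned.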

Arguments

grid (tuple[int, int], default: (3, 3))

Size of the grid used to split the image into cells, which are then randomly permuted. For example, (3, 3) divides the image into a 3x3 grid, giving 9 cells to shuffle.

p (float, default: 0.5)

Probability that the transform will be applied. Must be in the range [0, 1].

Examples

>>> import numpy as np
>>> import albumentations as A
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
>>> bbox_labels = [1, 2]
>>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
>>> keypoint_labels = [0, 1]
>>>
>>> # Define transform with grid as a tuple
>>> transform = A.Compose([
...     A.RandomGridShuffle(grid=(3, 3), p=1.0),
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform
>>> transformed = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed data
>>> transformed_image = transformed['image']     # Grid-shuffled image
>>> transformed_mask = transformed['mask']       # Grid-shuffled mask
>>> transformed_bboxes = transformed['bboxes']   # Grid-shuffled bounding boxes
>>> transformed_keypoints = transformed['keypoints']  # Grid-shuffled keypoints
>>>
>>> # Visualization example with a simpler grid
>>> simple_image = np.array([
...     [1, 1, 1, 2, 2, 2],
...     [1, 1, 1, 2, 2, 2],
...     [1, 1, 1, 2, 2, 2],
...     [3, 3, 3, 4, 4, 4],
...     [3, 3, 3, 4, 4, 4],
...     [3, 3, 3, 4, 4, 4]
... ], dtype=np.uint8)  # uint8 is one of the supported image types
>>> simple_transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)
>>> simple_result = simple_transform(image=simple_image)
>>> simple_transformed = simple_result['image']
>>> # The result could look like:
>>> # array([[4, 4, 4, 2, 2, 2],
>>> #        [4, 4, 4, 2, 2, 2],
>>> #        [4, 4, 4, 2, 2, 2],
>>> #        [3, 3, 3, 1, 1, 1],
>>> #        [3, 3, 3, 1, 1, 1],
>>> #        [3, 3, 3, 1, 1, 1]])

Notes

- This transform maintains consistency across all targets. If it is applied to an image and its corresponding mask or keypoints, the same shuffling is applied to all of them.
- The grid must contain at least two cells, for example (1, 2), (2, 1), or (2, 2), for the transform to have any effect.
- Keypoints are moved along with their corresponding grid cell.
- This transform can be useful when only micro features matter to the model and memorizing the global structure could be harmful. For example:
  - Identifying the type of cell phone used to take a picture from micro artifacts generated by the phone's post-processing algorithms, rather than from the semantic content of the photo. See more at https://ieeexplore.ieee.org/abstract/document/8622031
  - Identifying stress, glucose, or hydration levels from skin images.
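
The note that keypoints move along with their grid cell can be illustrated with a small sketch. It rests on several assumptions: `shuffle_keypoints` is a hypothetical helper rather than an albumentations function, the image divides evenly into cells, and `order[dst] = src` means that destination slot `dst` (row-major) receives source cell `src`. The library's own bookkeeping may differ.

```python
import numpy as np


def shuffle_keypoints(keypoints, image_shape, grid, order):
    """Illustrative helper: translate (x, y) keypoints with their grid cell."""
    rows, cols = grid
    h, w = image_shape[:2]
    ch, cw = h // rows, w // cols  # cell height and width

    # Invert the mapping so we can ask where each source cell lands.
    order = np.asarray(order)
    dst_of = np.empty(len(order), dtype=int)
    dst_of[order] = np.arange(len(order))

    moved = []
    for x, y in keypoints:
        # Cell that contains the keypoint (clamped for points on the far edge).
        src_r = min(int(y) // ch, rows - 1)
        src_c = min(int(x) // cw, cols - 1)
        dst_r, dst_c = divmod(int(dst_of[src_r * cols + src_c]), cols)
        # Shift the point by the offset between the two cell positions.
        moved.append((x + (dst_c - src_c) * cw, y + (dst_r - src_r) * ch))
    return np.array(moved, dtype=np.float32)


# Usage: a 6x6 image on a 2x2 grid where the top-left and bottom-right cells swap.
points = shuffle_keypoints([(1, 1), (4, 5)], (6, 6), (2, 2), order=[3, 1, 2, 0])
# points is [[4, 4], [1, 2]]: each keypoint keeps its offset inside its cell.
```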