ToGray

Targets:

image

Image Types: uint8, float32

Convert an image to grayscale using the selected method. Optionally replicate the grayscale channel so the output keeps a multi-channel shape. Useful for grayscale training or channel reduction.

This transform first converts a color image to a single-channel grayscale image using the method selected by the method argument, then replicates the grayscale channel if num_output_channels is greater than 1.

Arguments

num_output_channels

int

The number of channels in the output image. If greater than 1, the grayscale channel will be replicated. Default: 3.
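
The replication step described above can be sketched in plain NumPy. This is only an illustration of the idea, not the library's implementation; replicate_gray is a hypothetical helper name that does not exist in albumentations.

```python
import numpy as np

# Hypothetical helper for illustration; not part of the albumentations API.
def replicate_gray(gray: np.ndarray, num_output_channels: int) -> np.ndarray:
    """Turn an (H, W) grayscale array into (H, W, num_output_channels)."""
    if gray.ndim != 2:
        raise ValueError("expected a 2-D grayscale array")
    # Add a trailing channel axis, then copy the single channel along it.
    return np.repeat(gray[:, :, np.newaxis], num_output_channels, axis=2)

gray = np.arange(6, dtype=np.uint8).reshape(2, 3)
out = replicate_gray(gray, 3)      # shape (2, 3, 3), all channels identical
single = replicate_gray(gray, 1)   # shape (2, 3, 1)
```

Every output channel holds the same values, which is why the examples further down can assert that the channels of the result are equal to each other.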

method

weighted_average

The method used for grayscale conversion. Default: "weighted_average". Available methods:

"weighted_average": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception.
"from_lab": Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results.
"desaturation": Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well.
"average": Simple average of all channels. Works with any number of channels. Fast but may not give realistic results.
"max": Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results.
"pca": Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.

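The four arithmetic methods listed above can be written out in a few lines of NumPy. This is a rough sketch of the formulas as described, assuming RGB channel order and skipping the rounding and dtype handling the library performs; the function names are hypothetical and are not albumentations functions.

```python
import numpy as np

# Weights quoted in the "weighted_average" description, in R, G, B order.
RGB_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)

def gray_weighted_average(img: np.ndarray) -> np.ndarray:
    # Weighted sum over the channel axis; needs exactly 3 channels.
    return img.astype(np.float32) @ RGB_WEIGHTS

def gray_desaturation(img: np.ndarray) -> np.ndarray:
    # Midpoint between the largest and smallest channel value per pixel.
    f = img.astype(np.float32)
    return (f.max(axis=-1) + f.min(axis=-1)) / 2

def gray_average(img: np.ndarray) -> np.ndarray:
    # Plain mean over all channels.
    return img.astype(np.float32).mean(axis=-1)

def gray_max(img: np.ndarray) -> np.ndarray:
    # Largest channel value per pixel.
    return img.astype(np.float32).max(axis=-1)

# A single orange-ish pixel: R=200, G=100, B=30.
pixel = np.array([[[200, 100, 30]]], dtype=np.uint8)

wa = gray_weighted_average(pixel)[0, 0]  # about 121.92
de = gray_desaturation(pixel)[0, 0]      # 115.0
av = gray_average(pixel)[0, 0]           # 110.0
mx = gray_max(pixel)[0, 0]               # 200.0
```

On this one pixel the four methods already disagree by a wide margin, with "max" giving the brightest value, consistent with the descriptions above.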
p

float

0.5

Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>>
>>> # Create a sample color image with distinct RGB values
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Red square in top-left
>>> image[10:40, 10:40, 0] = 200
>>> # Green square in top-right
>>> image[10:40, 60:90, 1] = 200
>>> # Blue square in bottom-left
>>> image[60:90, 10:40, 2] = 200
>>> # Yellow square in bottom-right (Red + Green)
>>> image[60:90, 60:90, 0] = 200
>>> image[60:90, 60:90, 1] = 200
>>>
>>> # Example 1: Default conversion (weighted average, 3 channels)
>>> transform = A.ToGray(p=1.0)
>>> result = transform(image=image)
>>> gray_image = result['image']
>>> # Output has 3 duplicate channels with values based on RGB perception weights
>>> # R=0.299, G=0.587, B=0.114
>>> assert gray_image.shape == (100, 100, 3)
>>> assert np.allclose(gray_image[:, :, 0], gray_image[:, :, 1])
>>> assert np.allclose(gray_image[:, :, 1], gray_image[:, :, 2])
>>>
>>> # Example 2: Single-channel output
>>> transform = A.ToGray(num_output_channels=1, p=1.0)
>>> result = transform(image=image)
>>> gray_image = result['image']
>>> assert gray_image.shape == (100, 100, 1)
>>>
>>> # Example 3: Using different conversion methods
>>> # "desaturation" method (min+max)/2
>>> transform_desaturate = A.ToGray(
...     method="desaturation",
...     p=1.0
... )
>>> result = transform_desaturate(image=image)
>>> gray_desaturate = result['image']
>>>
>>> # "from_lab" method (using L channel from LAB colorspace)
>>> transform_lab = A.ToGray(
...     method="from_lab",
...     p=1.0
... )
>>> result = transform_lab(image=image)
>>> gray_lab = result['image']
>>>
>>> # "average" method (simple average of channels)
>>> transform_avg = A.ToGray(
...     method="average",
...     p=1.0
... )
>>> result = transform_avg(image=image)
>>> gray_avg = result['image']
>>>
>>> # "max" method (takes max value across channels)
>>> transform_max = A.ToGray(
...     method="max",
...     p=1.0
... )
>>> result = transform_max(image=image)
>>> gray_max = result['image']
>>>
>>> # Example 4: Using grayscale in an augmentation pipeline
>>> pipeline = A.Compose([
...     A.ToGray(p=0.5),           # 50% chance of grayscale conversion
...     A.RandomBrightnessContrast(p=1.0)  # Always apply brightness/contrast
... ])
>>> result = pipeline(image=image)
>>> augmented_image = result['image']  # May be grayscale or color
>>>
>>> # Example 5: Converting float32 image
>>> float_image = image.astype(np.float32) / 255.0  # Range [0, 1]
>>> transform = A.ToGray(p=1.0)
>>> result = transform(image=float_image)
>>> gray_float_image = result['image']
>>> assert gray_float_image.dtype == np.float32
>>> assert gray_float_image.max() <= 1.0

Returns

np.ndarray

Grayscale image with the specified number of channels.

Notes

The transform first converts the input image to single-channel grayscale, then replicates this channel if num_output_channels > 1.
"weighted_average" and "from_lab" are typically used in image processing and computer vision applications where an accurate representation of human perception is important.
"desaturation" and "average" are often used in simple image manipulation tools or when computational speed is a priority.
The "max" method can be useful where preserving bright features is important, such as in some medical imaging applications.
The "pca" method might be used in advanced image analysis tasks or when dealing with hyperspectral images.

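To make the "pca" idea concrete for images with more than three channels, here is a minimal NumPy sketch that projects each pixel onto the first principal component and rescales the result to [0, 1]. It is an assumption about how such a reduction can be done, not the library's actual code: gray_pca_sketch is a hypothetical name, and the min-max rescaling is a choice made for this example.

```python
import numpy as np

# Hypothetical helper for illustration; not part of the albumentations API.
def gray_pca_sketch(img: np.ndarray) -> np.ndarray:
    """Reduce an (H, W, C) image to (H, W) via its first principal component."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes of the centered pixel cloud.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[0]
    lo, hi = projected.min(), projected.max()
    if hi == lo:
        # Constant image: there is no variance to project onto.
        return np.zeros((h, w), dtype=np.float64)
    # Min-max rescale to [0, 1] (an assumption made for this sketch).
    return ((projected - lo) / (hi - lo)).reshape(h, w)

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 5))       # stand-in for a 5-band image
gray = gray_pca_sketch(cube)       # shape (8, 8), values in [0, 1]
flat = gray_pca_sketch(np.ones((4, 4, 3)))
```

The sign of a principal component is arbitrary, so a PCA-based result may come out inverted relative to perceived brightness. The decomposition over all pixels is also what makes this approach more expensive than the per-pixel arithmetic methods.
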