ToGray
Targets:
image
volume
Image Types: uint8, float32
Convert an image to grayscale and optionally replicate the grayscale channel.
This transform first converts a color image to a single-channel grayscale image using the conversion selected by method, then replicates the grayscale channel if num_output_channels is greater than 1.
Arguments
num_output_channels: int, default 3
The number of channels in the output image. If greater than 1, the grayscale channel will be replicated. Default: 3.
method: weighted_average | from_lab | desaturation | average | max | pca, default weighted_average
The method used for grayscale conversion:
- "weighted_average": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception.
- "from_lab": Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results.
- "desaturation": Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well.
- "average": Simple average of all channels. Works with any number of channels. Fast but may not give realistic results.
- "max": Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results.
- "pca": Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.
p: float, default 0.5
Probability of applying the transform. Default: 0.5.
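The per-pixel formulas listed for method can be illustrated with plain NumPy. This is only a sketch of the arithmetic as described above, not the library's implementation; the variable names are made up for illustration, and the library's rounding and dtype handling may differ.

```python
import numpy as np

# One RGB pixel per row: red, green, blue, yellow.
# The value 200 mirrors the colored squares used in the examples.
pixels = np.array(
    [[200, 0, 0], [0, 200, 0], [0, 0, 200], [200, 200, 0]],
    dtype=np.float32,
)

# "weighted_average": 0.299*R + 0.587*G + 0.114*B
weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
weighted = pixels @ weights  # approx. [59.8, 117.4, 22.8, 177.2]

# "desaturation": (max + min) / 2 across channels
desaturation = (pixels.max(axis=1) + pixels.min(axis=1)) / 2  # 100 for every row

# "average": plain mean across channels
average = pixels.mean(axis=1)  # approx. [66.67, 66.67, 66.67, 133.33]

# "max": largest channel value
maximum = pixels.max(axis=1)  # 200 for every row
```

On these four colors, "desaturation" and "max" each produce a single gray level, while "weighted_average" separates all four, which is the perceptual difference the descriptions above refer to.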
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create a sample color image with distinct RGB values
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Red square in top-left
>>> image[10:40, 10:40, 0] = 200
>>> # Green square in top-right
>>> image[10:40, 60:90, 1] = 200
>>> # Blue square in bottom-left
>>> image[60:90, 10:40, 2] = 200
>>> # Yellow square in bottom-right (Red + Green)
>>> image[60:90, 60:90, 0] = 200
>>> image[60:90, 60:90, 1] = 200
>>>
>>> # Example 1: Default conversion (weighted average, 3 channels)
>>> transform = A.ToGray(p=1.0)
>>> result = transform(image=image)
>>> gray_image = result['image']
>>> # Output has 3 duplicate channels with values based on RGB perception weights
>>> # R=0.299, G=0.587, B=0.114
>>> assert gray_image.shape == (100, 100, 3)
>>> assert np.allclose(gray_image[:, :, 0], gray_image[:, :, 1])
>>> assert np.allclose(gray_image[:, :, 1], gray_image[:, :, 2])
>>>
>>> # Example 2: Single-channel output
>>> transform = A.ToGray(num_output_channels=1, p=1.0)
>>> result = transform(image=image)
>>> gray_image = result['image']
>>> assert gray_image.shape == (100, 100, 1)
>>>
>>> # Example 3: Using different conversion methods
>>> # "desaturation" method (min+max)/2
>>> transform_desaturate = A.ToGray(
...     method="desaturation",
...     p=1.0
... )
>>> result = transform_desaturate(image=image)
>>> gray_desaturate = result['image']
>>>
>>> # "from_lab" method (using L channel from LAB colorspace)
>>> transform_lab = A.ToGray(
...     method="from_lab",
...     p=1.0
... )
>>> result = transform_lab(image=image)
>>> gray_lab = result['image']
>>>
>>> # "average" method (simple average of channels)
>>> transform_avg = A.ToGray(
...     method="average",
...     p=1.0
... )
>>> result = transform_avg(image=image)
>>> gray_avg = result['image']
>>>
>>> # "max" method (takes max value across channels)
>>> transform_max = A.ToGray(
...     method="max",
...     p=1.0
... )
>>> result = transform_max(image=image)
>>> gray_max = result['image']
>>>
>>> # Example 4: Using grayscale in an augmentation pipeline
>>> pipeline = A.Compose([
...     A.ToGray(p=0.5),  # 50% chance of grayscale conversion
...     A.RandomBrightnessContrast(p=1.0)  # Always apply brightness/contrast
... ])
>>> result = pipeline(image=image)
>>> augmented_image = result['image'] # May be grayscale or color
>>>
>>> # Example 5: Converting float32 image
>>> float_image = image.astype(np.float32) / 255.0 # Range [0, 1]
>>> transform = A.ToGray(p=1.0)
>>> result = transform(image=float_image)
>>> gray_float_image = result['image']
>>> assert gray_float_image.dtype == np.float32
>>> assert gray_float_image.max() <= 1.0
Returns
np.ndarray
Grayscale image with the specified number of channels.
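How a single grayscale channel becomes an output with the requested number of channels can be sketched with NumPy. The helper name replicate_channels is hypothetical and is not part of Albumentations; it only illustrates the replication step under the assumption that the gray image is a 2D array.

```python
import numpy as np


def replicate_channels(gray: np.ndarray, num_output_channels: int) -> np.ndarray:
    """Hypothetical helper: stack a 2D gray image into (H, W, num_output_channels)."""
    if gray.ndim != 2:
        raise ValueError("expected a 2D single-channel image")
    # Add a trailing channel axis, then repeat the values along it.
    return np.repeat(gray[..., np.newaxis], num_output_channels, axis=-1)


gray = np.arange(12, dtype=np.uint8).reshape(3, 4)
single = replicate_channels(gray, 1)  # shape (3, 4, 1)
triple = replicate_channels(gray, 3)  # shape (3, 4, 3), all channels identical
```

Replication preserves the dtype and adds no information; it only makes the result shape-compatible with code that expects a fixed channel count.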
Notes
- The transform first converts the input image to single-channel grayscale, then replicates this channel if num_output_channels > 1.
- "weighted_average" and "from_lab" are typically used in image processing and computer vision applications where accurate representation of human perception is important.
- "desaturation" and "average" are often used in simple image manipulation tools or when computational speed is a priority.
- The "max" method can be useful in scenarios where preserving bright features is important, such as in some medical imaging applications.
- "pca" might be used in advanced image analysis tasks or when dealing with hyperspectral images.
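For the hyperspectral case mentioned in the last note, the idea behind a PCA-based reduction can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's code: the function name pca_to_gray is hypothetical, the projection uses an SVD of the centered pixel matrix, and the final min-max rescaling to [0, 1] is an assumption made here so the result is displayable.

```python
import numpy as np


def pca_to_gray(image: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: project each pixel onto the first principal component."""
    height, width, channels = image.shape
    flat = image.reshape(-1, channels).astype(np.float64)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[0]
    # Assumption: rescale to [0, 1]; a constant image maps to all zeros.
    low, high = projected.min(), projected.max()
    if high > low:
        projected = (projected - low) / (high - low)
    else:
        projected = np.zeros_like(projected)
    return projected.reshape(height, width)


rng = np.random.default_rng(0)
cube = rng.random((8, 8, 6))  # synthetic 6-band image
gray_pca = pca_to_gray(cube)
```

The sign of a principal axis is arbitrary, so a projection like this may come out inverted relative to perceived brightness. The SVD over all pixels is also what makes the approach more expensive than the per-pixel methods.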