TextImage

Description

Apply text rendering transformations on images.

    This class supports rendering text directly onto images using a variety of configurations,
    such as custom fonts, font sizes, colors, and augmentation methods. The text can be placed
    inside specified bounding boxes.

    Args:
        font_path (str | Path): Path to the font file to use for rendering text.
        stopwords (list[str] | None): List of stopwords for text augmentation.
        augmentations (tuple[str | None, ...] | list[str | None]): List of text augmentations to apply.
            None: render the text unchanged.
            "insertion": insert random stop words into the text.
            "swap": swap random words in the text.
            "deletion": delete random words from the text.
        fraction_range (tuple[float, float]): Range for selecting a fraction of bounding boxes to modify.
        font_size_fraction_range (tuple[float, float]): Range for selecting the font size as a fraction of
            bounding box height.
        font_color (list[str] | str): List of possible font colors or a single font color.
        clear_bg (bool): Whether to clear the background before rendering text.
        metadata_key (str): Key to access metadata in the parameters.
        p (float): Probability of applying the transform.

    Targets:
        image

    Image types:
        uint8, float32

    Reference:
        https://github.com/danaaubakirova/doc-augmentation

    Examples:
        >>> from pathlib import Path
        >>> import albumentations as A
        >>> transform = A.Compose([
        ...     A.TextImage(
        ...         font_path=Path("/path/to/font.ttf"),
        ...         stopwords=["the", "is", "in"],
        ...         augmentations=("insertion", "deletion"),
        ...         fraction_range=(0.5, 1.0),
        ...         font_size_fraction_range=(0.5, 0.9),
        ...         font_color=["red", "green", "blue"],
        ...         metadata_key="text_metadata",
        ...         p=0.5,
        ...     )
        ... ])
        >>> transformed = transform(image=my_image, text_metadata=my_metadata)
        >>> image = transformed['image']
        >>> # This renders text on `my_image` based on the metadata provided in `my_metadata`.
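
The three augmentation modes listed under Args ("insertion", "swap", "deletion") can be illustrated with a minimal plain-Python sketch. The helper names `insert_stopwords`, `swap_words`, and `delete_words` are hypothetical and are not part of the albumentations API; the sketch only shows the idea behind each mode and may differ from how the library implements it.

```python
import random


def insert_stopwords(text, stopwords, num_insertions, rng):
    """Insert randomly chosen stopwords at random positions (hypothetical helper)."""
    words = text.split()
    for _ in range(num_insertions):
        position = rng.randint(0, len(words))  # inclusive upper bound, so appending is possible
        words.insert(position, rng.choice(stopwords))
    return " ".join(words)


def swap_words(text, num_swaps, rng):
    """Swap randomly chosen pairs of words (hypothetical helper)."""
    words = text.split()
    if len(words) < 2:
        return text  # nothing to swap
    for _ in range(num_swaps):
        i, j = rng.sample(range(len(words)), 2)  # two distinct positions
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


def delete_words(text, num_deletions, rng):
    """Delete randomly chosen words, always keeping at least one (hypothetical helper)."""
    words = text.split()
    if len(words) < 2:
        return text  # keep single-word text intact
    num_deletions = min(num_deletions, len(words) - 1)
    dropped = set(rng.sample(range(len(words)), num_deletions))
    return " ".join(w for i, w in enumerate(words) if i not in dropped)


rng = random.Random(0)  # seeded so repeated runs give the same result
print(insert_stopwords("invoice total due", ["the", "is", "in"], 2, rng))
print(swap_words("invoice total due", 1, rng))
print(delete_words("invoice total due today", 2, rng))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible, which is convenient when comparing augmented text across runs.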
    

Parameters

  • p: float (default: 0.5)
  • font_path: str (default: None)
  • stopwords: list[str] | None (default: None)
  • augmentations: tuple[str | None, ...] | list[str | None] (default: (None,))
  • fraction_range: tuple[float, float] (default: (1, 1))
  • font_size_fraction_range: tuple[float, float] (default: (0.8, 0.9))
  • font_color: list[float | Sequence[float] | str] | float | Sequence[float] | str (default: 'black')
  • clear_bg: bool (default: False)
  • metadata_key: str (default: 'textimage_metadata')
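
How `fraction_range` and `font_size_fraction_range` interact can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the helper `plan_text_rendering` is hypothetical, boxes are given as pixel tuples `(x_min, y_min, x_max, y_max)` purely for simplicity, and the rounding rules (truncating the box count and the font size) are assumptions.

```python
import random


def plan_text_rendering(bboxes, fraction_range, font_size_fraction_range, rng):
    """Pick a fraction of the boxes and a font size for each (hypothetical helper).

    bboxes: list of (x_min, y_min, x_max, y_max) in pixels (assumed format).
    Returns a list of (bbox, font_size_in_pixels) pairs.
    """
    # Sample what share of the boxes gets modified, e.g. (1, 1) means all of them.
    fraction = rng.uniform(*fraction_range)
    num_selected = int(len(bboxes) * fraction)  # truncation is an assumption
    selected = rng.sample(bboxes, num_selected)

    plan = []
    for bbox in selected:
        _, y_min, _, y_max = bbox
        box_height = y_max - y_min
        # Font size is a sampled fraction of the box height.
        size_fraction = rng.uniform(*font_size_fraction_range)
        plan.append((bbox, int(box_height * size_fraction)))
    return plan


rng = random.Random(0)
boxes = [(10, 10, 300, 60), (10, 80, 300, 130), (10, 150, 300, 250)]
for bbox, font_size in plan_text_rendering(boxes, (1.0, 1.0), (0.8, 0.9), rng):
    print(bbox, font_size)
```

With the default ranges shown above, every box is selected and each font size falls between 80% and 90% of its box height.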

Targets

  • Image
