Paste object instances onto the primary image, updating all annotations (instance masks, bboxes, keypoints). Designed for instance segmentation training.
Each donor object is tight-cropped to its mask (or bbox rect for bbox-only donors,
optionally expanded to include keypoints), shrunk to fit the target image with aspect
preserved (no upscaling), optionally jittered by scale_range, and stamped at a uniformly
random location inside the target. Existing instances that become sufficiently occluded by
pasted objects are removed from annotations.
All per-object content augmentation (rotation, flip, color jitter, scale-up beyond fit) is the user's responsibility — the transform only does crop -> shrink-fit -> optional scale jitter -> uniform random placement -> stamp.
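The geometric pipeline above (shrink-to-fit, capped jitter, uniform placement) can be sketched as follows. This is a hedged illustration of the described behavior, not the library's internals; `place_donor` and its return shape are illustrative names.

```python
import random

def place_donor(donor_hw, target_hw, scale_range=(1.0, 1.0)):
    """Return the final scale and top-left placement for one donor crop.

    Illustrative sketch of the documented behavior, not library code.
    """
    hd, wd = donor_hw
    ht, wt = target_hw
    # Shrink to fit the target with aspect preserved; never upscale.
    fit = min(1.0, ht / hd, wt / wd)
    # Jitter multiplies the fit scale and is then capped at the fit scale,
    # so the donor can shrink further but never exceed fit-to-target.
    scale = min(fit, fit * random.uniform(*scale_range))
    new_h, new_w = round(hd * scale), round(wd * scale)
    # Uniform random placement fully inside the target.
    top = random.randint(0, ht - new_h)
    left = random.randint(0, wt - new_w)
    return scale, (top, left)
```

With the default `scale_range=(1.0, 1.0)` the scale is purely shrink-to-fit; for example a 200x200 donor on a 100x100 target always lands at scale 0.5 and offset (0, 0).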
Note:
Most Copy-Paste implementations (e.g. detectron2) accept a single donor image with all
its instance masks and internally sample a random subset of instances to paste, coupling
donor selection, instance sampling, and pasting into one opaque step. This implementation
separates those concerns: donor selection and instance selection are done by the user
externally, and the transform pastes every object in the provided list. The metadata
format is list[dict] (one dict per object), consistent with Mosaic.
min_visibility_after_paste: Minimum mask area ratio (area_after / area_before) for an existing instance to survive occlusion by pasted objects. Instances whose remaining visible area falls below this threshold are removed from masks and bboxes. Default: 0.05.
blend_mode: How to blend pasted pixels. "hard" does a direct pixel copy (paper default). "gaussian" applies a gaussian blur to the alpha mask for soft edges at instance boundaries. Default: "hard".
blend_sigma_range: Sigma range for the gaussian blur when blend_mode="gaussian". Ignored when blend_mode="hard". Default: (1.0, 3.0).
scale_range: Multiplicative scale jitter applied on top of the shrink-to-fit scale. Sampled uniformly from this range and capped at the fit scale, so the result can shrink the donor further but never exceed fit-to-target. Default: (1.0, 1.0) (pure shrink-to-fit, no jitter).
min_paste_area: Minimum scaled paste footprint area in pixels. Donors whose final scaled H*W falls below this value are silently dropped — useful to avoid pasting tiny blob noise from huge donors onto small targets. Default: 1.
metadata_key: Key in the Compose call data dict containing the list of object dictionaries to paste. Default: "copy_paste_metadata".
p: Probability of applying the transform. Default: 0.5.
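The min_visibility_after_paste rule can be illustrated with a small sketch. This is an assumption about the documented semantics (`filter_occluded` is a hypothetical helper, not library API): an instance survives when its uncovered area divided by its original area stays at or above the threshold.

```python
import numpy as np

def filter_occluded(inst_masks, paste_mask, min_visibility=0.05):
    """Indices of existing instances still visible enough after pasting.

    Illustrative sketch of the documented threshold, not library code.
    """
    keep = []
    paste = paste_mask.astype(bool)
    for i, m in enumerate(inst_masks):
        m = m.astype(bool)
        area_before = m.sum()
        area_after = (m & ~paste).sum()  # instance pixels not covered by the paste
        if area_before > 0 and area_after / area_before >= min_visibility:
            keep.append(i)
    return keep
```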
>>> import numpy as np
>>> import albumentations as A
>>>
>>> # Primary data (target image is 100x100)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> instance_masks = np.zeros((1, 100, 100), dtype=np.uint8)
>>> instance_masks[0, 10:30, 10:30] = 1
>>> bboxes = np.array([[10, 10, 30, 30]], dtype=np.float32)
>>> class_labels = [1]
>>>
>>> # Donor 1: tight 40x40 mask-based donor (any donor dims work).
>>> donor1_image = np.full((40, 40, 3), 200, dtype=np.uint8)
>>> donor1_mask = np.ones((40, 40), dtype=np.uint8)
>>>
>>> # Donor 2: bbox-only donor on a 60x80 image (rectangle paste footprint).
>>> donor2_image = np.random.randint(0, 256, (60, 80, 3), dtype=np.uint8)
>>>
>>> transform = A.Compose([
... A.CopyAndPaste(
... min_visibility_after_paste=0.05,
... scale_range=(0.5, 1.0), # randomly shrink donors to 50%-100% of fit
... p=1.0,
... ),
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['class_labels']))
>>>
>>> result = transform(
... image=image,
... masks=instance_masks,
... bboxes=bboxes,
... class_labels=class_labels,
... copy_paste_metadata=[
... {
... 'image': donor1_image,
... 'mask': donor1_mask,
... 'bbox_labels': {'class_labels': 2},
... },
... {
... 'image': donor2_image,
... 'bbox': [10, 5, 70, 55], # pascal_voc on 60x80 donor dims
... 'bbox_labels': {'class_labels': 3},
... },
... ],
... )
>>> result_image = result['image']
>>> result_masks = result['masks'] # (N_surviving + K, H, W)
>>> result_bboxes = result['bboxes'] # Updated bboxes (in pascal_voc, target dims)
>>> result_labels = result['class_labels']  # Updated labels