Paste object instances onto the primary image, updating all annotations (instance masks, bboxes, keypoints). Designed for instance segmentation training.
Each donor object is tight-cropped to its mask (or bbox rect for bbox-only donors,
optionally expanded to include keypoints), shrunk to fit the target image with aspect
preserved (no upscaling), optionally jittered by scale_range, and stamped at a uniformly
random location inside the target. Existing instances that become sufficiently occluded by
pasted objects are removed from annotations.
All per-object content augmentation (rotation, flip, color jitter, scale-up beyond fit) is the user's responsibility — the transform only does crop -> shrink-fit -> optional scale jitter -> uniform random placement -> stamp.
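The geometric pipeline above (shrink-to-fit, capped jitter, uniform placement) can be sketched as follows. This is a hedged illustration of the described behavior, not the library's internals; `place_donor` and its return shape are illustrative names.

```python
import random

def place_donor(donor_hw, target_hw, scale_range=(1.0, 1.0)):
    """Return the final scale and top-left placement for one donor crop.

    Illustrative sketch of the documented behavior, not library code.
    """
    hd, wd = donor_hw
    ht, wt = target_hw
    # Shrink to fit the target with aspect preserved; never upscale.
    fit = min(1.0, ht / hd, wt / wd)
    # Jitter multiplies the fit scale and is then capped at the fit scale,
    # so the donor can shrink further but never exceed fit-to-target.
    scale = min(fit, fit * random.uniform(*scale_range))
    new_h, new_w = round(hd * scale), round(wd * scale)
    # Uniform random placement fully inside the target.
    top = random.randint(0, ht - new_h)
    left = random.randint(0, wt - new_w)
    return scale, (top, left)
```

With the default `scale_range=(1.0, 1.0)` the scale is purely shrink-to-fit; for example a 200x200 donor on a 100x100 target always lands at scale 0.5 and offset (0, 0).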
Note:
Most Copy-Paste implementations (e.g. detectron2) accept a single donor image with all
its instance masks and internally sample a random subset of instances to paste, coupling
donor selection, instance sampling, and pasting into one opaque step. This implementation
separates those concerns: donor selection and instance selection are done by the user
externally, and the transform pastes every object in the provided list. The metadata
format is list[dict] (one dict per object), consistent with Mosaic.
min_visibility_after_paste: Minimum mask area ratio (area_after / area_before) for an existing instance to survive occlusion by pasted objects. Instances whose remaining visible area falls below this threshold are removed from masks and bboxes. Default: 0.05.
blend_mode: How to blend pasted pixels. "hard" does a direct pixel copy (paper default). "gaussian" applies a gaussian blur to the alpha mask for soft edges at instance boundaries. Default: "hard".
blend_sigma_range: Sigma range for the gaussian blur when blend_mode="gaussian". Ignored when blend_mode="hard". Default: (1.0, 3.0).
scale_range: Multiplicative scale jitter applied on top of the shrink-to-fit scale. Sampled uniformly from this range and capped at the fit scale, so the result can shrink the donor further but never exceed fit-to-target. Default: (1.0, 1.0) (pure shrink-to-fit, no jitter).
min_paste_area: Minimum scaled paste footprint area in pixels. Donors whose final scaled H*W falls below this value are silently dropped — useful to avoid pasting tiny blob noise from huge donors onto small targets. Default: 1.
metadata_key: Key in the Compose call data dict containing the list of object dictionaries to paste. Default: "copy_paste_metadata".
p: Probability of applying the transform. Default: 0.5.
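The min_visibility_after_paste rule can be illustrated with a small sketch. This is an assumption about the documented semantics (`filter_occluded` is a hypothetical helper, not library API): an instance survives when its uncovered area divided by its original area stays at or above the threshold.

```python
import numpy as np

def filter_occluded(inst_masks, paste_mask, min_visibility=0.05):
    """Indices of existing instances still visible enough after pasting.

    Illustrative sketch of the documented threshold, not library code.
    """
    keep = []
    paste = paste_mask.astype(bool)
    for i, m in enumerate(inst_masks):
        m = m.astype(bool)
        area_before = m.sum()
        area_after = (m & ~paste).sum()  # instance pixels not covered by the paste
        if area_before > 0 and area_after / area_before >= min_visibility:
            keep.append(i)
    return keep
```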
>>> import numpy as np
>>> import albumentations as A
>>>
>>> # Primary data (target image is 100x100)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> instance_masks = np.zeros((1, 100, 100), dtype=np.uint8)
>>> instance_masks[0, 10:30, 10:30] = 1
>>> bboxes = np.array([[10, 10, 30, 30]], dtype=np.float32)
>>> class_labels = [1]
>>>
>>> # Donor 1: tight 40x40 mask-based donor (any donor dims work).
>>> donor1_image = np.full((40, 40, 3), 200, dtype=np.uint8)
>>> donor1_mask = np.ones((40, 40), dtype=np.uint8)
>>>
>>> # Donor 2: bbox-only donor on a 60x80 image (rectangle paste footprint).
>>> donor2_image = np.random.randint(0, 256, (60, 80, 3), dtype=np.uint8)
>>>
>>> transform = A.Compose([
... A.CopyAndPaste(
... min_visibility_after_paste=0.05,
... scale_range=(0.5, 1.0), # randomly shrink donors to 50%-100% of fit
... p=1.0,
... ),
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['class_labels']))
>>>
>>> result = transform(
... image=image,
... masks=instance_masks,
... bboxes=bboxes,
... class_labels=class_labels,
... copy_paste_metadata=[
... {
... 'image': donor1_image,
... 'mask': donor1_mask,
... 'bbox_labels': {'class_labels': 2},
... },
... {
... 'image': donor2_image,
... 'bbox': [10, 5, 70, 55], # pascal_voc on 60x80 donor dims
... 'bbox_labels': {'class_labels': 3},
... },
... ],
... )
>>> result_image = result['image']
>>> result_masks = result['masks'] # (N_surviving + K, H, W)
>>> result_bboxes = result['bboxes'] # Updated bboxes (in pascal_voc, target dims)
>>> result_labels = result['class_labels']  # Updated labels