TimeMasking

Targets:

image

mask

bboxes

keypoints

Image Types: uint8, float32

Masks a spectrogram in the time domain, SpecAugment-style. time_mask_param sets the maximum mask length. Applies a single mask along the horizontal (time) axis; use XYMasking for more flexibility.

This transform masks random segments along the time axis of a spectrogram, implementing the time masking technique proposed in the SpecAugment paper. Time masking helps in training models to be robust against temporal variations and missing information in audio signals.
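
The idea can be illustrated with a small NumPy sketch. This is not the library's implementation: the function name time_mask_sketch, the (frequency, time) array layout, the zero fill value, and the inclusive treatment of the sampling endpoints are all assumptions made for illustration.

```python
import numpy as np


def time_mask_sketch(spec, time_mask_param, rng=None, fill=0.0):
    """Fill one random span of time steps (columns) in a (freq, time) array.

    Hypothetical helper for illustration only; not part of any library.
    Returns the masked copy plus the sampled start index and mask length.
    """
    if time_mask_param <= 0:
        raise ValueError("time_mask_param must be a positive integer")
    rng = np.random.default_rng() if rng is None else rng
    n_time = spec.shape[1]

    # Mask length: uniform integer in [0, time_mask_param] (endpoints are an
    # assumption), capped so it never exceeds the number of time steps.
    length = int(rng.integers(0, min(time_mask_param, n_time) + 1))

    # Start index chosen so the whole mask fits inside the spectrogram.
    start = int(rng.integers(0, n_time - length + 1))

    out = spec.copy()                      # leave the input untouched
    out[:, start:start + length] = fill    # mask every frequency bin in the span
    return out, start, length


# Example: 80 frequency bins x 200 time steps, at most 40 masked steps.
spec = np.ones((80, 200), dtype=np.float32)
masked, start, length = time_mask_sketch(
    spec, time_mask_param=40, rng=np.random.default_rng(0)
)
```

The mask covers all frequency bins for the chosen time steps, which is what distinguishes time masking from frequency masking, where the roles of the two axes are swapped.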

This is a specialized version of XYMasking configured for time masking only. For more advanced use cases (e.g., multiple masks, frequency masking, or custom fill values), consider using XYMasking directly.

Arguments

time_mask_param

int

Maximum possible length of the mask in the time domain. Must be a positive integer. Length of the mask is uniformly sampled from (0, time_mask_param).

p

float

0.5

Probability of applying the transform. Default: 0.5.

References

SpecAugment paper: https://arxiv.org/abs/1904.08779
Original implementation: https://pytorch.org/audio/stable/transforms.html#timemask