FrequencyMasking

Targets:
image
mask
bboxes
keypoints
volume
mask3d
Image Types: uint8, float32

Apply masking to a spectrogram in the frequency domain.

This transform masks random segments along the frequency axis of a spectrogram, implementing the frequency masking technique proposed in the SpecAugment paper. Frequency masking helps train models that are robust to frequency variations and to missing spectral information in audio signals.

This is a specialized version of XYMasking configured for frequency masking only. For more advanced use cases (e.g., multiple masks, time masking, or custom fill values), consider using XYMasking directly.
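To make the effect of a frequency mask concrete, here is a minimal plain-Python sketch of the idea, not the library's implementation. It assumes the spectrogram is stored as a list of rows with one row per frequency bin, and that masked bins are filled with 0; the helper name `mask_frequency_band` and its parameters are hypothetical.

```python
def mask_frequency_band(spectrogram, start, length, fill=0.0):
    """Return a copy of `spectrogram` with rows [start, start + length) set to `fill`.

    Assumption: each row holds one frequency bin across all time frames.
    """
    num_bins = len(spectrogram)
    if start < 0 or length < 0 or start + length > num_bins:
        raise ValueError("mask must lie inside the frequency axis")

    # Copy every row so the caller's spectrogram is left untouched.
    masked = [list(row) for row in spectrogram]
    for f in range(start, start + length):
        masked[f] = [fill] * len(masked[f])
    return masked


# 4 frequency bins x 3 time frames
spec = [
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0],
    [4.0, 4.0, 4.0],
]
masked = mask_frequency_band(spec, start=1, length=2)
# Rows 1 and 2 are now filled with 0.0; rows 0 and 3 are unchanged.
```

Every time frame loses the same band of frequencies, which is what distinguishes frequency masking from time masking, where a span of columns would be blanked instead.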

Arguments
freq_mask_param
int
30

Maximum possible length of the mask in the frequency domain. Must be a positive integer. The mask length is sampled uniformly from (0, freq_mask_param).

p
float
0.5

Probability of applying the transform. Default: 0.5.
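How the two arguments interact can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the helper `sample_frequency_mask` is hypothetical, the length range is treated as inclusive at both ends, the length is clamped to the number of frequency bins, and the start position is drawn uniformly from the positions where the mask still fits.

```python
import random


def sample_frequency_mask(num_bins, freq_mask_param=30, p=0.5, rng=random):
    """Return (start, length) for one frequency mask, or None if the transform is skipped.

    Hypothetical helper; endpoint handling and start sampling are assumptions.
    """
    if freq_mask_param <= 0:
        raise ValueError("freq_mask_param must be a positive integer")

    # With probability 1 - p the input is returned unchanged.
    if rng.random() >= p:
        return None

    # The mask cannot be longer than the frequency axis itself.
    max_length = min(freq_mask_param, num_bins)
    length = rng.randint(0, max_length)        # uniform over the allowed lengths
    start = rng.randint(0, num_bins - length)  # keep the mask inside the axis
    return start, length


rng = random.Random(0)  # seeded generator for reproducible draws
draws = [sample_frequency_mask(num_bins=128, freq_mask_param=30, p=0.5, rng=rng)
         for _ in range(1000)]
applied = [d for d in draws if d is not None]
# Roughly half of the draws produce a mask, and no mask exceeds 30 bins.
```

A larger freq_mask_param removes wider bands on average, so it is usually chosen relative to the number of frequency bins in the spectrogram.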