PSD

class torchaudio.transforms.PSD(multi_mask: bool = False, normalize: bool = True, eps: float = 1e-15)

Compute cross-channel power spectral density (PSD) matrix.

Parameters:

multi_mask (bool, optional) – If True, only accepts multi-channel Time-Frequency masks. (Default: False)
normalize (bool, optional) – If True, normalize the mask along the time dimension. (Default: True)
eps (float, optional) – Value to add to the denominator in mask normalization. (Default: 1e-15)
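
To make the roles of normalize and eps concrete, here is a minimal sketch of mask normalization along the time dimension. The formula (divide by the time-sum plus eps) is an assumption inferred from the parameter descriptions above, not a quote of the library's code, and normalize_mask is a hypothetical helper name.

```python
import torch


def normalize_mask(mask: torch.Tensor, eps: float = 1e-15) -> torch.Tensor:
    """Hypothetical helper (not part of torchaudio).

    Scales a (..., freq, time) mask so that each frequency bin sums to
    roughly 1 over time. eps is added to the denominator so an all-zero
    bin does not cause a division by zero.
    """
    return mask / (mask.sum(dim=-1, keepdim=True) + eps)


mask = torch.rand(257, 100)           # (freq, time), values in [0, 1)
normalized = normalize_mask(mask)
row_sums = normalized.sum(dim=-1)     # one sum per frequency bin

# An all-zero mask stays zero instead of turning into NaN.
silent = normalize_mask(torch.zeros(1, 4))
```

With normalization, the mask acts as a set of weights over time frames within each frequency bin, so the scale of the resulting PSD does not depend on how many frames the mask selects.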

Tutorials using PSD: Speech Enhancement with MVDR Beamforming

forward(specgram: Tensor, mask: Optional[Tensor] = None)

Parameters:

specgram (torch.Tensor) – Multi-channel complex-valued spectrum. Tensor with dimensions (…, channel, freq, time).
mask (torch.Tensor or None, optional) – Time-Frequency mask for normalization. Tensor with dimensions (…, freq, time) if multi_mask is False or with dimensions (…, channel, freq, time) if multi_mask is True. (Default: None)

Returns:

The complex-valued PSD matrix of the input spectrum. Tensor with dimensions (…, freq, channel, channel).

Return type:

torch.Tensor
