PSD

class torchaudio.transforms.PSD(multi_mask: bool = False, normalize: bool = True, eps: float = 1e-15)

Compute cross-channel power spectral density (PSD) matrix.

This feature supports the following devices: CPU, CUDA.
This API supports the following properties: Autograd, TorchScript.
Parameters:
  • multi_mask (bool, optional) – If True, only accepts multi-channel Time-Frequency masks. (Default: False)

  • normalize (bool, optional) – If True, normalize the mask along the time dimension. (Default: True)

  • eps (float, optional) – Value to add to the denominator in mask normalization. (Default: 1e-15)
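The effect of normalize and eps can be illustrated in plain PyTorch. The sketch below is not the library's code: it assumes the PSD matrix is the time sum of per-frame outer products of the channel vector, and that normalization divides the mask by its sum over time plus eps. The function name masked_psd_sketch and the tensor sizes are hypothetical.

```python
import torch


def masked_psd_sketch(specgram, mask=None, normalize=True, eps=1e-15):
    """Illustrative PSD computation (an assumption, not the torchaudio source)."""
    # (..., channel, freq, time) -> (..., freq, channel, time)
    x = specgram.transpose(-3, -2)
    # Outer product of the channel vector with its conjugate for every frame:
    # result has shape (..., freq, time, channel, channel).
    outer = torch.einsum("...ct,...et->...tce", x, x.conj())
    if mask is not None:
        if normalize:
            # Scale the mask so its weights sum to (about) 1 along time;
            # eps guards against division by zero for an all-zero mask.
            mask = mask / (mask.sum(dim=-1, keepdim=True) + eps)
        # Weight each frame's outer product by its mask value.
        outer = outer * mask[..., None, None]
    # Sum over the time dimension -> (..., freq, channel, channel).
    return outer.sum(dim=-3)


torch.manual_seed(0)
channels, freqs, frames = 3, 5, 50  # hypothetical sizes
spec = torch.randn(channels, freqs, frames, dtype=torch.cfloat)

psd_plain = masked_psd_sketch(spec)
psd_ones = masked_psd_sketch(spec, torch.ones(freqs, frames))
print(psd_plain.shape)
```

Under these assumptions, an all-ones mask with normalize=True turns the time sum into a time average, so psd_ones equals psd_plain divided by the number of frames.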

Tutorials using PSD:
Speech Enhancement with MVDR Beamforming

forward(specgram: Tensor, mask: Optional[Tensor] = None)
Parameters:
  • specgram (torch.Tensor) – Multi-channel complex-valued spectrum. Tensor with dimensions (…, channel, freq, time).

  • mask (torch.Tensor or None, optional) – Time-Frequency mask for normalization. Tensor with dimensions (…, freq, time) if multi_mask is False or with dimensions (…, channel, freq, time) if multi_mask is True. (Default: None)

Returns:

The complex-valued PSD matrix of the input spectrum.

Tensor with dimensions (…, freq, channel, channel)

Return type:

torch.Tensor
