torchaudio.functional.mvdr_weights_rtf

torchaudio.functional.mvdr_weights_rtf(rtf: Tensor, psd_n: Tensor, reference_channel: Optional[Union[int, Tensor]] = None, diagonal_loading: bool = True, diag_eps: float = 1e-07, eps: float = 1e-08) → Tensor

Compute the Minimum Variance Distortionless Response (MVDR [Capon, 1969]) beamforming weights based on the relative transfer function (RTF) and power spectral density (PSD) matrix of noise.

This feature supports the following devices: CPU, CUDA

This API supports the following properties: Autograd, TorchScript

Given the relative transfer function (RTF) matrix or the steering vector of target speech $\mathbf{v}$, the PSD matrix of noise $\mathbf{\Phi}_{\mathbf{NN}}$, and a one-hot vector that represents the reference channel $\mathbf{u}$, the method computes the MVDR beamforming weight matrix $\mathbf{w}_{\text{MVDR}}$. The formula is defined as:

$$\mathbf{w}_{\text{MVDR}}(f) = \frac{\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{v}(f)}{\mathbf{v}^{\mathsf{H}}(f)\,\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{v}(f)}$$

where $(\cdot)^{\mathsf{H}}$ denotes the Hermitian conjugate (conjugate transpose) operation.
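The formula above can be sketched directly in PyTorch. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `mvdr_weights_sketch`, the tensor sizes, and the synthetic noise PSD are all hypothetical, and the sketch ignores `reference_channel` and diagonal loading. It uses a linear solve rather than an explicit matrix inverse, and adds `eps` to the denominator in the spirit of the documented `eps` parameter.

```python
import torch


def mvdr_weights_sketch(rtf, psd_n, eps=1e-8):
    """Illustrative evaluation of the MVDR formula (hypothetical helper)."""
    # Numerator: Phi_NN^{-1} v, obtained by solving Phi_NN x = v
    numerator = torch.linalg.solve(psd_n, rtf.unsqueeze(-1)).squeeze(-1)
    # Denominator: v^H Phi_NN^{-1} v, real-valued when Phi_NN is Hermitian
    denominator = (rtf.conj() * numerator).sum(dim=-1, keepdim=True)
    return numerator / (denominator.real + eps)


torch.manual_seed(0)
freq, channel = 5, 4  # arbitrary sizes for illustration

# Random complex RTF with shape (freq, channel)
rtf = torch.randn(freq, channel, dtype=torch.cdouble)

# Synthetic Hermitian positive-definite noise PSD: A A^H + I
a = torch.randn(freq, channel, channel, dtype=torch.cdouble)
psd_n = a @ a.conj().transpose(-1, -2) + torch.eye(channel, dtype=torch.cdouble)

w = mvdr_weights_sketch(rtf, psd_n)

# Distortionless constraint: w^H v should be close to 1 at every frequency
response = (w.conj() * rtf).sum(dim=-1)
```

The final check reflects the "distortionless" part of MVDR: with these weights, the response toward the target direction $\mathbf{w}^{\mathsf{H}}\mathbf{v}$ is approximately one at each frequency bin.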

Parameters:
  • rtf (torch.Tensor) – The complex-valued RTF vector of target speech. Tensor with dimensions (…, freq, channel).

  • psd_n (torch.Tensor) – The complex-valued power spectral density (PSD) matrix of noise. Tensor with dimensions (…, freq, channel, channel).

  • reference_channel (int or torch.Tensor, optional) – Specifies the reference channel. If an int is given, it is the reference channel index. If a torch.Tensor is given, its shape is (…, channel), where the channel dimension is one-hot. (Default: None)

  • diagonal_loading (bool, optional) – If True, enables applying diagonal loading to psd_n. (Default: True)

  • diag_eps (float, optional) – The coefficient by which the identity matrix is multiplied for diagonal loading. It only takes effect when diagonal_loading is set to True. (Default: 1e-7)

  • eps (float, optional) – Value to add to the denominator in the beamforming weight formula. (Default: 1e-8)

Returns:

The complex-valued MVDR beamforming weight matrix with dimensions (…, freq, channel).

Return type:

torch.Tensor
