torchaudio.functional.mvdr_weights_souden
- torchaudio.functional.mvdr_weights_souden(psd_s: Tensor, psd_n: Tensor, reference_channel: Union[int, Tensor], diagonal_loading: bool = True, diag_eps: float = 1e-07, eps: float = 1e-08) → Tensor
Compute the Minimum Variance Distortionless Response (MVDR [Capon, 1969]) beamforming weights by the method proposed by Souden et al. [Souden et al., 2009].
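Given the power spectral density (PSD) matrix of target speech \(\mathbf{\Phi}_{\mathbf{SS}}\), the PSD matrix of noise \(\mathbf{\Phi}_{\mathbf{NN}}\), and a one-hot vector that represents the reference channel \(\mathbf{u}\), the method computes the MVDR beamforming weight matrix \(\mathbf{w}_{\text{MVDR}}\). The formula is defined as:

\[
\mathbf{w}_{\text{MVDR}}(f) = \frac{\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{\Phi}_{\mathbf{SS}}(f)}{\text{Trace}\left(\mathbf{\Phi}_{\mathbf{NN}}^{-1}(f)\,\mathbf{\Phi}_{\mathbf{SS}}(f)\right)}\,\mathbf{u}
\]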
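As a rough illustration, the computation can be sketched in plain PyTorch: solve for the product of the inverse noise PSD and the speech PSD, normalize by its trace, and select the column that the one-hot reference vector picks out. The helper name souden_mvdr_sketch is hypothetical and this is not the torchaudio implementation. It skips diagonal loading and accepts only an integer reference channel. The rank-1 speech PSD and the random noise PSD are assumptions made purely for the sanity check.

```python
import torch


def souden_mvdr_sketch(psd_s, psd_n, reference_channel, eps=1e-8):
    """Hypothetical helper for illustration; not the torchaudio implementation."""
    # Phi_NN^{-1} Phi_SS via a linear solve; shape (..., freq, channel, channel)
    numerator = torch.linalg.solve(psd_n, psd_s)
    # Trace over the last two dimensions; shape (..., freq)
    trace = numerator.diagonal(dim1=-2, dim2=-1).sum(dim=-1)
    ws = numerator / (trace[..., None, None] + eps)
    # Multiplying by a one-hot vector u selects a single column of the matrix
    return ws[..., :, reference_channel]


torch.manual_seed(0)
freq, channel = 5, 4

# Assumed rank-1 speech PSD built from a random steering vector d: Phi_SS = d d^H
d = torch.randn(freq, channel, dtype=torch.cdouble)
psd_s = d[..., :, None] * d.conj()[..., None, :]

# Assumed Hermitian positive-definite noise PSD: A A^H + I
a = torch.randn(freq, channel, channel, dtype=torch.cdouble)
psd_n = a @ a.conj().transpose(-2, -1) + torch.eye(channel, dtype=torch.cdouble)

weights = souden_mvdr_sketch(psd_s, psd_n, reference_channel=0)
print(weights.shape)  # torch.Size([5, 4])

# Beamformer response to the steering vector, w^H d, per frequency bin
response = (weights.conj() * d).sum(dim=-1)
```

Under the rank-1 assumption, response should match d[:, 0] up to the small bias introduced by eps. In other words, the beamformer passes the speech component as it is observed at the reference channel.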
- Parameters:
psd_s (torch.Tensor) – The complex-valued power spectral density (PSD) matrix of target speech. Tensor with dimensions (…, freq, channel, channel).
psd_n (torch.Tensor) – The complex-valued power spectral density (PSD) matrix of noise. Tensor with dimensions (…, freq, channel, channel).
reference_channel (int or torch.Tensor) – Specifies the reference channel. If the dtype is int, it represents the reference channel index. If the dtype is torch.Tensor, its shape is (…, channel), where the channel dimension is one-hot.
diagonal_loading (bool, optional) – If True, enables applying diagonal loading to psd_n. (Default: True)
diag_eps (float, optional) – The coefficient multiplied to the identity matrix for diagonal loading. It is only effective when diagonal_loading is set to True. (Default: 1e-7)
eps (float, optional) – Value to add to the denominator in the beamforming weight formula. (Default: 1e-8)
- Returns:
The complex-valued MVDR beamforming weight matrix with dimensions (…, freq, channel).
- Return type: