HiFiGANVocoder
- class torchaudio.prototype.models.HiFiGANVocoder(in_channels: int, upsample_rates: Tuple[int, ...], upsample_initial_channel: int, upsample_kernel_sizes: Tuple[int, ...], resblock_kernel_sizes: Tuple[int, ...], resblock_dilation_sizes: Tuple[Tuple[int, ...], ...], resblock_type: int, lrelu_slope: float)
Generator part of HiFi GAN [Kong et al., 2020]. Source: https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/models.py#L75
Note
To build the model, please use one of the factory functions: hifigan_vocoder(), hifigan_vocoder_v1(), hifigan_vocoder_v2(), hifigan_vocoder_v3().
- Parameters:
in_channels (int) – Number of channels in the input features.
upsample_rates (tuple of int) – Factors by which each upsampling layer increases the time dimension.
upsample_initial_channel (int) – Number of channels in the feature tensor entering the first upsampling layer.
upsample_kernel_sizes (tuple of int) – Kernel size for each upsampling layer.
resblock_kernel_sizes (tuple of int) – Kernel size for each residual block.
resblock_dilation_sizes (tuple of tuples of int) – Dilation sizes for each 1D convolutional layer in each residual block. For resblock type 1, inner tuples should have length 3, because there are 3 convolutions in each layer. For resblock type 2, they should have length 2.
resblock_type (int, 1 or 2) – Determines whether ResBlock1 or ResBlock2 will be used.
lrelu_slope (float) – Slope of leaky ReLUs in activations.
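The constraints between these tuple parameters can be checked before building a model. The following is a minimal sketch in plain Python, not part of torchaudio: `check_hifigan_config` is a hypothetical helper, and the rules that the two upsampling tuples have equal length and that there is one dilation tuple per residual block kernel size are inferred from the parameter descriptions above. The example values are illustrative and only resemble a V1-style configuration.

```python
from math import prod
from typing import Tuple


def check_hifigan_config(
    upsample_rates: Tuple[int, ...],
    upsample_kernel_sizes: Tuple[int, ...],
    resblock_kernel_sizes: Tuple[int, ...],
    resblock_dilation_sizes: Tuple[Tuple[int, ...], ...],
    resblock_type: int,
) -> int:
    """Hypothetical helper: validate tuple shapes, return the total upsampling factor."""
    if resblock_type not in (1, 2):
        raise ValueError("resblock_type must be 1 or 2")
    # Assumption: one kernel size per upsampling layer.
    if len(upsample_rates) != len(upsample_kernel_sizes):
        raise ValueError("upsample_rates and upsample_kernel_sizes differ in length")
    # Assumption: one dilation tuple per residual block kernel size.
    if len(resblock_kernel_sizes) != len(resblock_dilation_sizes):
        raise ValueError("resblock_kernel_sizes and resblock_dilation_sizes differ in length")
    # Documented rule: 3 dilations per tuple for type 1, 2 for type 2.
    expected = 3 if resblock_type == 1 else 2
    for dilations in resblock_dilation_sizes:
        if len(dilations) != expected:
            raise ValueError(
                f"resblock type {resblock_type} expects {expected} dilations, got {len(dilations)}"
            )
    # Each upsampling layer multiplies the time dimension by its rate.
    return prod(upsample_rates)


# Illustrative values (assumed, not taken from this page).
total = check_hifigan_config(
    upsample_rates=(8, 8, 2, 2),
    upsample_kernel_sizes=(16, 16, 4, 4),
    resblock_kernel_sizes=(3, 7, 11),
    resblock_dilation_sizes=((1, 3, 5), (1, 3, 5), (1, 3, 5)),
    resblock_type=1,
)
print(total)  # 256
```

The returned product is the overall factor by which the stacked upsampling layers stretch the time dimension, which is a useful number to compare against the hop length of the input features.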
Methods
forward
Factory Functions
hifigan_vocoder() – Builds HiFi GAN Vocoder [Kong et al., 2020].
hifigan_vocoder_v1() – Builds HiFiGAN Vocoder with V1 architecture [Kong et al., 2020].
hifigan_vocoder_v2() – Builds HiFiGAN Vocoder with V2 architecture [Kong et al., 2020].
hifigan_vocoder_v3() – Builds HiFiGAN Vocoder with V3 architecture [Kong et al., 2020].