HiFiGANVocoder

class torchaudio.prototype.models.HiFiGANVocoder(in_channels: int, upsample_rates: Tuple[int, ...], upsample_initial_channel: int, upsample_kernel_sizes: Tuple[int, ...], resblock_kernel_sizes: Tuple[int, ...], resblock_dilation_sizes: Tuple[Tuple[int, ...], ...], resblock_type: int, lrelu_slope: float)

Generator part of HiFi-GAN [Kong et al., 2020]. Source: https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/models.py#L75

Note

To build the model, please use one of the factory functions: hifigan_vocoder(), hifigan_vocoder_v1(), hifigan_vocoder_v2(), hifigan_vocoder_v3().

Parameters:

in_channels (int) – Number of channels in the input features.
upsample_rates (tuple of int) – Factors by which each upsampling layer increases the time dimension.
upsample_initial_channel (int) – Number of channels produced by the initial convolution and passed to the first upsampling layer.
upsample_kernel_sizes (tuple of int) – Kernel size for each upsampling layer.
resblock_kernel_sizes (tuple of int) – Kernel size for each residual block.
resblock_dilation_sizes (tuple of tuples of int) – Dilation sizes for the 1D convolutional layers in each residual block, one inner tuple per block. For resblock type 1, each inner tuple should have length 3, because each block contains 3 dilated convolutions. For resblock type 2, each inner tuple should have length 2.
resblock_type (int, 1 or 2) – Determines whether ResBlock1 or ResBlock2 will be used.
lrelu_slope (float) – Slope of leaky ReLUs in activations.
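
The length rule for resblock_dilation_sizes can be checked before a model is built. The following is a minimal sketch in plain Python, not part of torchaudio: the helper name check_resblock_dilations is hypothetical, and the requirement that resblock_kernel_sizes and resblock_dilation_sizes have the same number of entries is an assumption (one dilation tuple per residual block).

```python
from typing import Tuple

# Inner tuple length expected for each resblock type, as stated in the
# parameter description above (type 1 -> 3, type 2 -> 2).
EXPECTED_DILATIONS = {1: 3, 2: 2}


def check_resblock_dilations(
    resblock_type: int,
    resblock_kernel_sizes: Tuple[int, ...],
    resblock_dilation_sizes: Tuple[Tuple[int, ...], ...],
) -> None:
    """Hypothetical helper: raise ValueError if the settings are inconsistent."""
    if resblock_type not in EXPECTED_DILATIONS:
        raise ValueError(f"resblock_type must be 1 or 2, got {resblock_type}")

    # Assumption: one dilation tuple is supplied per residual block kernel size.
    if len(resblock_kernel_sizes) != len(resblock_dilation_sizes):
        raise ValueError(
            "resblock_kernel_sizes and resblock_dilation_sizes differ in length"
        )

    expected = EXPECTED_DILATIONS[resblock_type]
    for index, dilations in enumerate(resblock_dilation_sizes):
        if len(dilations) != expected:
            raise ValueError(
                f"inner tuple {index} has length {len(dilations)}, "
                f"expected {expected} for resblock type {resblock_type}"
            )


# Illustrative values: three residual blocks of type 1, three dilations each.
check_resblock_dilations(1, (3, 7, 11), ((1, 3, 5), (1, 3, 5), (1, 3, 5)))
```

A check like this reports a mismatch with a readable message instead of a shape or indexing error raised later during construction.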

Methods

forward

HiFiGANVocoder.forward(x: Tensor) → Tensor

Parameters: x (Tensor) – Feature input tensor of shape (batch_size, num_channels, time_length).
Returns: Tensor of shape (batch_size, 1, time_length * upsample_rate), where upsample_rate is the product of upsample rates for all layers.
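
The output length follows directly from the product of the upsample rates. The sketch below works through that arithmetic in plain Python; the function name output_length and the values (rates of 8, 8, 2, 2 and 100 input frames) are illustrative assumptions, not values taken from this page.

```python
import math
from typing import Tuple


def output_length(time_length: int, upsample_rates: Tuple[int, ...]) -> int:
    """Hypothetical helper: number of output samples for a given input length."""
    # upsample_rate is the product of the per-layer upsample rates.
    upsample_rate = math.prod(upsample_rates)
    return time_length * upsample_rate


# Illustrative configuration: four upsampling layers.
rates = (8, 8, 2, 2)
total_rate = math.prod(rates)         # 8 * 8 * 2 * 2 = 256
frames = 100                          # time_length of the input features
samples = output_length(frames, rates)  # 100 * 256 = 25600

print(total_rate, samples)
```

Under these assumed values, an input of shape (batch_size, num_channels, 100) would yield an output of shape (batch_size, 1, 25600).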

Factory Functions

`hifigan_vocoder`	Builds a HiFi-GAN vocoder [Kong et al., 2020].
`hifigan_vocoder_v1`	Builds a HiFi-GAN vocoder with the V1 architecture [Kong et al., 2020].
`hifigan_vocoder_v2`	Builds a HiFi-GAN vocoder with the V2 architecture [Kong et al., 2020].
`hifigan_vocoder_v3`	Builds a HiFi-GAN vocoder with the V3 architecture [Kong et al., 2020].
