torch.istft(input: torch.Tensor, n_fft: int, hop_length: Optional[int] = None, win_length: Optional[int] = None, window: Optional[torch.Tensor] = None, center: bool = True, normalized: bool = False, onesided: bool = True, length: Optional[int] = None) → torch.Tensor

Inverse short-time Fourier transform. This is expected to be the inverse of stft(). It takes the same parameters, plus an additional optional length parameter, and should return the least-squares estimate of the original signal [1]. The algorithm checks that the window satisfies the NOLA (nonzero overlap-add) condition.

The parameters window and center must be chosen so that the envelope created by summing all the windows is never zero at any point in time. Specifically, $\sum_{t=-\infty}^{\infty} w^2[n - t \times \text{hop\_length}] \neq 0$.
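
The condition above can be checked numerically by overlap-adding the squared window and looking for zeros in the resulting envelope. The following is a minimal pure-Python sketch, not the check that istft() itself runs. The helper names (`hann`, `window_envelope`, `satisfies_nola`) and the tolerance `eps` are hypothetical, and it assumes win_length equals n_fft.

```python
import math

def hann(n):
    # Periodic Hann window: 0.5 - 0.5 * cos(2 * pi * i / n).
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def window_envelope(window, hop_length, n_frames):
    # Overlap-add the squared window at every frame position.
    n = len(window)
    envelope = [0.0] * (n + hop_length * (n_frames - 1))
    for t in range(n_frames):
        start = t * hop_length
        for i, w in enumerate(window):
            envelope[start + i] += w * w
    return envelope

def satisfies_nola(window, hop_length, n_frames, center=True, eps=1e-11):
    # eps is an assumed tolerance for "nonzero", chosen for illustration.
    envelope = window_envelope(window, hop_length, n_frames)
    if center:
        # With center=True the first and last n_fft // 2 samples are padding,
        # so only the region between them has to be nonzero.
        pad = len(window) // 2
        envelope = envelope[pad:len(envelope) - pad]
    return min(envelope) > eps

print(satisfies_nola(hann(8), hop_length=2, n_frames=10, center=True))
print(satisfies_nola(hann(8), hop_length=2, n_frames=10, center=False))
```

In this sketch a Hann window with hop_length = n_fft // 4 passes when center is True but fails when center is False, because the window is zero at its first sample and no other frame covers that position.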

Since stft() discards elements at the end of the signal if they do not fit in a frame, istft() may return a signal shorter than the original. This can occur when center is False, since the signal is not padded.
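
The arithmetic behind the shorter output can be sketched as follows. This is an illustration under the assumption that center is False and that the signal is at least n_fft samples long; `unpadded_stft_lengths` is a hypothetical helper, not part of PyTorch.

```python
def unpadded_stft_lengths(signal_length, n_fft, hop_length):
    # Number of complete frames that fit when the signal is not padded.
    n_frames = 1 + (signal_length - n_fft) // hop_length
    # Samples covered by those frames; anything beyond never reaches the STFT.
    covered = n_fft + hop_length * (n_frames - 1)
    return n_frames, covered

signal_length = 1000
n_frames, covered = unpadded_stft_lengths(signal_length, n_fft=256, hop_length=64)
dropped = signal_length - covered
print(n_frames, covered, dropped)
```

For a 1000-sample signal with n_fft = 256 and hop_length = 64, only 12 frames fit, covering 960 samples, so the last 40 samples cannot be recovered.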

If center is True, the signal was padded, e.g. with 'constant' or 'reflect' padding. Left padding can be trimmed off exactly because its size can be calculated, but the size of the right padding cannot be calculated without additional information.

Example: Suppose the last window is [17, 18, 0, 0, 0] vs. [18, 0, 0, 0, 0].

In both cases n_fft, hop_length, and win_length are the same, which prevents the right padding from being calculated. The trailing values could be zeros or a reflection of the signal, so providing length can be useful. If length is None, padding is removed aggressively, which can lose some of the signal.
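
The effect of length when center is True can be sketched with the same kind of arithmetic. This is an illustration of the trimming described above, not necessarily the exact implementation. `centered_lengths` is a hypothetical helper, and it assumes padding of n_fft // 2 samples on each side.

```python
def centered_lengths(signal_length, n_fft, hop_length, length=None):
    pad = n_fft // 2
    # Frames produced from the padded signal.
    n_frames = 1 + (signal_length + 2 * pad - n_fft) // hop_length
    # Length of the overlap-added signal, padding included.
    full = n_fft + hop_length * (n_frames - 1)
    start = pad  # left padding is always known
    if length is None:
        end = full - pad  # assume a full pad on the right and drop it
    else:
        end = min(start + length, full)  # keep exactly `length` samples if available
    return n_frames, end - start

print(centered_lengths(1000, n_fft=256, hop_length=64))
print(centered_lengths(1000, n_fft=256, hop_length=64, length=1000))
```

With a 1000-sample signal, n_fft = 256 and hop_length = 64, leaving length as None yields 960 samples, while passing length=1000 keeps all 1000.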

[1] D. W. Griffin and J. S. Lim, “Signal estimation from modified short-time Fourier transform,” IEEE Trans. ASSP, vol.32, no.2, pp.236-243, Apr. 1984.

Parameters

  • input (Tensor) – The input tensor. Expected to be the output of stft(), either 3D (fft_size, n_frame, 2) or 4D (channel, fft_size, n_frame, 2).

  • n_fft (int) – Size of Fourier transform

  • hop_length (Optional[int]) – The distance between neighboring sliding window frames. (Default: n_fft // 4)

  • win_length (Optional[int]) – The size of window frame and STFT filter. (Default: n_fft)

  • window (Optional[torch.Tensor]) – The optional window function. (Default: torch.ones(win_length))

  • center (bool) – Whether input was padded on both sides so that the $t$-th frame is centered at time $t \times \text{hop\_length}$. (Default: True)

  • normalized (bool) – Whether the STFT was normalized. (Default: False)

  • onesided (bool) – Whether the STFT is onesided. (Default: True)

  • length (Optional[int]) – The length to which the output signal is trimmed (i.e. the original signal length). (Default: whole signal)
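
The defaults listed above can be summarized in a short sketch. `resolve_istft_defaults` is a hypothetical helper for illustration. The expected fft_size of n_fft // 2 + 1 for a one-sided STFT is not stated in the list and is included here as an assumption based on how one-sided transforms of real signals are usually stored.

```python
def resolve_istft_defaults(n_fft, hop_length=None, win_length=None, onesided=True):
    # Defaults as documented: hop_length = n_fft // 4, win_length = n_fft.
    hop_length = n_fft // 4 if hop_length is None else hop_length
    win_length = n_fft if win_length is None else win_length
    # Assumption: a one-sided STFT keeps only the non-negative frequency bins.
    fft_size = n_fft // 2 + 1 if onesided else n_fft
    return hop_length, win_length, fft_size

print(resolve_istft_defaults(400))
print(resolve_istft_defaults(512, onesided=False))
```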


Returns

Least squares estimation of the original signal of size (…, signal_length)

Return type

torch.Tensor
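A round trip through stft() and istft() might look like the sketch below. It assumes a recent PyTorch release, where stft() takes return_complex=True and istft() accepts the resulting complex tensor. The signature documented here instead expects the real-valued layout with a trailing dimension of 2, so the calls may need adapting for that version. The signal, sizes, and variable names are made up for illustration.

```python
import torch

torch.manual_seed(0)
signal = torch.randn(1000)  # arbitrary test signal
n_fft, hop_length = 256, 64
window = torch.hann_window(n_fft)

# Assumption: recent PyTorch, complex-valued STFT output.
spec = torch.stft(signal, n_fft, hop_length=hop_length, window=window,
                  center=True, return_complex=True)

# Without length, the right padding is trimmed aggressively.
trimmed = torch.istft(spec, n_fft, hop_length=hop_length, window=window,
                      center=True)

# Passing the original length keeps the trailing samples.
restored = torch.istft(spec, n_fft, hop_length=hop_length, window=window,
                       center=True, length=signal.shape[-1])

print(trimmed.shape, restored.shape)
print(torch.allclose(restored, signal, atol=1e-4))
```

The same window, n_fft, hop_length, and center values are passed to both calls, since istft() can only invert the transform if its parameters match those used by stft().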
