torch.istft¶

torch.
istft
(input: torch.Tensor, n_fft: int, hop_length: Optional[int] = None, win_length: Optional[int] = None, window: Optional[torch.Tensor] = None, center: bool = True, normalized: bool = False, onesided: Optional[bool] = None, length: Optional[int] = None, return_complex: bool = False) → torch.Tensor[source]¶ Inverse short time Fourier Transform. This is expected to be the inverse of
stft()
. It has the same parameters (+ additional optional parameter oflength
) and it should return the least squares estimation of the original signal. The algorithm will check using the NOLA condition ( nonzero overlap).Important consideration in the parameters
window
andcenter
so that the envelop created by the summation of all the windows is never zero at certain point in time. Specifically, $\sum_{t=\infty}^{\infty} w^2[nt\times hop\_length] \cancel{=} 0$ .Since
stft()
discards elements at the end of the signal if they do not fit in a frame,istft
may return a shorter signal than the original signal (can occur ifcenter
is False since the signal isn’t padded).If
center
isTrue
, then there will be padding e.g.'constant'
,'reflect'
, etc. Left padding can be trimmed off exactly because they can be calculated but right padding cannot be calculated without additional information.Example: Suppose the last window is:
[17, 18, 0, 0, 0]
vs[18, 0, 0, 0, 0]
The
n_fft
,hop_length
,win_length
are all the same which prevents the calculation of right padding. These additional values could be zeros or a reflection of the signal so providinglength
could be useful. Iflength
isNone
then padding will be aggressively removed (some loss of signal).[1] D. W. Griffin and J. S. Lim, “Signal estimation from modified shorttime Fourier transform,” IEEE Trans. ASSP, vol.32, no.2, pp.236243, Apr. 1984.
 Parameters
input (Tensor) – The input tensor. Expected to be output of
stft()
, can either be complex (channel
,fft_size
,n_frame
), or real (channel
,fft_size
,n_frame
, 2) where thechannel
dimension is optional.n_fft (int) – Size of Fourier transform
hop_length (Optional[int]) – The distance between neighboring sliding window frames. (Default:
n_fft // 4
)win_length (Optional[int]) – The size of window frame and STFT filter. (Default:
n_fft
)window (Optional[torch.Tensor]) – The optional window function. (Default:
torch.ones(win_length)
)center (bool) – Whether
input
was padded on both sides so that the $t$ th frame is centered at time $t \times \text{hop\_length}$ . (Default:True
)normalized (bool) – Whether the STFT was normalized. (Default:
False
)onesided (Optional[bool]) – Whether the STFT was onesided. (Default:
True
ifn_fft != fft_size
in the input size)length (Optional[int]) – The amount to trim the signal by (i.e. the original signal length). (Default: whole signal)
return_complex (Optional[bool]) – Whether the output should be complex, or if the input should be assumed to derive from a real signal and window. Note that this is incompatible with
onesided=True
. (Default:False
)
 Returns
Least squares estimation of the original signal of size (…, signal_length)
 Return type