torch.stft

torch.stft(input: torch.Tensor, n_fft: int, hop_length: Optional[int] = None, win_length: Optional[int] = None, window: Optional[torch.Tensor] = None, center: bool = True, pad_mode: str = 'reflect', normalized: bool = False, onesided: bool = True) → torch.Tensor
Short-time Fourier transform (STFT).
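Ignoring the optional batch dimension, this method computes the following expression:
$$X[m, \omega] = \sum_{k = 0}^{\text{win\_length} - 1} \text{window}[k]\ \text{input}[m \times \text{hop\_length} + k]\ \exp\left(-j \frac{2 \pi \cdot \omega k}{\text{n\_fft}}\right),$$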
where $m$ is the index of the sliding window and $\omega$ is the frequency, with $0 \leq \omega < \text{n\_fft}$. When onesided is the default value True, input must be either a 1-D time sequence or a 2-D batch of time sequences.
If hop_length is None (default), it is treated as equal to floor(n_fft / 4).
If win_length is None (default), it is treated as equal to n_fft.
window can be a 1-D tensor of size win_length, e.g., from torch.hann_window(). If window is None (default), it is treated as if having $1$ everywhere in the window. If $\text{win\_length} < \text{n\_fft}$, window will be padded on both sides to length n_fft before being applied.
If center is True (default), input will be padded on both sides so that the $t$-th frame is centered at time $t \times \text{hop\_length}$. Otherwise, the $t$-th frame begins at time $t \times \text{hop\_length}$.
pad_mode determines the padding method used on input when center is True. See torch.nn.functional.pad() for all available options. Default is "reflect".
If onesided is True (default), only values for $\omega$ in $\left[0, 1, 2, \dots, \left\lfloor \frac{\text{n\_fft}}{2} \right\rfloor\right]$ are returned, because the real-to-complex Fourier transform satisfies the conjugate symmetry, i.e., $X[m, \omega] = X[m, \text{n\_fft} - \omega]^*$.
If normalized is True (default is False), the function returns the normalized STFT results, i.e., multiplied by $(\text{frame\_length})^{-0.5}$.
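To make the defaults and the conjugate symmetry described above concrete, here is a minimal pure-Python sketch that evaluates the STFT sum directly. It is illustrative only and is not how torch.stft is implemented: the helper name naive_stft is hypothetical, win_length is fixed to n_fft, and no padding is applied (the center=False case).

```python
import cmath
import math


def naive_stft(signal, n_fft, hop_length=None, window=None):
    """Directly evaluate X[m, w] for a 1-D real sequence (hypothetical helper).

    Assumptions: win_length == n_fft and no centering/padding (center=False).
    """
    if hop_length is None:
        hop_length = n_fft // 4            # documented default: floor(n_fft / 4)
    if window is None:
        window = [1.0] * n_fft             # documented default: all ones
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = []
    for m in range(n_frames):
        start = m * hop_length             # without centering, frame m begins here
        row = []
        for w in range(n_fft):
            acc = 0j
            for k in range(n_fft):
                angle = -2.0 * math.pi * w * k / n_fft
                acc += window[k] * signal[start + k] * cmath.exp(1j * angle)
            row.append(acc)
        frames.append(row)
    return frames                          # indexed as frames[m][w]


signal = [math.sin(0.3 * n) for n in range(32)]
X = naive_stft(signal, n_fft=8)

# For real input, X[m][w] equals the complex conjugate of X[m][n_fft - w],
# which is why onesided=True can drop the upper half of the frequencies.
max_asymmetry = max(
    abs(X[m][w] - X[m][8 - w].conjugate())
    for m in range(len(X))
    for w in range(1, 8)
)
```

Note that this sketch stores results as frames[m][w], whereas the tensor returned by torch.stft places the frequency dimension before the frame dimension.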
Returns the real and the imaginary parts together as one tensor of size $(* \times N \times T \times 2)$, where $*$ is the optional batch size of input, $N$ is the number of frequencies where STFT is applied, $T$ is the total number of frames used, and each pair in the last dimension represents a complex number as the real part and the imaginary part.
Warning
This function changed signature at version 0.4.1. Calling it with the previous signature may cause an error or return an incorrect result.
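The returned size $(N \times T \times 2)$ can be estimated ahead of time from the arguments. The sketch below uses a hypothetical helper, stft_output_shape, for a single unbatched sequence. It assumes that center=True pads n_fft // 2 samples on each side; that padding amount is not stated in the text above and should be checked against the installed version.

```python
def stft_output_shape(length, n_fft, hop_length=None, center=True, onesided=True):
    """Estimate (N, T, 2) for a 1-D input of `length` samples (hypothetical helper)."""
    if hop_length is None:
        hop_length = n_fft // 4                    # documented default
    # Assumption: centering pads n_fft // 2 samples on both sides.
    padded = length + 2 * (n_fft // 2) if center else length
    n_frames = 1 + (padded - n_fft) // hop_length  # T: number of frames
    n_freqs = n_fft // 2 + 1 if onesided else n_fft  # N: number of frequencies
    return (n_freqs, n_frames, 2)


shape_default = stft_output_shape(16000, 400)
shape_uncentered = stft_output_shape(16000, 400, center=False)
shape_twosided = stft_output_shape(16000, 400, onesided=False)
```

For a batched input, the batch size would be prepended to this tuple.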
 Parameters
input (Tensor) – the input tensor
n_fft (int) – size of Fourier transform
hop_length (int, optional) – the distance between neighboring sliding window frames. Default: None (treated as equal to floor(n_fft / 4))
win_length (int, optional) – the size of window frame and STFT filter. Default: None (treated as equal to n_fft)
window (Tensor, optional) – the optional window function. Default: None (treated as window of all $1$ s)
center (bool, optional) – whether to pad input on both sides so that the $t$-th frame is centered at time $t \times \text{hop\_length}$. Default: True
pad_mode (string, optional) – controls the padding method used when center is True. Default: "reflect"
normalized (bool, optional) – controls whether to return the normalized STFT results. Default: False
onesided (bool, optional) – controls whether to return half of results to avoid redundancy. Default: True
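A short usage sketch of the parameters listed above follows. One caveat: it assumes a recent PyTorch release, where torch.stft also takes a return_complex argument that is not part of the signature documented here. In such releases the real/imaginary layout described on this page can be recovered with torch.view_as_real. On the version documented here, the same call without return_complex should return that layout directly.

```python
import torch

n_fft = 400
signal = torch.randn(16000)            # 1-D time sequence
window = torch.hann_window(n_fft)      # 1-D tensor of size win_length (= n_fft here)

# hop_length and win_length are left as None, so they default to
# floor(n_fft / 4) = 100 and n_fft = 400 respectively.
# Assumption: return_complex is NOT in the signature documented above;
# recent PyTorch releases expect it for real inputs.
spec = torch.stft(
    signal,
    n_fft,
    window=window,
    center=True,
    pad_mode="reflect",
    normalized=False,
    onesided=True,
    return_complex=True,
)

# Convert the complex result to the (N, T, 2) real/imaginary layout.
spec_ri = torch.view_as_real(spec)
print(spec_ri.shape)
```

With onesided=True the first dimension holds n_fft // 2 + 1 frequencies, and the last dimension holds the real and imaginary parts.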
 Returns
A tensor containing the STFT result with shape described above
 Return type