class torchaudio.transforms.PitchShift(sample_rate: int, n_steps: int, bins_per_octave: int = 12, n_fft: int = 512, win_length: Optional[int] = None, hop_length: Optional[int] = None, window_fn: Callable[[...], Tensor] = torch.hann_window, wkwargs: Optional[dict] = None)

Shift the pitch of a waveform by n_steps steps.

This feature supports the following devices: CPU, CUDA

This API supports the following properties: TorchScript

Parameters:


  • sample_rate (int) – Sample rate of waveform.

  • n_steps (int) – The (fractional) steps to shift waveform. With the default bins_per_octave of 12, one step corresponds to one semitone.

  • bins_per_octave (int, optional) – The number of steps per octave (Default: 12).

  • n_fft (int, optional) – Size of FFT, creates n_fft // 2 + 1 bins (Default: 512).

  • win_length (int or None, optional) – Window size. If None, then n_fft is used. (Default: None).

  • hop_length (int or None, optional) – Length of hop between STFT windows. If None, then win_length // 4 is used (Default: None).

  • window_fn (Callable[..., Tensor], optional) – Function that creates the window tensor applied (multiplied) to each frame (Default: torch.hann_window).

  • wkwargs (dict or None, optional) – Keyword arguments passed to the window function (Default: None).
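
The default rules described in the parameter list can be sketched in plain Python. This is only an illustration of the documented fallbacks (win_length falls back to n_fft, hop_length falls back to win_length // 4, and the FFT yields n_fft // 2 + 1 bins); the helper name resolve_stft_defaults is hypothetical and is not part of torchaudio.

```python
from typing import Optional, Tuple


def resolve_stft_defaults(
    n_fft: int = 512,
    win_length: Optional[int] = None,
    hop_length: Optional[int] = None,
) -> Tuple[int, int, int]:
    """Hypothetical helper: apply the documented defaults.

    Returns (win_length, hop_length, n_bins).
    """
    # If win_length is not given, the window spans the whole FFT size.
    if win_length is None:
        win_length = n_fft
    # If hop_length is not given, hop a quarter of the window.
    if hop_length is None:
        hop_length = win_length // 4
    # A real-input FFT of size n_fft yields n_fft // 2 + 1 frequency bins.
    n_bins = n_fft // 2 + 1
    return win_length, hop_length, n_bins


print(resolve_stft_defaults())                            # (512, 128, 257)
print(resolve_stft_defaults(n_fft=1024, win_length=400))  # (400, 100, 513)
```

With every argument left at its default, this gives a 512-sample window, a 128-sample hop, and 257 frequency bins.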

Example:

>>> import torchaudio
>>> waveform, sample_rate = torchaudio.load("test.wav", normalize=True)
>>> transform = torchaudio.transforms.PitchShift(sample_rate, 4)
>>> waveform_shift = transform(waveform)  # (channel, time)
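
To get a feel for what n_steps=4 means in the example above, the following sketch computes the frequency ratio implied by a shift. It assumes the usual equal-division-of-the-octave convention, in which a shift of n_steps multiplies frequency by 2 ** (n_steps / bins_per_octave). The helper pitch_ratio is hypothetical and is not part of torchaudio, and the actual transform output may only approximate this ratio.

```python
def pitch_ratio(n_steps: float, bins_per_octave: int = 12) -> float:
    """Hypothetical helper: frequency ratio for a shift of n_steps.

    Assumes equal divisions of the octave, so bins_per_octave steps
    double the frequency.
    """
    return 2.0 ** (n_steps / bins_per_octave)


ratio = pitch_ratio(4)            # four semitones up (a major third)
print(round(ratio, 4))            # 1.2599
print(round(440.0 * ratio, 2))    # 554.37, a 440 Hz tone after the shift
print(pitch_ratio(12))            # 2.0, one full octave up
print(pitch_ratio(-12))           # 0.5, one full octave down
```

Under this assumption, negative values of n_steps lower the pitch and positive values raise it.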

initialize_parameters(input)

Initialize parameters according to the input batch properties.

This adds an interface that separates parameter initialization from the forward pass when doing parameter shape inference.

forward(waveform: Tensor) → Tensor

Parameters:

waveform (Tensor) – Tensor of audio of dimension (…, time).

Returns:

The pitch-shifted audio of shape (…, time).

Return type:

Tensor
