torchaudio.prototype.functional.simulate_rir_ism(room: Tensor, source: Tensor, mic_array: Tensor, max_order: int, absorption: Union[float, Tensor], output_length: Optional[int] = None, delay_filter_length: int = 81, center_frequency: Optional[Tensor] = None, sound_speed: float = 343.0, sample_rate: float = 16000.0) -> Tensor

Compute Room Impulse Response (RIR) based on the image source method [Allen and Berkley, 1979]. The implementation is based on pyroomacoustics [Scheibler et al., 2018].

This feature supports the following devices: CPU
This API supports the following properties: TorchScript

Parameters:

  • room (torch.Tensor) – Room size. Tensor with dimensions (3,), holding the extent of the room along each of the three axes.

  • source (torch.Tensor) – Sound source coordinates. Tensor with dimensions (3,).

  • mic_array (torch.Tensor) – Microphone coordinates. Tensor with dimensions (channel, 3).

  • max_order (int) – The maximum number of reflections of the source.

  • absorption (float or torch.Tensor) – The absorption coefficients [Wikipedia contributors, n.d.] of the wall materials for sound energy. If absorption is a float, the coefficient is identical for all walls and all frequencies. If absorption is a 1D Tensor, the shape must be (6,), where the values are the absorption coefficients of "west", "east", "south", "north", "floor", and "ceiling", respectively. If absorption is a 2D Tensor, the shape must be (7, 6), where 7 is the number of octave bands and 6 is the number of walls.

  • output_length (int or None, optional) –

    The output length of the simulated RIR signal. If None, the length is defined as

    \[\frac{\text{max\_d} \cdot \text{sample\_rate}}{\text{sound\_speed}} + \text{delay\_filter\_length} \]

    where max_d is the maximum distance between image sources and microphones.

  • delay_filter_length (int, optional) – The filter length for computing the sinc function. (Default: 81)

  • center_frequency (torch.Tensor, optional) – The center frequencies of octave bands for multi-band walls. Only used when absorption is a 2D Tensor.

  • sound_speed (float, optional) – The speed of sound. (Default: 343.0)

  • sample_rate (float, optional) – The sample rate of the generated room impulse response signal. (Default: 16000.0)
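
To make the default for output_length concrete, the formula given for that parameter can be evaluated by hand. The sketch below is plain Python, not torchaudio code: the helper name default_rir_length is hypothetical, and rounding the result up to an integer is an assumption, since the rounding applied by the library is not stated here.

```python
import math


def default_rir_length(max_d, sample_rate=16000.0, sound_speed=343.0,
                       delay_filter_length=81):
    """Evaluate max_d * sample_rate / sound_speed + delay_filter_length.

    Hypothetical helper for illustration only. Rounding up with math.ceil
    is an assumption, not the documented behavior of simulate_rir_ism.
    """
    # Number of samples sound needs to travel the longest image-source path.
    propagation_samples = max_d * sample_rate / sound_speed
    return math.ceil(propagation_samples + delay_filter_length)


# A farthest image source 10 m from the microphones, with the default settings.
length = default_rir_length(max_d=10.0)
print(length)  # 548
```

With the defaults, every additional meter of path length adds roughly 47 samples (16000 / 343), so max_order and the room size largely determine the length of the result.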


Returns:

The simulated room impulse response waveform. Tensor with dimensions (channel, rir_length).

Return type:

torch.Tensor

Note: If absorption is a 2D Tensor and center_frequency is None, the center frequencies of the octave bands are fixed to [125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0]. The values of absorption must then be chosen to correspond to these frequencies.
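
As an illustration of the layout a 2D absorption input takes, the sketch below builds a (7, 6) table in plain Python, with one row per default octave band and one column per wall. The coefficient values are made up, and the column order is assumed to follow the wall order given for the 1D case. The final comment only indicates how such a table might be turned into a Tensor.

```python
# Default octave-band center frequencies: each band doubles the previous one.
center_frequencies = [125.0 * 2**k for k in range(7)]
print(center_frequencies)

# Assumed column order, taken from the description of the 1D case.
walls = ["west", "east", "south", "north", "floor", "ceiling"]

# Made-up coefficients, one per octave band, applied to every wall.
per_band = [0.10, 0.12, 0.15, 0.20, 0.25, 0.30, 0.35]

# Row i holds the six wall coefficients for center_frequencies[i].
absorption = [[coeff] * len(walls) for coeff in per_band]

n_bands, n_walls = len(absorption), len(absorption[0])
print((n_bands, n_walls))

# To pass this to simulate_rir_ism, it would first be converted, e.g.
# torch.tensor(absorption), giving a Tensor of shape (7, 6).
```

Keeping the band frequencies next to the table makes it easier to check that each row was looked up for the right octave band.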


