torchaudio.prototype.functional.simulate_rir_ism
- torchaudio.prototype.functional.simulate_rir_ism(room: Tensor, source: Tensor, mic_array: Tensor, max_order: int, absorption: Union[float, Tensor], output_length: Optional[int] = None, delay_filter_length: int = 81, center_frequency: Optional[Tensor] = None, sound_speed: float = 343.0, sample_rate: float = 16000.0) → Tensor
Compute Room Impulse Response (RIR) based on the image source method [Allen and Berkley, 1979]. The implementation is based on pyroomacoustics [Scheibler et al., 2018].
- Parameters:
room (torch.Tensor) – Room coordinates. The shape of room must be (3,), giving the size of the room along each of its three dimensions.
source (torch.Tensor) – Sound source coordinates. Tensor with dimensions (3,).
mic_array (torch.Tensor) – Microphone coordinates. Tensor with dimensions (channel, 3).
max_order (int) – The maximum number of reflections of the source.
absorption (float or torch.Tensor) – The absorption [Wikipedia contributors, n.d.] coefficients of wall materials for sound energy. If the dtype is float, the absorption coefficient is identical for all walls and all frequencies. If absorption is a 1D Tensor, the shape must be (6,), where the values represent absorption coefficients of "west", "east", "south", "north", "floor", and "ceiling", respectively. If absorption is a 2D Tensor, the shape must be (7, 6), where 7 represents the number of octave bands.
output_length (int or None, optional) – The output length of simulated RIR signal. If None, the length is defined as
\[\frac{\text{max\_d} \cdot \text{sample\_rate}}{\text{sound\_speed}} + \text{delay\_filter\_length}\]
where max_d is the maximum distance between image sources and microphones.
delay_filter_length (int, optional) – The filter length for computing sinc function. (Default: 81)
center_frequency (torch.Tensor, optional) – The center frequencies of octave bands for multi-band walls. Only used when absorption is a 2D Tensor.
sound_speed (float, optional) – The speed of sound. (Default: 343.0)
sample_rate (float, optional) – The sample rate of the generated room impulse response signal. (Default: 16000.0)
- Returns:
The simulated room impulse response waveform. Tensor with dimensions (channel, rir_length).
- Return type:
torch.Tensor
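The default output_length formula from the parameter list can be worked through by hand. The following is a minimal sketch in plain Python, not the library's implementation: the helper names (axis_images, image_sources, default_output_length) are hypothetical, the image sources are enumerated with the usual shoebox rule (total reflection count summed over the three axes is at most max_order), and rounding the result up to an integer is an assumption, since the formula above does not state how the value is rounded.

```python
import itertools
import math


def axis_images(x, length, max_order):
    """Images of coordinate x between walls at 0 and `length`, as (coordinate, order)."""
    images = []
    for k in range(-max_order, max_order + 1):
        if k % 2 == 0:
            coord = k * length + x        # even k: shifted copy of the source
        else:
            coord = (k + 1) * length - x  # odd k: mirrored copy of the source
        images.append((coord, abs(k)))
    return images


def image_sources(room, source, max_order):
    """All image sources whose total reflection count is at most max_order."""
    per_axis = [axis_images(s, r, max_order) for s, r in zip(source, room)]
    return [
        tuple(coord for coord, _ in combo)
        for combo in itertools.product(*per_axis)
        if sum(order for _, order in combo) <= max_order
    ]


def default_output_length(room, source, mic_array, max_order,
                          delay_filter_length=81, sound_speed=343.0,
                          sample_rate=16000.0):
    """Evaluate max_d * sample_rate / sound_speed + delay_filter_length."""
    images = image_sources(room, source, max_order)
    # max_d: largest distance between any image source and any microphone.
    max_d = max(math.dist(img, mic) for img in images for mic in mic_array)
    # Rounding up is an assumption made for this sketch.
    return math.ceil(max_d * sample_rate / sound_speed + delay_filter_length)


# Illustrative geometry only; these values do not come from the documentation.
room = (4.0, 5.0, 3.0)
source = (1.0, 2.0, 1.5)
mic_array = [(2.0, 2.0, 1.5)]
length = default_output_length(room, source, mic_array, max_order=1)
print(length)
```

Because max_d grows with max_order, the default length grows with it as well; passing an explicit output_length avoids that dependence.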
Note
If absorption is a 2D Tensor and center_frequency is set to None, the center frequencies of octave bands are fixed to [125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0]. Users need to tune the values of absorption to the corresponding frequencies.
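To make the (7, 6) layout concrete, here is a small sketch that arranges per-wall coefficients into seven rows (one per default octave band) of six values (one per wall, in the documented order). The helper build_absorption_table is hypothetical, the coefficient values are made up for illustration rather than measured material data, and the range check assumes coefficients lie between 0 and 1. The nested list could then be converted to a tensor before being passed as absorption.

```python
# Wall order stated in the absorption parameter description.
WALLS = ("west", "east", "south", "north", "floor", "ceiling")

# Center frequencies (Hz) used when center_frequency is None.
DEFAULT_CENTER_FREQUENCIES = (125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0)


def build_absorption_table(per_wall):
    """Arrange {wall: [7 coefficients]} as 7 rows (bands) x 6 columns (walls)."""
    n_bands = len(DEFAULT_CENTER_FREQUENCIES)
    for wall in WALLS:
        coeffs = per_wall[wall]
        if len(coeffs) != n_bands:
            raise ValueError(f"{wall}: expected {n_bands} values, got {len(coeffs)}")
        if any(not 0.0 <= c <= 1.0 for c in coeffs):
            raise ValueError(f"{wall}: coefficients must lie in [0, 1]")
    # Row b holds the coefficient of every wall for band b.
    return [[per_wall[wall][b] for wall in WALLS] for b in range(n_bands)]


# Made-up coefficients: one list per wall, ordered from 125 Hz up to 8000 Hz.
hard_wall = [0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.05]
soft_floor = [0.05, 0.10, 0.20, 0.35, 0.50, 0.60, 0.65]
per_wall = {wall: hard_wall for wall in WALLS}
per_wall["floor"] = soft_floor

table = build_absorption_table(per_wall)
for freq, row in zip(DEFAULT_CENTER_FREQUENCIES, table):
    print(freq, row)
```

Keeping the coefficients keyed by wall name until the last step makes it harder to mix up the column order that the (6,) and (7, 6) shapes rely on.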