torchaudio.backend¶
Overview¶
The torchaudio.backend module provides implementations for audio file I/O functionalities, which are torchaudio.info, torchaudio.load, torchaudio.load_wav and torchaudio.save.
There are currently four implementations available.
“sox” (deprecated, default on Linux/macOS)
“sox_io” (default on Linux/macOS from the 0.8.0 release)
“soundfile” - legacy interface (deprecated, default on Windows)
“soundfile” - new interface (default on Windows from the 0.8.0 release)
On Windows, only the "soundfile" backend (with both interfaces) is available. It is recommended to use the new interface as the legacy interface is deprecated.
On Linux/macOS, please use the "sox_io" backend. The use of the "sox" backend is strongly discouraged as it cannot correctly handle formats other than 16-bit integer WAV. See #726 for details.
Note
Instead of calling functions in torchaudio.backend directly, please use torchaudio.info, torchaudio.load, torchaudio.load_wav and torchaudio.save with the proper backend set with torchaudio.set_audio_backend().
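The platform guidance above can be sketched as a small helper. This is only an illustration: `recommended_backend` is a hypothetical name that is not part of torchaudio, and the sketch assumes that `sys.platform` values starting with "win" identify Windows.

```python
import sys


def recommended_backend(platform: str = sys.platform) -> str:
    """Return the backend name this page recommends for a platform.

    Hypothetical helper (not a torchaudio function). Assumes that
    platform strings starting with "win" mean Windows.
    """
    if platform.startswith("win"):
        # Only the "soundfile" backend is available on Windows.
        return "soundfile"
    # On Linux/macOS, "sox_io" is recommended over the deprecated "sox".
    return "sox_io"


if __name__ == "__main__":
    backend = recommended_backend()
    print(backend)
    # With torchaudio installed, the result would be passed on as:
    # torchaudio.set_audio_backend(backend)
```

The call to torchaudio.set_audio_backend() is left as a comment so that the sketch runs without torchaudio installed.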
Availability¶
The "sox" and "sox_io" backends require the C++ extension module, which is included in Linux/macOS binary distributions. These backends are not available on Windows.
The "soundfile" backend requires SoundFile. Please refer to the SoundFile documentation for installation instructions.
Changes in default backend and deprecation¶
The backend module is going through a major overhaul. The following table summarizes the timeline for the changes and deprecations.
Backend | 0.7.0 | 0.8.0 | 0.9.0
"sox" (deprecated) | Default on Linux/macOS | Available | Removed
"sox_io" | Available | Default on Linux/macOS | Default on Linux/macOS
"soundfile" (legacy interface, deprecated) | Default on Windows | Available | Removed
"soundfile" (new interface) | Available | Default on Windows | Default on Windows
The default backend for Linux/macOS will be changed from "sox" to "sox_io" in the 0.8.0 release.
The "sox" backend will be removed in the 0.9.0 release.
Starting from the 0.8.0 release, the "soundfile" backend will use the new interface, which has the same interface as the "sox_io" backend. The legacy interface will be removed in the 0.9.0 release.
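For scripts that need to reason about the timeline, it can be written down as a plain lookup table. This is a sketch, not a torchaudio API: the keys "soundfile-legacy" and "soundfile-new" and the function `defaults_for` are labels invented here for illustration, since torchaudio itself uses the single backend name "soundfile".

```python
# Status of each backend per release, transcribed from the timeline above.
# "soundfile-legacy" and "soundfile-new" are made-up labels for this sketch.
TIMELINE = {
    "sox": {"0.7.0": "default", "0.8.0": "available", "0.9.0": "removed"},
    "sox_io": {"0.7.0": "available", "0.8.0": "default", "0.9.0": "default"},
    "soundfile-legacy": {"0.7.0": "default", "0.8.0": "available", "0.9.0": "removed"},
    "soundfile-new": {"0.7.0": "available", "0.8.0": "default", "0.9.0": "default"},
}


def defaults_for(release: str) -> list:
    """Return the labels of the backends that are a platform default in `release`."""
    return [name for name, status in TIMELINE.items() if status[release] == "default"]


print(defaults_for("0.7.0"))  # ['sox', 'soundfile-legacy']
print(defaults_for("0.8.0"))  # ['sox_io', 'soundfile-new']
```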
Common Data Structure¶
Structures used to report the metadata of audio files.
AudioMetaData¶
class torchaudio.backend.common.AudioMetaData(sample_rate: int, num_frames: int, num_channels: int)
Return type of the torchaudio.info function.
This class is used by the “sox_io” backend and the “soundfile” backend with the new interface.
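A common use of these three fields is computing the duration of a file. The sketch below uses a stand-in class, `Meta`, with the same field names as AudioMetaData so that it runs without torchaudio. Both `Meta` and `duration_seconds` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Meta(NamedTuple):
    """Stand-in with the same three fields as AudioMetaData (hypothetical)."""
    sample_rate: int
    num_frames: int
    num_channels: int


def duration_seconds(meta: Meta) -> float:
    """Length of the audio in seconds: frames divided by frames per second."""
    return meta.num_frames / meta.sample_rate


meta = Meta(sample_rate=16000, num_frames=48000, num_channels=2)
print(duration_seconds(meta))  # 3.0
```

The number of channels does not enter the calculation because num_frames counts frames, not individual samples.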
SignalInfo (Deprecated)¶
class torchaudio.backend.common.SignalInfo
One of the return types of the torchaudio.info functions.
This class is used by the “sox” backend (deprecated) and the “soundfile” backend with the legacy interface (deprecated).
See https://fossies.org/dox/sox-14.4.2/structsox__signalinfo__t.html
- Variables
channels (Optional[int]) – The number of channels
rate (Optional[float]) – Sampling rate
precision (Optional[int]) – Bit depth
length (Optional[int]) – For the sox backend, the number of samples (frames * channels). For the soundfile backend, the number of frames.
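Because the meaning of length differs between the two backends, code that supports both has to convert it before comparing values. A minimal sketch, assuming the semantics described above; `frames_from_length` is a hypothetical helper, not part of torchaudio.

```python
def frames_from_length(length: int, channels: int, backend: str) -> int:
    """Convert SignalInfo.length to a frame count.

    Hypothetical helper. Per the description above, the "sox" backend
    reports samples (frames * channels), while the legacy "soundfile"
    interface already reports frames.
    """
    if backend == "sox":
        return length // channels
    if backend == "soundfile":
        return length
    raise ValueError(f"unexpected backend: {backend!r}")


# A stereo file with 100 frames, as each backend would report it.
print(frames_from_length(200, 2, "sox"))        # 100
print(frames_from_length(100, 2, "soundfile"))  # 100
```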
EncodingInfo (Deprecated)¶
class torchaudio.backend.common.EncodingInfo
One of the return types of the torchaudio.info functions.
This class is used by the “sox” backend (deprecated) and the “soundfile” backend with the legacy interface (deprecated).
See https://fossies.org/dox/sox-14.4.2/structsox__encodinginfo__t.html
Sox Backend (Deprecated)¶
The "sox" backend is available on Linux/macOS and not available on Windows. This backend is currently the default when available, but is deprecated and will be removed in the 0.9.0 release.
You can switch from another backend to the sox backend with the following:
torchaudio.set_audio_backend("sox")
info¶
torchaudio.backend.sox_backend.info(filepath: str) → Tuple[torchaudio.backend.common.SignalInfo, torchaudio.backend.common.EncodingInfo]
Gets metadata from an audio file without loading the signal.
- Parameters
filepath – Path to audio file
- Returns
A si (sox_signalinfo_t) signal info as a Python object, and an ei (sox_encodinginfo_t) encoding info.
- Return type
(sox_signalinfo_t, sox_encodinginfo_t)
- Example
>>> si, ei = torchaudio.info('foo.wav')
>>> rate, channels, encoding = si.rate, si.channels, ei.encoding
load¶
torchaudio.backend.sox_backend.load(filepath: str, out: Optional[torch.Tensor] = None, normalization: bool = True, channels_first: bool = True, num_frames: int = 0, offset: int = 0, signalinfo: torchaudio.backend.common.SignalInfo = None, encodinginfo: torchaudio.backend.common.EncodingInfo = None, filetype: Optional[str] = None) → Tuple[torch.Tensor, int]
Loads an audio file from disk into a tensor.
- Parameters
filepath – Path to audio file
out – An optional output tensor to use instead of creating one. (Default: None)
normalization – Optional normalization. If boolean True, then output is divided by 1 << 31. Assuming the input is signed 32-bit audio, this normalizes to [-1, 1]. If float, then output is divided by that number. If Callable, then the output is passed as a parameter to the given function, then the output is divided by the result. (Default: True)
channels_first – Set channels first or length first in result. (Default: True)
num_frames – Number of frames to load. 0 to load everything after the offset. (Default: 0)
offset – Number of frames from the start of the file to begin data loading. (Default: 0)
signalinfo – A sox_signalinfo_t type, which could be helpful if the audio type cannot be automatically determined. (Default: None)
encodinginfo – A sox_encodinginfo_t type, which could be set if the audio type cannot be automatically determined. (Default: None)
filetype – A filetype or extension to be set if sox cannot determine it automatically. (Default: None)
- Returns
An output tensor of size [C x L] or [L x C], where L is the number of audio frames and C is the number of channels, and an integer which is the sample rate of the audio (as listed in the metadata of the file).
- Return type
(Tensor, int)
- Example
>>> data, sample_rate = torchaudio.load('foo.mp3')
>>> print(data.size())
torch.Size([2, 278756])
>>> print(sample_rate)
44100
>>> data_vol_normalized, _ = torchaudio.load('foo.mp3', normalization=lambda x: torch.abs(x).max())
>>> print(data_vol_normalized.abs().max())
1.
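The three accepted forms of the normalization argument can be illustrated on a plain Python list. This is a sketch of the documented rules only: `apply_normalization` is a hypothetical function, the real backend operates on torch.Tensor, and returning the data unchanged for False is an assumption that the description above does not state.

```python
from typing import Callable, List, Union

Normalization = Union[bool, float, Callable[[List[float]], float]]


def apply_normalization(samples: List[float], normalization: Normalization = True) -> List[float]:
    """Mimic the documented `normalization` rules on a plain list.

    Sketch only: the behaviour for False (data returned unchanged)
    is an assumption.
    """
    if callable(normalization):
        # Callable: the data is passed in and divided by the result.
        divisor = normalization(samples)
    elif isinstance(normalization, bool):
        # bool is checked before float because True is also a number.
        if not normalization:
            return list(samples)
        divisor = float(1 << 31)
    else:
        # A plain number: the data is divided by it.
        divisor = float(normalization)
    return [s / divisor for s in samples]


raw = [float(1 << 30), -float(1 << 31)]
print(apply_normalization(raw))                                            # [0.5, -1.0]
print(apply_normalization([2.0, -4.0], 4.0))                               # [0.5, -1.0]
print(apply_normalization([2.0, -4.0], lambda x: max(abs(v) for v in x)))  # [0.5, -1.0]
```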
torchaudio.backend.sox_backend.load_wav(filepath, **kwargs)
Loads a wave file.
It assumes that the WAV file uses 16 bits per sample, which requires normalization by shifting the input right by 16 bits.
- Parameters
filepath – Path to audio file
- Returns
An output tensor of size [C x L] or [L x C], where L is the number of audio frames and C is the number of channels, and an integer which is the sample rate of the audio (as listed in the metadata of the file).
- Return type
(Tensor, int)
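The 16-bit shift mentioned for load_wav can be shown with plain integers. This sketch assumes that each sample is held as a signed 32-bit value with the 16-bit payload in the upper bits. `to_int16_range` is a hypothetical name used only for illustration.

```python
def to_int16_range(sample_32: int) -> int:
    """Shift a signed 32-bit sample right by 16 bits.

    Python's >> on ints is an arithmetic shift, so the sign is kept.
    """
    return sample_32 >> 16


# A 16-bit value stored in the upper half of a 32-bit word (assumed layout).
print(to_int16_range(12345 << 16))    # 12345
print(to_int16_range(-(1 << 31)))     # -32768
print(to_int16_range((1 << 31) - 1))  # 32767
```

The extreme 32-bit values map onto the limits of the signed 16-bit range, which is why the shift acts as a normalization step for 16-bit WAV data.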
save¶
torchaudio.backend.sox_backend.save(filepath: str, src: torch.Tensor, sample_rate: int, precision: int = 16, channels_first: bool = True) → None
Saves a tensor to a file as an audio file.
- Parameters
filepath – Path to audio file
src – An input 2D tensor of shape [C x L] or [L x C] where L is the number of audio frames, C is the number of channels
sample_rate – An integer which is the sample rate of the audio (as listed in the metadata of the file)
precision – Bit precision. (Default: 16)
channels_first (bool, optional) – Set channels first or length first in result. (Default: True)
others¶
torchaudio.backend.sox_backend.get_sox_bool(i: int = 0) → Any
Get enum of sox_bool for sox encodinginfo options.
- Parameters
i (int, optional) – Choose type or get a dict with all possible options. Use __members__ to see all options when not specified. (Default: sox_false or 0)
- Returns
A sox_bool type
- Return type
sox_bool
torchaudio.backend.sox_backend.get_sox_encoding_t(i: int = None) → torchaudio.backend.common.EncodingInfo
Get enum of sox_encoding_t for sox encodings.
- Parameters
i (int, optional) – Choose type or get a dict with all possible options. Use __members__ to see all options when not specified. (Default: None)
- Returns
A sox_encoding_t type for output encoding
- Return type
sox_encoding_t
torchaudio.backend.sox_backend.get_sox_option_t(i: int = 2) → Any
Get enum of sox_option_t for sox encodinginfo options.
- Parameters
i (int, optional) – Choose type or get a dict with all possible options. Use __members__ to see all options when not specified. (Default: sox_option_default or 2)
- Returns
A sox_option_t type
- Return type
sox_option_t
torchaudio.backend.sox_backend.save_encinfo(filepath: str, src: torch.Tensor, channels_first: bool = True, signalinfo: Optional[torchaudio.backend.common.SignalInfo] = None, encodinginfo: Optional[torchaudio.backend.common.EncodingInfo] = None, filetype: Optional[str] = None) → None
Saves a tensor of an audio signal to disk as a standard format like mp3, wav, etc.
- Parameters
filepath (str) – Path to audio file
src (Tensor) – An input 2D tensor of shape [C x L] or [L x C] where L is the number of audio frames, C is the number of channels
channels_first (bool, optional) – Set channels first or length first in result. (Default: True)
signalinfo (sox_signalinfo_t, optional) – A sox_signalinfo_t type, which could be helpful if the audio type cannot be automatically determined. (Default: None)
encodinginfo (sox_encodinginfo_t, optional) – A sox_encodinginfo_t type, which could be set if the audio type cannot be automatically determined. (Default: None)
filetype (str, optional) – A filetype or extension to be set if sox cannot determine it automatically. (Default: None)
- Example
>>> data, sample_rate = torchaudio.load('foo.mp3')
>>> torchaudio.save('foo.wav', data, sample_rate)
torchaudio.backend.sox_backend.sox_encodinginfo_t() → torchaudio.backend.common.EncodingInfo
Create a sox_encodinginfo_t object. This object can be used to set the encoding type, bit precision, compression factor, reverse bytes, reverse nibbles, reverse bits and endianness. This can be used in an effects chain to encode the final output or to save a file with a specific encoding. For example, one could use the sox ulaw encoding to do 8-bit ulaw encoding. Note that in a tensor output the result will be a 32-bit number, but the number of unique values will be determined by the bit precision.
- Returns: sox_encodinginfo_t(object)
encoding (sox_encoding_t), output encoding
bits_per_sample (int), bit precision, same as precision in sox_signalinfo_t
compression (float), compression for lossy formats, 0.0 for default compression
reverse_bytes (sox_option_t), reverse bytes, use sox_option_default
reverse_nibbles (sox_option_t), reverse nibbles, use sox_option_default
reverse_bits (sox_option_t), reverse bits, use sox_option_default
opposite_endian (sox_bool), change endianness, use sox_false
- Example
>>> ei = torchaudio.sox_encodinginfo_t()
>>> ei.encoding = torchaudio.get_sox_encoding_t(1)
>>> ei.bits_per_sample = 16
>>> ei.compression = 0
>>> ei.reverse_bytes = torchaudio.get_sox_option_t(2)
>>> ei.reverse_nibbles = torchaudio.get_sox_option_t(2)
>>> ei.reverse_bits = torchaudio.get_sox_option_t(2)
>>> ei.opposite_endian = torchaudio.get_sox_bool(0)
torchaudio.backend.sox_backend.sox_signalinfo_t() → torchaudio.backend.common.SignalInfo
Create a sox_signalinfo_t object. This object can be used to set the sample rate, number of channels, length, bit precision and headroom multiplier, primarily for effects.
- Returns: sox_signalinfo_t(object)
rate (float), sample rate as a float, practically will likely be an integer float
channels (int), number of audio channels
precision (int), bit precision
length (int), length of audio in samples * channels, 0 for unspecified and -1 for unknown
mult (float, optional), headroom multiplier for effects and None for no multiplier
- Example
>>> si = torchaudio.sox_signalinfo_t()
>>> si.channels = 1
>>> si.rate = 16000.
>>> si.precision = 16
>>> si.length = 0
Sox IO Backend¶
The "sox_io" backend is available on Linux/macOS and not available on Windows. This backend is recommended over the "sox" backend, and will become the default in the 0.8.0 release.
I/O functions of this backend support TorchScript.
You can switch from another backend to the sox_io backend with the following:
torchaudio.set_audio_backend("sox_io")
info¶
torchaudio.backend.sox_io_backend.info(filepath: str) → torchaudio.backend.common.AudioMetaData
Get signal information of an audio file.
- Parameters
filepath (str or pathlib.Path) – Path to audio file. This function also handles pathlib.Path objects, but is annotated as str for TorchScript compatibility.
- Returns
Metadata of the given audio.
- Return type
torchaudio.backend.common.AudioMetaData
load¶
torchaudio.backend.sox_io_backend.load(filepath: str, frame_offset: int = 0, num_frames: int = -1, normalize: bool = True, channels_first: bool = True) → Tuple[torch.Tensor, int]
Load audio data from file.
Note
This function can handle all the codecs that the underlying libsox can handle; however, it is tested on the following formats:
WAV
32-bit floating-point
32-bit signed integer
16-bit signed integer
8-bit unsigned integer
MP3
FLAC
OGG/VORBIS
OPUS
SPHERE
To load MP3, FLAC, OGG/VORBIS, OPUS and other codecs that libsox does not handle natively, your installation of torchaudio has to be linked to libsox and the corresponding codec libraries such as libmad or libmp3lame.
By default (normalize=True, channels_first=True), this function returns a Tensor with float32 dtype and the shape of [channel, time]. The samples are normalized to fit in the range of [-1.0, 1.0].
When the input format is WAV with integer type, such as 32-bit signed integer, 16-bit signed integer and 8-bit unsigned integer (24-bit signed integer is not supported), by providing normalize=False, this function can return an integer Tensor, where the samples are expressed within the whole range of the corresponding dtype, that is, int32 tensor for 32-bit signed PCM, int16 for 16-bit signed PCM and uint8 for 8-bit unsigned PCM.
The normalize parameter has no effect on 32-bit floating-point WAV and other formats, such as flac and mp3. For these formats, this function always returns a float32 Tensor with values normalized to [-1.0, 1.0].
- Parameters
filepath (str or pathlib.Path) – Path to audio file. This function also handles pathlib.Path objects, but is annotated as str for TorchScript compiler compatibility.
frame_offset (int) – Number of frames to skip before starting to read data.
num_frames (int) – Maximum number of frames to read. -1 reads all the remaining samples, starting from frame_offset. This function may return fewer frames if there are not enough frames in the given file.
normalize (bool) – When True, this function always returns float32, and sample values are normalized to [-1.0, 1.0]. If the input file is integer WAV, giving False will change the resulting Tensor type to an integer type. This argument has no effect for formats other than integer WAV type.
channels_first (bool) – When True, the returned Tensor has dimension [channel, time]. Otherwise, the returned Tensor’s dimension is [time, channel].
- Returns
The audio Tensor and the sample rate. If the input file has integer WAV format and normalization is off, the Tensor has integer type, otherwise float32 type. If channels_first=True, it has shape [channel, time], otherwise [time, channel].
- Return type
Tuple[torch.Tensor, int]
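The dtype rules described for load can be condensed into a small decision function. This is a sketch of the documented behaviour only: `expected_dtype` and its parameters are hypothetical, and dtypes are given as strings so that the example runs without torch installed.

```python
from typing import Optional

# Integer WAV encodings and the tensor dtype documented for normalize=False.
_INTEGER_WAV_DTYPES = {
    (32, True): "int32",   # 32-bit signed PCM
    (16, True): "int16",   # 16-bit signed PCM
    (8, False): "uint8",   # 8-bit unsigned PCM
}


def expected_dtype(fmt: str, bits: Optional[int] = None, signed: bool = True,
                   floating: bool = False, normalize: bool = True) -> str:
    """Name of the dtype `load` is documented to return (hypothetical helper)."""
    if normalize or fmt.lower() != "wav" or floating:
        # normalize=True, non-WAV formats and float WAV always give float32.
        return "float32"
    try:
        return _INTEGER_WAV_DTYPES[(bits, signed)]
    except KeyError:
        # For example 24-bit signed integer WAV, which is not supported.
        raise ValueError(f"unsupported integer WAV: {bits}-bit, signed={signed}") from None


print(expected_dtype("wav", 16, normalize=False))   # int16
print(expected_dtype("mp3", normalize=False))       # float32
print(expected_dtype("wav", 16))                    # float32
```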
torchaudio.backend.sox_io_backend.load_wav(filepath: str, frame_offset: int = 0, num_frames: int = -1, channels_first: bool = True) → Tuple[torch.Tensor, int]
Load wave file.
This function is defined only for compatibility with other backends in simple use cases, such as torchaudio.load_wav(filepath). The implementation is the same as load().
save¶
torchaudio.backend.sox_io_backend.save(filepath: str, src: torch.Tensor, sample_rate: int, channels_first: bool = True, compression: Optional[float] = None)
Save audio data to file.
Note
Supported formats are:
WAV
32-bit floating-point
32-bit signed integer
16-bit signed integer
8-bit unsigned integer
MP3
FLAC
OGG/VORBIS
SPHERE
To save MP3, FLAC, OGG/VORBIS, and other codecs that libsox does not handle natively, your installation of torchaudio has to be linked to libsox and the corresponding codec libraries such as libmad or libmp3lame.
- Parameters
filepath (str or pathlib.Path) – Path to save file. This function also handles pathlib.Path objects, but is annotated as str for TorchScript compiler compatibility.
src (torch.Tensor) – Audio data to save. Must be a 2D tensor.
sample_rate (int) – Sampling rate.
channels_first (bool) – If True, the given tensor is interpreted as [channel, time], otherwise [time, channel].
compression (Optional[float]) – Used for formats other than WAV. This corresponds to the -C option of the sox command.
MP3: Either bitrate (in kbps) with quality factor, such as 128.2, or VBR encoding with quality factor such as -4.2. Default: -4.5.
FLAC: Compression level. Whole number from 0 to 8. 8 is the default and highest compression.
OGG/VORBIS: Number from -1 to 10; -1 is the highest compression and lowest quality. Default: 3.
See the detail at http://sox.sourceforge.net/soxformat.html.
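The per-format defaults and ranges of the compression parameter can be collected in one place. The sketch below is illustrative only: `resolve_compression`, `DEFAULT_COMPRESSION` and the format keys "mp3", "flac" and "ogg" are hypothetical names, and only the ranges stated above (FLAC and OGG/VORBIS) are checked.

```python
from typing import Optional

# Defaults quoted from the parameter description above.
DEFAULT_COMPRESSION = {"mp3": -4.5, "flac": 8, "ogg": 3}


def resolve_compression(fmt: str, compression: Optional[float] = None) -> Optional[float]:
    """Pick and sanity-check a `compression` value (hypothetical helper).

    Range checks cover only FLAC and OGG/VORBIS, whose ranges are stated
    above; MP3 values are passed through unchecked.
    """
    fmt = fmt.lower()
    if fmt == "wav":
        return None  # compression is used for formats other than WAV
    if compression is None:
        return DEFAULT_COMPRESSION.get(fmt)
    if fmt == "flac" and not (0 <= compression <= 8 and float(compression).is_integer()):
        raise ValueError("FLAC compression must be a whole number from 0 to 8")
    if fmt == "ogg" and not -1 <= compression <= 10:
        raise ValueError("OGG/VORBIS compression must be between -1 and 10")
    return compression


print(resolve_compression("flac"))        # 8
print(resolve_compression("mp3", 128.2))  # 128.2
```

A value resolved this way would then be passed as the compression argument of save.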
Soundfile Backend¶
The "soundfile" backend is available when SoundFile is installed. This backend is the default on Windows.
The "soundfile" backend has two interfaces, legacy and new.
In the 0.7.0 release, the legacy interface is used by default when switching to the "soundfile" backend.
In the 0.8.0 release, the new interface will become the default.
In the 0.9.0 release, the legacy interface will be removed.
To change the interface, set the torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE flag before switching the backend.
torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = True
torchaudio.set_audio_backend("soundfile") # The legacy interface
torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False
torchaudio.set_audio_backend("soundfile") # The new interface
Legacy Interface (Deprecated)¶
The "soundfile" backend with the legacy interface is currently the default on Windows; however, this interface is deprecated and will be removed in the 0.9.0 release.
To switch to this backend/interface, set the torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE flag before switching the backend.
torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = True
torchaudio.set_audio_backend("soundfile") # The legacy interface
info¶
torchaudio.backend.soundfile_backend.info(filepath: str) → Tuple[torchaudio.backend.common.SignalInfo, torchaudio.backend.common.EncodingInfo]
Gets metadata from an audio file without loading the signal.
- Parameters
filepath – Path to audio file
- Returns
A si (sox_signalinfo_t) signal info as a Python object, and an ei (sox_encodinginfo_t) encoding info.
- Return type
(sox_signalinfo_t, sox_encodinginfo_t)
- Example
>>> si, ei = torchaudio.info('foo.wav')
>>> rate, channels, encoding = si.rate, si.channels, ei.encoding
load¶
torchaudio.backend.soundfile_backend.load(filepath: str, out: Optional[torch.Tensor] = None, normalization: Optional[bool] = True, channels_first: Optional[bool] = True, num_frames: int = 0, offset: int = 0, signalinfo: torchaudio.backend.common.SignalInfo = None, encodinginfo: torchaudio.backend.common.EncodingInfo = None, filetype: Optional[str] = None) → Tuple[torch.Tensor, int]
Loads an audio file from disk into a tensor.
- Parameters
filepath – Path to audio file
out – An optional output tensor to use instead of creating one. (Default: None)
normalization – Optional normalization. If boolean True, then output is divided by 1 << 31. Assuming the input is signed 32-bit audio, this normalizes to [-1, 1]. If float, then output is divided by that number. If Callable, then the output is passed as a parameter to the given function, then the output is divided by the result. (Default: True)
channels_first – Set channels first or length first in result. (Default: True)
num_frames – Number of frames to load. 0 to load everything after the offset. (Default: 0)
offset – Number of frames from the start of the file to begin data loading. (Default: 0)
signalinfo – A sox_signalinfo_t type, which could be helpful if the audio type cannot be automatically determined. (Default: None)
encodinginfo – A sox_encodinginfo_t type, which could be set if the audio type cannot be automatically determined. (Default: None)
filetype – A filetype or extension to be set if sox cannot determine it automatically. (Default: None)
- Returns
An output tensor of size [C x L] or [L x C], where L is the number of audio frames and C is the number of channels, and an integer which is the sample rate of the audio (as listed in the metadata of the file).
- Return type
(Tensor, int)
- Example
>>> data, sample_rate = torchaudio.load('foo.mp3')
>>> print(data.size())
torch.Size([2, 278756])
>>> print(sample_rate)
44100
>>> data_vol_normalized, _ = torchaudio.load('foo.mp3', normalization=lambda x: torch.abs(x).max())
>>> print(data_vol_normalized.abs().max())
1.
torchaudio.backend.soundfile_backend.load_wav(filepath, **kwargs)
Loads a wave file.
It assumes that the WAV file uses 16 bits per sample, which requires normalization by shifting the input right by 16 bits.
- Parameters
filepath – Path to audio file
- Returns
An output tensor of size [C x L] or [L x C], where L is the number of audio frames and C is the number of channels, and an integer which is the sample rate of the audio (as listed in the metadata of the file).
- Return type
(Tensor, int)
save¶
torchaudio.backend.soundfile_backend.save(filepath: str, src: torch.Tensor, sample_rate: int, precision: int = 16, channels_first: bool = True) → None
Saves a tensor to a file as an audio file.
- Parameters
filepath – Path to audio file
src – An input 2D tensor of shape [C x L] or [L x C] where L is the number of audio frames, C is the number of channels
sample_rate – An integer which is the sample rate of the audio (as listed in the metadata of the file)
precision – Bit precision. (Default: 16)
channels_first (bool, optional) – Set channels first or length first in result. (Default: True)
New Interface¶
The "soundfile" backend with the new interface will become the default in the 0.8.0 release.
To switch to this backend/interface, set the torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE flag before switching the backend.
torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False
torchaudio.set_audio_backend("soundfile") # The new interface
info¶
torchaudio.backend._soundfile_backend.info(filepath: str) → torchaudio.backend.common.AudioMetaData
Get signal information of an audio file.
- Parameters
filepath (str or pathlib.Path) – Path to audio file. This function also handles pathlib.Path objects, but is annotated as str for consistency with the “sox_io” backend, which has a restriction on type annotation for TorchScript compiler compatibility.
- Returns
Metadata of the given audio.
- Return type
torchaudio.backend.common.AudioMetaData
load¶
torchaudio.backend._soundfile_backend.load(filepath: str, frame_offset: int = 0, num_frames: int = -1, normalize: bool = True, channels_first: bool = True) → Tuple[torch.Tensor, int]
Load audio data from file.
Note
The formats this function can handle depend on the soundfile installation. This function is tested on the following formats:
WAV
32-bit floating-point
32-bit signed integer
16-bit signed integer
8-bit unsigned integer
FLAC
OGG/VORBIS
SPHERE
By default (normalize=True, channels_first=True), this function returns a Tensor with float32 dtype and the shape of [channel, time]. The samples are normalized to fit in the range of [-1.0, 1.0].
When the input format is WAV with integer type, such as 32-bit signed integer, 16-bit signed integer and 8-bit unsigned integer (24-bit signed integer is not supported), by providing normalize=False, this function can return an integer Tensor, where the samples are expressed within the whole range of the corresponding dtype, that is, int32 tensor for 32-bit signed PCM, int16 for 16-bit signed PCM and uint8 for 8-bit unsigned PCM.
The normalize parameter has no effect on 32-bit floating-point WAV and other formats, such as flac and mp3. For these formats, this function always returns a float32 Tensor with values normalized to [-1.0, 1.0].
- Parameters
filepath (str or pathlib.Path) – Path to audio file. This function also handles pathlib.Path objects, but is annotated as str for consistency with the “sox_io” backend, which has a restriction on type annotation for TorchScript compiler compatibility.
frame_offset (int) – Number of frames to skip before starting to read data.
num_frames (int) – Maximum number of frames to read. -1 reads all the remaining samples, starting from frame_offset. This function may return fewer frames if there are not enough frames in the given file.
normalize (bool) – When True, this function always returns float32, and sample values are normalized to [-1.0, 1.0]. If the input file is integer WAV, giving False will change the resulting Tensor type to an integer type. This argument has no effect for formats other than integer WAV type.
channels_first (bool) – When True, the returned Tensor has dimension [channel, time]. Otherwise, the returned Tensor’s dimension is [time, channel].
- Returns
The audio Tensor and the sample rate. If the input file has integer WAV format and normalization is off, the Tensor has integer type, otherwise float32 type. If channels_first=True, it has shape [channel, time], otherwise [time, channel].
- Return type
Tuple[torch.Tensor, int]
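The interaction of frame_offset and num_frames can be mimicked with list slicing. This is a sketch of the documented semantics only, using a plain sequence in place of an audio file. `select_frames` is a hypothetical name and not part of torchaudio.

```python
from typing import List, Sequence


def select_frames(frames: Sequence[float], frame_offset: int = 0, num_frames: int = -1) -> List[float]:
    """Return the frames `load` would read, using a plain sequence as the file."""
    if num_frames == -1:
        # -1 reads everything that remains after the offset.
        return list(frames[frame_offset:])
    # Slicing past the end simply yields fewer frames, as documented.
    return list(frames[frame_offset:frame_offset + num_frames])


audio = [0.0, 0.1, 0.2, 0.3, 0.4]
print(select_frames(audio))                                 # [0.0, 0.1, 0.2, 0.3, 0.4]
print(select_frames(audio, frame_offset=2))                 # [0.2, 0.3, 0.4]
print(select_frames(audio, frame_offset=3, num_frames=10))  # [0.3, 0.4]
```

The last call requests ten frames but only two remain after the offset, so two are returned.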
torchaudio.backend._soundfile_backend.load_wav(filepath: str, frame_offset: int = 0, num_frames: int = -1, channels_first: bool = True) → Tuple[torch.Tensor, int]
Load wave file.
This function is defined only for compatibility with other backends in simple use cases, such as torchaudio.load_wav(filepath). The implementation is the same as load().
save¶
torchaudio.backend._soundfile_backend.save(filepath: str, src: torch.Tensor, sample_rate: int, channels_first: bool = True, compression: Optional[float] = None)
Save audio data to file.
Note
The formats this function can handle depend on the soundfile installation. This function is tested on the following formats:
WAV
32-bit floating-point
32-bit signed integer
16-bit signed integer
8-bit unsigned integer
FLAC
OGG/VORBIS
SPHERE
- Parameters
filepath (str or pathlib.Path) – Path to audio file. This function also handles pathlib.Path objects, but is annotated as str for consistency with the “sox_io” backend, which has a restriction on type annotation for TorchScript compiler compatibility.
src (torch.Tensor) – Audio data to save. Must be a 2D tensor.
sample_rate (int) – Sampling rate.
channels_first (bool) – If True, the given tensor is interpreted as [channel, time], otherwise [time, channel].
compression (Optional[float]) – Not used. It is here only for interface compatibility with the “sox_io” backend.