StreamWriter Advanced Usage

Author: Moto Hira

This tutorial shows how to use torchaudio.io.StreamWriter to play audio and video on local devices and to stream them over a network.

Note

This tutorial uses hardware devices, so it is not portable across operating systems.

The tutorial was written and tested on a MacBook Pro (M1, 2020).

Note

This tutorial requires a torchaudio nightly build and FFmpeg libraries (>=4.1, <4.4).

To install a torchaudio nightly build, please refer to https://pytorch.org/get-started/locally/ .

There are multiple ways to install FFmpeg libraries. If you are using the Anaconda Python distribution, conda install 'ffmpeg<4.4' will install the required FFmpeg libraries. However, this distribution does not include the SDL plugin, so it cannot play video.
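To see whether an installed FFmpeg falls in the range stated above, the version can be read from the command-line tool. The following is a minimal sketch using only the standard library. The helper names (parse_ffmpeg_version, is_supported, installed_ffmpeg_version) are hypothetical. It inspects the ffmpeg executable found on PATH, which is assumed, but not guaranteed, to match the libraries that torchaudio loads.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(banner):
    """Extract (major, minor) from the output of `ffmpeg -version`, or None."""
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", banner)
    if match is None:
        return None  # e.g. development builds that report a git revision
    return int(match.group(1)), int(match.group(2))


def is_supported(version):
    """True if the version is in the range this tutorial states (>=4.1, <4.4)."""
    return version is not None and (4, 1) <= version < (4, 4)


def installed_ffmpeg_version():
    """Version of the `ffmpeg` executable on PATH, or None if it is missing."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_version(result.stdout)


version = installed_ffmpeg_version()
print(version, is_supported(version))
```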

Warning

TorchAudio dynamically loads compatible FFmpeg libraries installed on the system. The types of supported formats (media format, encoder, encoder options, etc.) depend on the libraries.

To check the available muxers, encoders, devices and protocols, you can use the following commands:

ffmpeg -muxers
ffmpeg -encoders
ffmpeg -devices
ffmpeg -protocols

Preparation

import torch
import torchaudio

print(torch.__version__)
print(torchaudio.__version__)

from torchaudio.io import StreamWriter

from torchaudio.utils import download_asset

AUDIO_PATH = download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042.wav")
VIDEO_PATH = download_asset("tutorial-assets/stream-api/NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4")

Device Availability

StreamWriter takes advantage of FFmpeg’s IO abstraction and writes data to media devices such as speakers and GUI windows.

To write to a device, pass the format option to the StreamWriter constructor.

Device options differ across operating systems, and their availability depends on how FFmpeg was installed.

To check which devices are available, use the ffmpeg -devices command.

In the following output, “audiotoolbox” (speaker) and “sdl” (video GUI) are available.

$ ffmpeg -devices
...
Devices:
 D. = Demuxing supported
 .E = Muxing supported
 --
  E audiotoolbox    AudioToolbox output device
 D  avfoundation    AVFoundation input device
 D  lavfi           Libavfilter virtual input device
  E opengl          OpenGL output
  E sdl,sdl2        SDL2 output device

For details about what devices are available on which OS, please check the official FFmpeg documentation. https://ffmpeg.org/ffmpeg-devices.html
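The same check can be done from Python before constructing a StreamWriter. The sketch below parses a listing in the layout shown above and keeps the entries flagged with E (muxing, i.e. output). The function names are hypothetical, and the parser assumes the column layout of the sample output; other FFmpeg versions may print a different layout.

```python
import re
import shutil
import subprocess

# One leading space, the D flag column, the E flag column, then the device name.
_DEVICE_LINE = re.compile(r"^ ([D ])([E ])\s+(\S+)")


def parse_output_devices(listing):
    """Return the names of devices that support muxing (output)."""
    names = []
    seen_separator = False
    for line in listing.splitlines():
        if not seen_separator:
            # Skip the legend; device entries start after the "--" line.
            seen_separator = line.strip().startswith("--")
            continue
        match = _DEVICE_LINE.match(line)
        if match and match.group(2) == "E":
            names.extend(match.group(3).split(","))  # "sdl,sdl2" -> two names
    return names


def list_output_devices():
    """Run `ffmpeg -devices` if the executable exists; otherwise return []."""
    if shutil.which("ffmpeg") is None:
        return []
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-devices"],
        capture_output=True, text=True, check=False,
    )
    return parse_output_devices(result.stdout)


print(list_output_devices())
```

For the sample listing above, parse_output_devices returns audiotoolbox, opengl, sdl and sdl2.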

Playing audio

With the format="audiotoolbox" option, StreamWriter writes data to the speaker device.

# Prepare sample audio
waveform, sample_rate = torchaudio.load(AUDIO_PATH, channels_first=False, normalize=False)
num_frames, num_channels = waveform.shape

# Configure StreamWriter to write to speaker device
s = StreamWriter(dst="-", format="audiotoolbox")
s.add_audio_stream(sample_rate, num_channels, format="s16")

# Write audio to the device
with s.open():
    for i in range(0, num_frames, 256):
        s.write_audio_chunk(0, waveform[i:i+256])

Note

Writing to “audiotoolbox” is a blocking operation, but it does not wait for the audio playback to finish. The device must be kept open while audio is being played.

The following code closes the device as soon as the audio is written, before playback completes. Adding time.sleep() will help keep the device open until playback completes.

with s.open():
    s.write_audio_chunk(0, waveform)
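One way to choose the sleep duration is to derive it from the length of the audio. The helper below is a hypothetical sketch, not part of torchaudio. It assumes that sleeping for the full clip duration plus a small margin is a safe upper bound, since some of the audio may already have played by the time the write returns.

```python
import time


def remaining_playback_seconds(num_frames, sample_rate, margin=0.5):
    """Conservative estimate of how long to keep the device open after writing."""
    return num_frames / sample_rate + margin


# Example: 48000 frames at 16 kHz is 3 seconds of audio, plus the margin.
print(remaining_playback_seconds(48000, 16000))  # 3.5

# Intended use with the objects defined earlier in this tutorial
# (requires the audiotoolbox device, so it is shown as a comment):
#
# with s.open():
#     s.write_audio_chunk(0, waveform)
#     time.sleep(remaining_playback_seconds(num_frames, sample_rate))
```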

Playing Video

To play video, you can use format="sdl" or format="opengl". Again, you need a version of FFmpeg with the corresponding integration enabled. The available devices can be checked with ffmpeg -devices.

Here, we use the SDL device (https://ffmpeg.org/ffmpeg-devices.html#sdl).

# note:
#  SDL device does not support specifying frame rate, and it has to
#  match the refresh rate of display.
frame_rate = 120
width, height = 640, 360

We define a helper function that delegates video loading to a background thread and yields chunks.

running = True
def video_streamer(path, frames_per_chunk):
    import queue, threading
    from torchaudio.io import StreamReader

    q = queue.Queue()

    # Streaming process that runs in background thread
    def _streamer():
        streamer = StreamReader(path)
        streamer.add_basic_video_stream(
            frames_per_chunk, format="rgb24",
            frame_rate=frame_rate, width=width, height=height)
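        for (chunk_, ) in streamer.stream():
            q.put(chunk_)
            if not running:
                break
        # Signal the end of the stream so that the consumer does not block forever
        q.put(None)

    # Start the background thread and fetch chunks
    t = threading.Thread(target=_streamer)
    t.start()
    while running:
        # `q.get()` blocks until a chunk (or the end-of-stream marker) arrives
        chunk = q.get()
        if chunk is None:
            break
        yield chunk
    t.join()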
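The background-loading pattern used by this helper can be tried without any media file. The following is a minimal sketch using only the standard library. The name stream_in_background and the sentinel object are illustrative choices, not part of torchaudio. A producer thread fills a queue, and the generator yields items until it sees the end-of-stream marker.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the stream


def stream_in_background(source):
    """Yield items from `source`, which is consumed in a background thread."""
    q = queue.Queue()

    def _produce():
        for item in source:
            q.put(item)
        q.put(_END)  # tell the consumer that nothing more will arrive

    t = threading.Thread(target=_produce)
    t.start()
    while True:
        item = q.get()  # blocks until the producer delivers something
        if item is _END:
            break
        yield item
    t.join()


print(list(stream_in_background(range(5))))  # [0, 1, 2, 3, 4]
```

A sentinel is used here because a blocking get() on an empty queue waits rather than raising queue.Empty, so the consumer needs an explicit signal to stop.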

Now we start streaming. Pressing “Q” will stop the video.

Note

A write_video_chunk call against the SDL device blocks until SDL finishes playing the video.

# Set output device to SDL
s = StreamWriter("-", format="sdl")

# Configure video stream (RGB24)
s.add_video_stream(frame_rate, width, height, format="rgb24", encoder_format="rgb24")

# Play the video
with s.open():
    for chunk in video_streamer(VIDEO_PATH, frames_per_chunk=256):
        try:
            s.write_video_chunk(0, chunk)
        except RuntimeError:
            running = False
            break

Streaming Video

So far, we have looked at how to write to hardware devices. There are also alternative methods for streaming video.

RTMP (Real-Time Messaging Protocol)

Using RTMP, you can stream media (video and/or audio) to a single client. This does not require a hardware device, but it requires a separate player.

To use RTMP, specify the protocol and route in the dst argument of the StreamWriter constructor, then pass the {"listen": "1"} option when opening the destination.

StreamWriter will listen on the port and wait for a client to request the video. The call to open blocks until a request is received.

s = StreamWriter(dst="rtmp://localhost:1935/live/app", format="flv")
s.add_audio_stream(sample_rate=sample_rate, num_channels=num_channels, encoder="aac")
s.add_video_stream(frame_rate=frame_rate, width=width, height=height)

with s.open(option={"listen": "1"}):
    # `generator()` is a placeholder for user code that yields (video_chunk, audio_chunk) pairs
    for video_chunk, audio_chunk in generator():
        s.write_audio_chunk(0, audio_chunk)
        s.write_video_chunk(1, video_chunk)
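The generator() call in the example above is not defined in this tutorial. One possible stand-in, sketched below under several assumptions, reads both streams of a media file with StreamReader and yields chunk pairs of roughly equal duration. The names av_chunks and audio_frames_per_chunk are hypothetical. The sketch assumes that the source file contains both audio and video, and that the number of audio channels in the file matches the num_channels passed to add_audio_stream.

```python
def audio_frames_per_chunk(sample_rate, frame_rate, video_frames):
    """Audio frames that span the same duration as `video_frames` video frames."""
    return round(sample_rate * video_frames / frame_rate)


def av_chunks(path, sample_rate, frame_rate, width, height, video_frames=8):
    """Yield (video_chunk, audio_chunk) pairs of roughly equal duration."""
    # Imported lazily so that the sketch can be defined without torchaudio installed.
    from torchaudio.io import StreamReader

    reader = StreamReader(path)
    # Streams are yielded in the order in which they are added: video, then audio.
    reader.add_basic_video_stream(
        video_frames, format="rgb24",
        frame_rate=frame_rate, width=width, height=height)
    reader.add_basic_audio_stream(
        audio_frames_per_chunk(sample_rate, frame_rate, video_frames),
        sample_rate=sample_rate)
    for video_chunk, audio_chunk in reader.stream():
        # Stop when either stream runs out (assumption: trailing data is dropped).
        if video_chunk is None or audio_chunk is None:
            break
        yield video_chunk, audio_chunk


# 12 video frames at 120 fps last 0.1 s, which is 1600 audio frames at 16 kHz.
print(audio_frames_per_chunk(16000, 120, 12))  # 1600
```

With this sketch, generator() could be replaced by av_chunks(VIDEO_PATH, sample_rate, frame_rate, width, height).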

UDP (User Datagram Protocol)

Using UDP, you can stream media (video and/or audio) to a socket. This does not require a hardware device, but it requires a separate player.

Unlike RTMP, the streaming and client processes are disconnected: the streaming process is not aware of the client process.

s = StreamWriter(dst="udp://localhost:48550", format="mpegts")
s.add_audio_stream(sample_rate=sample_rate, num_channels=num_channels, encoder="aac")
s.add_video_stream(frame_rate=frame_rate, width=width, height=height)

with s.open():
    for video_chunk, audio_chunk in generator():
        s.write_audio_chunk(0, audio_chunk)
        s.write_video_chunk(1, video_chunk)
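The disconnected nature of UDP can be observed with the standard library alone, independently of torchaudio. In the sketch below, a datagram is sent to the port used above, and the send returns whether or not anything is listening. The function name send_datagram is hypothetical, and the payload is a dummy 188-byte block (the size of one MPEG-TS packet), not real media data.

```python
import socket


def send_datagram(payload, host="127.0.0.1", port=48550):
    """Send one UDP datagram and return the number of bytes handed to the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No connection is established; the sender gets no confirmation of receipt.
        return sock.sendto(payload, (host, port))


# 0x47 is the MPEG-TS sync byte; the rest is zero padding.
sent = send_datagram(b"\x47" + bytes(187))
print(sent)  # 188
```

This is why open() in the UDP example returns immediately, while the RTMP example with the listen option waits for a client.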

Tag: torchaudio.io
