Accelerated video encoding with NVENC
Author: Moto Hira
This tutorial shows how to use NVIDIA’s hardware video encoder (NVENC) with TorchAudio, and how it improves the performance of video encoding.
Note
This tutorial requires FFmpeg libraries compiled with HW acceleration enabled.
Please refer to Enabling GPU video decoder/encoder for how to build FFmpeg with HW acceleration.
Note
Most modern GPUs have both a HW decoder and a HW encoder, but some high-end GPUs such as the A100 and H100 do not have a HW encoder. Please refer to the following for availability and format coverage: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new
Attempting to use the HW encoder on these GPUs fails with an error message like Generic error in an external library. You can enable the debug log with torchaudio.utils.ffmpeg_utils.set_log_level() to see more detailed error messages issued along the way.
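set_log_level() takes an integer level. As a minimal sketch, assuming the level follows FFmpeg's AV_LOG_* convention (larger values are more verbose), the lookup below maps level names to those integers. The FFMPEG_LOG_LEVELS table and the ffmpeg_log_level helper are illustrative names and are not part of TorchAudio.

```python
# FFmpeg's AV_LOG_* levels (libavutil/log.h); larger values are more verbose.
FFMPEG_LOG_LEVELS = {
    "quiet": -8,
    "panic": 0,
    "fatal": 8,
    "error": 16,
    "warning": 24,
    "info": 32,
    "verbose": 40,
    "debug": 48,
    "trace": 56,
}


def ffmpeg_log_level(name: str) -> int:
    """Translate a level name such as "debug" into FFmpeg's integer level."""
    try:
        return FFMPEG_LOG_LEVELS[name.lower()]
    except KeyError:
        valid = ", ".join(FFMPEG_LOG_LEVELS)
        raise ValueError(f"unknown log level {name!r}; expected one of: {valid}") from None


# With TorchAudio installed, debug logging could then be enabled with:
#   from torchaudio.utils import ffmpeg_utils
#   ffmpeg_utils.set_log_level(ffmpeg_log_level("debug"))
print(ffmpeg_log_level("debug"))
```

Raising the level only for the failing call and lowering it afterwards keeps the rest of the session readable, since the debug level is quite chatty.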
import torch
import torchaudio
print(torch.__version__)
print(torchaudio.__version__)
import io
import time
import matplotlib.pyplot as plt
from IPython.display import Video
from torchaudio.io import StreamReader, StreamWriter
2.4.0
2.4.0
Check the prerequisites
First, we check that TorchAudio correctly detects FFmpeg libraries that support HW decoder/encoder.
from torchaudio.utils import ffmpeg_utils
FFmpeg Library versions:
libavcodec: 60.3.100
libavdevice: 60.1.100
libavfilter: 9.3.100
libavformat: 60.3.100
libavutil: 58.2.100
Available NVENC Encoders:
- av1_nvenc
- h264_nvenc
- hevc_nvenc
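The listing above has the shape one would get by formatting the library versions and filtering the encoder list for NVENC entries. The following is a minimal sketch of that formatting, assuming ffmpeg_utils.get_versions() returns a mapping from library name to a version tuple and ffmpeg_utils.get_video_encoders() returns a mapping keyed by encoder name. format_versions and nvenc_encoders are hypothetical helpers, and the sample data stands in for the real query results so that the snippet runs without FFmpeg.

```python
from typing import Dict, Iterable, List, Tuple


def format_versions(versions: Dict[str, Tuple[int, ...]]) -> List[str]:
    """Render each library version tuple as "name: major.minor.micro"."""
    return [f"{name}: {'.'.join(str(v) for v in ver)}" for name, ver in versions.items()]


def nvenc_encoders(encoder_names: Iterable[str]) -> List[str]:
    """Keep only the encoder names that refer to NVENC, sorted alphabetically."""
    return sorted(name for name in encoder_names if "nvenc" in name)


# Sample data: the versions are copied from the listing above; "libx264" and
# "mpeg4" are made-up non-NVENC entries added only to show the filtering.
sample_versions = {"libavcodec": (60, 3, 100), "libavutil": (58, 2, 100)}
sample_encoders = ["libx264", "h264_nvenc", "av1_nvenc", "mpeg4", "hevc_nvenc"]

print("FFmpeg Library versions:")
for line in format_versions(sample_versions):
    print(line)

print("Available NVENC Encoders:")
for name in nvenc_encoders(sample_encoders):
    print(f"- {name}")

# With TorchAudio installed, the real inputs would presumably be:
#   format_versions(ffmpeg_utils.get_versions())
#   nvenc_encoders(ffmpeg_utils.get_video_encoders().keys())
```

If the NVENC list comes back empty, the FFmpeg build in use most likely lacks HW acceleration, and the encoding steps that follow should not be expected to work.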
print("Available GPU:")
print(torch.cuda.get_device_properties(0))
Available GPU:
_CudaDeviceProperties(name='NVIDIA A10G', major=8, minor=6, total_memory=22502MB, multi_processor_count=80)
We use the following helper function to generate test frame data. For details on synthetic video generation, please refer to StreamReader Advanced Usage.
def get_data(height, width, format="yuv444p", frame_rate=30000 / 1001, duration=4):
    src = f"testsrc2=rate={frame_rate}:size={width}x{height}:duration={duration}"
    s = StreamReader(src=src, format="lavfi")
    s.add_basic_video_stream(-1, format=format)
    s.process_all_packets()
    (video,) = s.pop_chunks()
    return video
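To get a feel for how much data this helper produces, the sketch below estimates the shape and size of the returned chunk. It assumes the yuv444p output is a uint8 tensor laid out as (frames, 3, height, width) and that the frame count is roughly frame_rate × duration; the actual count may differ by a frame. expected_chunk_shape and chunk_nbytes are hypothetical helpers, and the 640x360 size is just an example.

```python
def expected_chunk_shape(height, width, frame_rate=30000 / 1001, duration=4):
    """Estimate the (frames, channels, height, width) shape of a yuv444p chunk."""
    # Estimate only: the source filter may emit one frame more or fewer.
    num_frames = round(frame_rate * duration)
    # yuv444p carries three full-resolution planes (Y, U, V).
    return (num_frames, 3, height, width)


def chunk_nbytes(shape):
    """Size in bytes of a uint8 tensor with the given shape (one byte per element)."""
    total = 1
    for dim in shape:
        total *= dim
    return total


shape = expected_chunk_shape(height=360, width=640)
print(shape, f"{chunk_nbytes(shape) / 2**20:.1f} MiB")
```

Comparing the estimate against video.shape after calling get_data() is a quick way to confirm that the synthetic source produced what was intended.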
Encoding videos with NVENC
To use the HW video encoder, you need to specify it when defining the output video stream, by providing the encoder option to add_video_stream().
pict_config = {
    "height": 360,
    "width": 640,
    "frame_rate": 30000 / 1001,
    "format": "yuv444p",
}
frame_data = get_data(**pict_config)
w = StreamWriter(io.BytesIO(), format="mp4")
w.add_video_stream(**pict_config, encoder="h264_nvenc", encoder_format="yuv444p")
with w.open():
    w.write_video_chunk(0, frame_data)
Similar to the HW decoder, the encoder by default expects the frame data to be in CPU memory. To send data from CUDA memory, you need to specify the hw_accel option.
buffer = io.BytesIO()
w = StreamWriter(buffer, format="mp4")
w.add_video_stream(**pict_config, encoder="h264_nvenc", encoder_format="yuv444p", hw_accel="cuda:0")
with w.open():
    w.write_video_chunk(0, frame_data.to(torch.device("cuda:0")))
buffer.seek(0)
video_cuda = buffer.read()
Video(video_cuda, embed=True, mimetype="video/mp4")
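Before embedding the result, it can be useful to confirm that the encoder actually wrote a container. The following is a minimal sketch, assuming the MP4 muxer places the ftyp box at the start of the output, which is typical for ISO base media files but is not verified here. looks_like_mp4 is a hypothetical helper, and the header bytes are synthetic so that the snippet runs without a GPU.

```python
def looks_like_mp4(data: bytes) -> bool:
    """Return True if the bytes start with an ISO base media 'ftyp' box."""
    # Each box starts with a 4-byte big-endian size followed by a 4-byte type.
    return len(data) >= 8 and data[4:8] == b"ftyp"


# Synthetic header used only to exercise the helper; not real encoder output.
fake_header = (32).to_bytes(4, "big") + b"ftypisom" + bytes(20)

print(looks_like_mp4(fake_header))
print(looks_like_mp4(b""))
```

In the tutorial's flow, the check would be applied to the bytes read from the buffer, for example looks_like_mp4(video_cuda). An empty or malformed buffer usually points back to an encoder failure that the debug log can explain.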