.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "generated_examples/basic_example.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_generated_examples_basic_example.py:

========================================
Decoding a video with VideoDecoder
========================================

In this example, we'll learn how to decode a video using the
:class:`~torchcodec.decoders.VideoDecoder` class.

.. GENERATED FROM PYTHON SOURCE LINES 17-20

First, a bit of boilerplate: we'll download a video from the web and define a
plotting utility. You can ignore this part and jump straight to
:ref:`creating_decoder`.

.. GENERATED FROM PYTHON SOURCE LINES 20-54

.. code-block:: Python

    from typing import Optional
    import torch
    import requests

    # Video source: https://www.pexels.com/video/dog-eating-854132/
    # License: CC0. Author: Coverr.
    url = "https://videos.pexels.com/video-files/854132/854132-sd_640_360_25fps.mp4"
    response = requests.get(url, headers={"User-Agent": ""})
    if response.status_code != 200:
        raise RuntimeError(f"Failed to download video. {response.status_code = }.")

    raw_video_bytes = response.content


    def plot(frames: torch.Tensor, title: Optional[str] = None):
        try:
            from torchvision.utils import make_grid
            from torchvision.transforms.v2.functional import to_pil_image
            import matplotlib.pyplot as plt
        except ImportError:
            print("Cannot plot, please run `pip install torchvision matplotlib`")
            return

        plt.rcParams["savefig.bbox"] = "tight"
        fig, ax = plt.subplots()
        ax.imshow(to_pil_image(make_grid(frames)))
        ax.set(xticklabels=[], yticklabels=[], xticks=[], yticks=[])
        if title is not None:
            ax.set_title(title)
        plt.tight_layout()

.. GENERATED FROM PYTHON SOURCE LINES 55-63

.. _creating_decoder:

Creating a decoder
------------------

We can now create a decoder from the raw (encoded) video bytes. You can of
course use a local video file and pass the path as input, rather than
downloading a video.

.. GENERATED FROM PYTHON SOURCE LINES 63-68

.. code-block:: Python

    from torchcodec.decoders import VideoDecoder

    # You can also pass a path to a local file!
    decoder = VideoDecoder(raw_video_bytes)

.. GENERATED FROM PYTHON SOURCE LINES 69-72

The video has not yet been decoded, but we already have access to some
metadata via the ``metadata`` attribute, which is a
:class:`~torchcodec.decoders.VideoStreamMetadata` object.

.. GENERATED FROM PYTHON SOURCE LINES 72-74

.. code-block:: Python

    print(decoder.metadata)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    VideoStreamMetadata:
      num_frames: 345
      duration_seconds: 13.8
      average_fps: 25.0
      duration_seconds_from_header: 13.8
      bit_rate: 505790.0
      num_frames_from_header: 345
      num_frames_from_content: 345
      begin_stream_seconds: 0.0
      end_stream_seconds: 13.8
      codec: h264
      width: 640
      height: 360
      average_fps_from_header: 25.0
      stream_index: 0

.. GENERATED FROM PYTHON SOURCE LINES 75-77

Decoding frames by indexing the decoder
---------------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 77-86

.. code-block:: Python

    first_frame = decoder[0]  # using a single int index
    every_twenty_frame = decoder[0 : -1 : 20]  # using slices

    print(f"{first_frame.shape = }")
    print(f"{first_frame.dtype = }")
    print(f"{every_twenty_frame.shape = }")
    print(f"{every_twenty_frame.dtype = }")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    first_frame.shape = torch.Size([3, 360, 640])
    first_frame.dtype = torch.uint8
    every_twenty_frame.shape = torch.Size([18, 3, 360, 640])
    every_twenty_frame.dtype = torch.uint8

.. GENERATED FROM PYTHON SOURCE LINES 87-96

Indexing the decoder returns the frames as :class:`torch.Tensor` objects. By
default, the shape of the frames is ``(N, C, H, W)``, where N is the batch
size, C is the number of channels, H is the height, and W is the width of the
frames. The batch dimension N is only present when we're decoding more than
one frame. The dimension order can be changed to ``N, H, W, C`` using the
``dimension_order`` parameter of :class:`~torchcodec.decoders.VideoDecoder`.
Frames are always of ``torch.uint8`` dtype.
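As a minimal sketch of decoding in channels-last order: we assume here that
``"NHWC"`` is the accepted spelling of the value for ``dimension_order``;
check the :class:`~torchcodec.decoders.VideoDecoder` documentation for the
exact accepted values.

.. code-block:: Python

    # Sketch: decode channels-last frames. The "NHWC" value is an assumption;
    # consult the VideoDecoder docs for the accepted spellings.
    nhwc_decoder = VideoDecoder(raw_video_bytes, dimension_order="NHWC")
    print(nhwc_decoder[0].shape)  # expected: torch.Size([360, 640, 3])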
.. GENERATED FROM PYTHON SOURCE LINES 96-99

.. code-block:: Python

    plot(first_frame, "First frame")

.. image-sg:: /generated_examples/images/sphx_glr_basic_example_001.png
   :alt: First frame
   :srcset: /generated_examples/images/sphx_glr_basic_example_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 100-102

.. code-block:: Python

    plot(every_twenty_frame, "Every 20th frame")

.. image-sg:: /generated_examples/images/sphx_glr_basic_example_002.png
   :alt: Every 20th frame
   :srcset: /generated_examples/images/sphx_glr_basic_example_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 103-107

Iterating over frames
---------------------

The decoder is a normal iterable object and can be iterated over like so:

.. GENERATED FROM PYTHON SOURCE LINES 107-114

.. code-block:: Python

    for frame in decoder:
        assert (
            isinstance(frame, torch.Tensor)
            and frame.shape == (3, decoder.metadata.height, decoder.metadata.width)
        )

.. GENERATED FROM PYTHON SOURCE LINES 115-126

Retrieving pts and duration of frames
-------------------------------------

Indexing the decoder returns plain :class:`torch.Tensor` objects. Sometimes,
it can be useful to retrieve additional information about the frames, such as
their :term:`pts` (Presentation Time Stamp) and their duration. This can be
achieved using the :meth:`~torchcodec.decoders.VideoDecoder.get_frame_at` and
:meth:`~torchcodec.decoders.VideoDecoder.get_frames_at` methods, which return
a :class:`~torchcodec.Frame` and a :class:`~torchcodec.FrameBatch` object,
respectively.

.. GENERATED FROM PYTHON SOURCE LINES 126-131

.. code-block:: Python

    last_frame = decoder.get_frame_at(len(decoder) - 1)
    print(f"{type(last_frame) = }")
    print(last_frame)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    type(last_frame) = <class 'torchcodec._frame.Frame'>
    Frame:
      data (shape): torch.Size([3, 360, 640])
      pts_seconds: 13.76
      duration_seconds: 0.04

.. GENERATED FROM PYTHON SOURCE LINES 132-136

.. code-block:: Python

    other_frames = decoder.get_frames_at([10, 0, 50])
    print(f"{type(other_frames) = }")
    print(other_frames)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    type(other_frames) = <class 'torchcodec._frame.FrameBatch'>
    FrameBatch:
      data (shape): torch.Size([3, 3, 360, 640])
      pts_seconds: tensor([0.4000, 0.0000, 2.0000], dtype=torch.float64)
      duration_seconds: tensor([0.0400, 0.0400, 0.0400], dtype=torch.float64)

.. GENERATED FROM PYTHON SOURCE LINES 137-140

.. code-block:: Python

    plot(last_frame.data, "Last frame")
    plot(other_frames.data, "Other frames")

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /generated_examples/images/sphx_glr_basic_example_003.png
         :alt: Last frame
         :srcset: /generated_examples/images/sphx_glr_basic_example_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /generated_examples/images/sphx_glr_basic_example_004.png
         :alt: Other frames
         :srcset: /generated_examples/images/sphx_glr_basic_example_004.png
         :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 141-147

Both :class:`~torchcodec.Frame` and :class:`~torchcodec.FrameBatch` have a
``data`` field, which contains the decoded tensor data. They also have
``pts_seconds`` and ``duration_seconds`` fields, which are single floats for
:class:`~torchcodec.Frame` and 1-D :class:`torch.Tensor`\ s for
:class:`~torchcodec.FrameBatch` (one value per frame in the batch).
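As a quick illustration using the objects created above, these fields can be
read directly as attributes (the names below match the printed output; the
exact printed values are taken from the runs shown earlier):

.. code-block:: Python

    # Frame fields are scalars; FrameBatch fields are 1-D tensors.
    print(last_frame.pts_seconds)        # 13.76 (a float)
    print(last_frame.duration_seconds)   # 0.04 (a float)
    print(other_frames.pts_seconds)      # 1-D float64 tensor, one value per frame
    print(other_frames.data.shape)       # torch.Size([3, 3, 360, 640])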
.. GENERATED FROM PYTHON SOURCE LINES 149-158

Using time-based indexing
-------------------------

So far, we have retrieved frames based on their index. We can also retrieve
frames based on *when* they are played, with
:meth:`~torchcodec.decoders.VideoDecoder.get_frame_played_at` and
:meth:`~torchcodec.decoders.VideoDecoder.get_frames_played_at`, which also
return a :class:`~torchcodec.Frame` and a :class:`~torchcodec.FrameBatch`,
respectively.

.. GENERATED FROM PYTHON SOURCE LINES 158-163

.. code-block:: Python

    frame_at_2_seconds = decoder.get_frame_played_at(seconds=2)
    print(f"{type(frame_at_2_seconds) = }")
    print(frame_at_2_seconds)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    type(frame_at_2_seconds) = <class 'torchcodec._frame.Frame'>
    Frame:
      data (shape): torch.Size([3, 360, 640])
      pts_seconds: 2.0
      duration_seconds: 0.04

.. GENERATED FROM PYTHON SOURCE LINES 164-168

.. code-block:: Python

    other_frames = decoder.get_frames_played_at(seconds=[10.1, 0.3, 5])
    print(f"{type(other_frames) = }")
    print(other_frames)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    type(other_frames) = <class 'torchcodec._frame.FrameBatch'>
    FrameBatch:
      data (shape): torch.Size([3, 3, 360, 640])
      pts_seconds: tensor([10.0800, 0.2800, 5.0000], dtype=torch.float64)
      duration_seconds: tensor([0.0400, 0.0400, 0.0400], dtype=torch.float64)

.. GENERATED FROM PYTHON SOURCE LINES 169-171

.. code-block:: Python

    plot(frame_at_2_seconds.data, "Frame played at 2 seconds")
    plot(other_frames.data, "Other frames")

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /generated_examples/images/sphx_glr_basic_example_005.png
         :alt: Frame played at 2 seconds
         :srcset: /generated_examples/images/sphx_glr_basic_example_005.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /generated_examples/images/sphx_glr_basic_example_006.png
         :alt: Other frames
         :srcset: /generated_examples/images/sphx_glr_basic_example_006.png
         :class: sphx-glr-multi-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.146 seconds)


.. _sphx_glr_download_generated_examples_basic_example.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: basic_example.ipynb <basic_example.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: basic_example.py <basic_example.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: basic_example.zip <basic_example.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_