Video models #890

Merged 69 commits on Feb 6, 2025

dreadatour (Contributor) commented Feb 3, 2025

An alternative approach to implementing video models, based on this comment. It looks much cleaner.

New VideoFile model

class VideoFile(File):
    """
    A data model for handling video files.

    This model inherits from the `File` model and provides additional functionality
    for reading video files, extracting video frames, and splitting videos into
    fragments.
    """

    def get_info(self) -> "Video":
        """
        Retrieves metadata and information about the video file.

        Returns:
            Video: A Model containing video metadata such as duration,
                   resolution, frame rate, and codec details.
        """

    def get_frame(self, frame: int) -> "VideoFrame":
        """
        Returns a specific video frame by its frame number.

        Args:
            frame (int): The frame number to read.

        Returns:
            VideoFrame: Video frame model.
        """

    def get_frames(
        self,
        start: int = 0,
        end: Optional[int] = None,
        step: int = 1,
    ) -> "Iterator[VideoFrame]":
        """
        Returns video frames from the specified range in the video.

        Args:
            start (int): The starting frame number (default: 0).
            end (int, optional): The ending frame number (exclusive). If None,
                                 frames are read until the end of the video
                                 (default: None).
            step (int): The interval between frames to read (default: 1).

        Returns:
            Iterator[VideoFrame]: An iterator yielding video frames.

        Note:
            If `end` is not specified, the number of frames is read from the
            video file, which means the video file must be downloaded.
        """

    def get_fragment(self, start: float, end: float) -> "VideoFragment":
        """
        Returns a video fragment from the specified time range.

        Args:
            start (float): The start time of the fragment in seconds.
            end (float): The end time of the fragment in seconds.

        Returns:
            VideoFragment: A Model representing the video fragment.
        """

    def get_fragments(
        self,
        duration: float,
        start: float = 0,
        end: Optional[float] = None,
    ) -> "Iterator[VideoFragment]":
        """
        Splits the video into multiple fragments of a specified duration.

        Args:
            duration (float): The duration of each video fragment in seconds.
            start (float): The starting time in seconds (default: 0).
            end (float, optional): The ending time in seconds. If None, the entire
                                   remaining video is processed (default: None).

        Returns:
            Iterator[VideoFragment]: An iterator yielding video fragments.

        Note:
            If `end` is not specified, the video duration is read from the
            video file, which means the video file must be downloaded.
        """
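
The range semantics documented for `get_frames` and `get_fragments` can be sketched in plain Python. This is a minimal illustration, not the code from this PR: `frame_numbers`, `fragment_bounds`, `total_frames` and `total_duration` are hypothetical names, and clamping `end` to the video length and shortening the last fragment are assumptions.

```python
from typing import Iterator, Optional, Tuple


def frame_numbers(
    total_frames: int, start: int = 0, end: Optional[int] = None, step: int = 1
) -> Iterator[int]:
    """Yield frame numbers in [start, end) using the given step."""
    if start < 0 or step < 1:
        raise ValueError("start must be >= 0 and step must be >= 1")
    # When `end` is omitted, fall back to the frame count read from the file.
    # Assumption: an explicit `end` is clamped to the frame count.
    stop = total_frames if end is None else min(end, total_frames)
    yield from range(start, stop, step)


def fragment_bounds(
    total_duration: float,
    duration: float,
    start: float = 0.0,
    end: Optional[float] = None,
) -> Iterator[Tuple[float, float]]:
    """Yield (start, end) pairs in seconds that cover [start, end)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    # When `end` is omitted, fall back to the duration read from the file.
    stop = total_duration if end is None else min(end, total_duration)
    index = 0
    while start + index * duration < stop:
        lo = start + index * duration
        # Assumption: the last fragment is shortened to fit the range.
        yield lo, min(lo + duration, stop)
        index += 1
```

In both helpers the only value that requires opening the video is the fallback used when `end` is omitted, which matches the notes in the docstrings above.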

New VideoFrame model

A VideoFrame can be created without downloading the video file, because it is a "virtual" frame: the original VideoFile plus a frame number.

If a physical frame image is needed, call the save method, which uploads the frame image to storage and returns a new ImageFile model.

API:

class VideoFrame(DataModel):
    """
    A data model for representing a video frame.

    This model references a `VideoFile` and adds a `frame` attribute,
    which identifies a specific frame within that video file. It allows access
    to individual frames and provides functionality for reading and saving
    video frames as image files.

    Attributes:
        video (VideoFile): The video file containing the video frame.
        frame (int): The frame number referencing a specific frame in the video file.
    """

    video: VideoFile
    frame: int

    def get_np(self) -> "ndarray":
        """
        Returns a video frame from the video file as a NumPy array.

        Returns:
            ndarray: A NumPy array representing the video frame,
                     in the shape (height, width, channels).
        """

    def read_bytes(self, format: str = "jpg") -> bytes:
        """
        Returns a video frame from the video file as image bytes.

        Args:
            format (str): The desired image format (e.g., 'jpg', 'png').
                          Defaults to 'jpg'.

        Returns:
            bytes: The encoded video frame as image bytes.
        """

    def save(self, output: str, format: str = "jpg") -> "ImageFile":
        """
        Saves the current video frame as an image file.

        If `output` is a remote path, the image file will be uploaded to remote storage.

        Args:
            output (str): The destination path, which can be a local file path
                          or a remote URL.
            format (str): The image format (e.g., 'jpg', 'png'). Defaults to 'jpg'.

        Returns:
            ImageFile: A Model representing the saved image file.
        """
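
The "virtual" frame idea can be illustrated with a small self-contained sketch. It does not use the models from this PR: `LazyFrame`, `Decoder`, `fake_decoder` and the example path are hypothetical stand-ins, and the decoder returns a nested list instead of a NumPy array so the example has no dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical decoder type: takes (video path, frame number), returns pixels.
Decoder = Callable[[str, int], List[List[int]]]


@dataclass(frozen=True)
class LazyFrame:
    """Reference to one frame: the source video plus a frame number."""

    video_path: str
    frame: int

    def decode(self, decoder: Decoder) -> List[List[int]]:
        # The video is only touched here, not when the reference is created.
        return decoder(self.video_path, self.frame)


calls = []


def fake_decoder(path: str, frame: int) -> List[List[int]]:
    """Stand-in for a real decoder: records the call, returns a 2x2 image."""
    calls.append((path, frame))
    return [[frame, frame], [frame, frame]]


# Creating references is cheap: nothing is decoded yet.
frames = [LazyFrame("s3://bucket/video.mp4", n) for n in range(0, 100, 25)]
assert calls == []

# Decoding happens on demand, one frame at a time.
pixels = frames[2].decode(fake_decoder)
```

The point of the design is that a dataset of frame references can be built and filtered without reading any video data; the cost is paid only for the frames that are actually decoded or saved.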

New VideoFragment model

A VideoFragment can be created without downloading the video file, because it is a "virtual" fragment: the original video file plus start and end timestamps.

If a physical fragment video is needed, call the save method, which uploads the fragment video to storage and returns a new VideoFile model.

API:

class VideoFragment(DataModel):
    """
    A data model for representing a video fragment.

    This model references a `VideoFile` and adds `start`
    and `end` attributes, which identify a specific fragment within that video file.
    It allows access to individual fragments and provides functionality for reading
    and saving video fragments as separate video files.

    Attributes:
        video (VideoFile): The video file containing the video fragment.
        start (float): The starting time of the video fragment in seconds.
        end (float): The ending time of the video fragment in seconds.
    """

    video: VideoFile
    start: float
    end: float

    def save(self, output: str, format: Optional[str] = None) -> "VideoFile":
        """
        Saves the video fragment as a new video file.

        If `output` is a remote path, the video file will be uploaded to remote storage.

        Args:
            output (str): The destination path, which can be a local file path
                          or a remote URL.
            format (str, optional): The output video format (e.g., 'mp4', 'avi').
                                    If None, the format is inferred from the
                                    file extension.

        Returns:
            VideoFile: A Model representing the saved video file.
        """
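
A fragment reference can be validated and inspected before any video data is read. The sketch below is an illustration only: `FragmentRef`, its validation rules and the `frame_range` helper are hypothetical and are not part of the API described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentRef:
    """Hypothetical stand-in for a virtual fragment: path plus time range."""

    video_path: str
    start: float
    end: float

    def __post_init__(self) -> None:
        # Reject ranges that cannot describe a playable fragment.
        if self.start < 0 or self.end <= self.start:
            raise ValueError(f"invalid time range: {self.start}..{self.end}")

    @property
    def duration(self) -> float:
        """Length of the fragment in seconds."""
        return self.end - self.start

    def frame_range(self, fps: float) -> range:
        """Frame numbers covered by the fragment at the given frame rate."""
        return range(int(self.start * fps), int(self.end * fps))
```

Checking the time range at construction time means a bad fragment is reported before any expensive cutting or uploading starts.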

New Video model

Video file meta information.

class Video(DataModel):
    """
    A data model representing metadata for a video file.

    Attributes:
        width (int): The width of the video in pixels. Defaults to -1 if unknown.
        height (int): The height of the video in pixels. Defaults to -1 if unknown.
        fps (float): The frame rate of the video (frames per second).
                     Defaults to -1.0 if unknown.
        duration (float): The total duration of the video in seconds.
                          Defaults to -1.0 if unknown.
        frames (int): The total number of frames in the video.
                      Defaults to -1 if unknown.
        format (str): The format of the video file (e.g., 'mp4', 'avi').
                      Defaults to an empty string.
        codec (str): The codec used for encoding the video. Defaults to an empty string.
    """

    width: int = Field(default=-1)
    height: int = Field(default=-1)
    fps: float = Field(default=-1.0)
    duration: float = Field(default=-1.0)
    frames: int = Field(default=-1)
    format: str = Field(default="")
    codec: str = Field(default="")
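
Because unknown values are stored as negative sentinels rather than `None`, callers have to compare against zero explicitly. The sketch below mirrors the fields above with a plain dataclass (`VideoMeta`, used instead of the real model to keep the example dependency-free) and adds a hypothetical `frame_count` helper; estimating the count from duration and frame rate is an assumption, not behaviour stated in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoMeta:
    """Stand-in mirroring the fields and sentinel defaults of the Video model."""

    width: int = -1
    height: int = -1
    fps: float = -1.0
    duration: float = -1.0
    frames: int = -1
    format: str = ""
    codec: str = ""


def frame_count(meta: VideoMeta) -> Optional[int]:
    """Return the frame count, estimating it from duration and fps if needed."""
    if meta.frames >= 0:
        return meta.frames
    # Sentinels are negative, so a plain truthiness check would treat -1 as known.
    if meta.duration >= 0 and meta.fps > 0:
        return round(meta.duration * meta.fps)
    return None
```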

dreadatour and others added 30 commits January 13, 2025 23:48
* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.8.6 → v0.9.1](astral-sh/ruff-pre-commit@v0.8.6...v0.9.1)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [ultralytics](https://github.com/ultralytics/ultralytics) from 8.3.58 to 8.3.61.
- [Release notes](https://github.com/ultralytics/ultralytics/releases)
- [Commits](ultralytics/ultralytics@v8.3.58...v8.3.61)

---
updated-dependencies:
- dependency-name: ultralytics
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Review help/usage for cli commands

The pattern followed is:
- Descriptions: Complete sentences with periods
- Help messages: Concise phrases without periods
- Consistent terminology ("Iterative Studio")
- Clear, standardized format for similar arguments

* Bring uniformity for Studio mention

* Override default command failure

* Remove datasets from studio

* Fix anon message and remove edatachain message

* dirs to directories

* Remove studio dataset test
* prefetching: remove prefetched item after use in udf

This PR removes the prefetched item after use in the UDF.
This is enabled by default on `prefetch>0`, unless `cache=True` is set in the UDF, in
which case the prefetched item is not removed.

For pytorch dataloader, this is not enabled by default, but can be enabled by setting
`remove_prefetched=True` in the `PytorchDataset` class.
This is because the dataset can be used for multiple epochs, and removing the
prefetched item after use would force it to be downloaded again in the next epoch.

The exposed `remove_prefetched=True|False` setting could be renamed to
something better. Feedback is welcome.

* close iterable properly
* added main logic for outer join

* fixing filters

* removing datasetquery tests and added more datachain unit tests
If usearch fails to download the extension, it will keep retrying in the
future. This adds significant cost - for example, in `tests/func/test_pytorch.py`
run, it was invoked 111 times, taking ~30 seconds in total.

Now, we cache the return value for the whole session.
Added `isnone()` function
* move tests using cloud_test_catalog into func directory

* move tests using tmpfile catalog

* move long running tests that read/write from disk
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.9.1 → v0.9.2](astral-sh/ruff-pre-commit@v0.9.1...v0.9.2)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [ultralytics](https://github.com/ultralytics/ultralytics) from 8.3.61 to 8.3.64.
- [Release notes](https://github.com/ultralytics/ultralytics/releases)
- [Commits](ultralytics/ultralytics@v8.3.61...v8.3.64)

---
updated-dependencies:
- dependency-name: ultralytics
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.22 to 9.5.50.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](squidfunk/mkdocs-material@9.5.22...9.5.50)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@dreadatour dreadatour requested review from shcheklein, dmpetrov, mattseddon and a team February 3, 2025 18:36
@dreadatour dreadatour self-assigned this Feb 3, 2025
codecov bot commented Feb 3, 2025

Codecov Report

Attention: Patch coverage is 87.50000% with 19 lines in your changes missing coverage. Please review.

Project coverage is 87.72%. Comparing base (b95cc76) to head (94d363c).
Report is 1 commits behind head on main.

Files with missing lines      Patch %   Lines
src/datachain/lib/video.py    79.51%    11 Missing and 6 partials ⚠️
src/datachain/lib/file.py     97.10%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main     #890    +/-   ##
========================================
  Coverage   87.72%   87.72%            
========================================
  Files         129      130     +1     
  Lines       11492    11641   +149     
  Branches     1554     1579    +25     
========================================
+ Hits        10081    10212   +131     
- Misses       1022     1033    +11     
- Partials      389      396     +7     
Flag Coverage Δ
datachain 87.64% <87.50%> (+<0.01%) ⬆️

return video_info(self)

def get_frame(self, frame: int) -> "VideoFrame":
Member

Minor, but should these be to_ methods to match the DataChain class?

Contributor Author

Minor, but should these be to_ methods to match the DataChain class?

Looks reasonable 🤔 Although it is not a direct conversion ("to") but rather extracting part of one file into another file: "get frame from video" looks good to me, while "video to frame" looks odd. What do you think? I don't have a strong opinion on this 🤔

@dreadatour dreadatour marked this pull request as draft February 4, 2025 05:32
cloudflare-workers-and-pages bot commented Feb 4, 2025

Deploying datachain-documentation with Cloudflare Pages

Latest commit: 94d363c
Status: ✅  Deploy successful!
Preview URL: https://a1a29e21.datachain-documentation.pages.dev
Branch Preview URL: https://video-models-2.datachain-documentation.pages.dev


@pytest.fixture(autouse=True)
def video_file(catalog) -> VideoFile:
Member

[C] Some of these are probably func tests as they are writing/reading to/from disk.

return video_frame_bytes(self, format)

def save(self, output: str, format: str = "jpg") -> "ImageFile":
Member

[C] Now that we have virtual models, we could add a video example in this repo.

if len(video_streams) == 0:
    raise FileError(file, "no video streams found in video file")

video_stream = video_streams[0]
Member

[Q] Why take the first one? Is there only ever one?

Contributor Author

Some video container formats (such as MPEG-4) support multiple video, audio, and subtitle streams. However, in practice, video files with multiple video streams are extremely rare. Most applications, including video players and streaming platforms, are not designed to handle multiple video streams because there is little practical use for them.

I have heard of a few specialized use cases, such as movies with different aspect ratios or sports event recordings featuring multiple camera angles. However, these are rare exceptions, and I have never encountered them in real-world scenarios. In most cases, it is much simpler to provide multiple separate video files and allow users to download or stream only the one they need.
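
The "take the first video stream" choice discussed here can be sketched as follows. This is an illustration under assumptions: streams are modelled as dictionaries with a `codec_type` key (the shape used in ffprobe's JSON output), `first_video_stream` is a hypothetical helper, and `ValueError` stands in for the `FileError` raised in the PR.

```python
from typing import Dict, List


def first_video_stream(streams: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the first video stream, ignoring audio and subtitle streams."""
    video_streams = [s for s in streams if s.get("codec_type") == "video"]
    if not video_streams:
        # Stand-in for the FileError used in the reviewed code.
        raise ValueError("no video streams found in video file")
    # Files with several video streams are rare; any extra ones are ignored.
    return video_streams[0]


streams = [
    {"codec_type": "audio", "codec_name": "aac"},
    {"codec_type": "video", "codec_name": "h264"},
    {"codec_type": "subtitle", "codec_name": "mov_text"},
]
selected = first_video_stream(streams)
```

If multi-stream files ever need support, a stream index parameter could be added to such a helper without changing the default behaviour.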

@dreadatour dreadatour changed the title Video models (take 2) Video models Feb 6, 2025
@dreadatour dreadatour marked this pull request as ready for review February 6, 2025 16:04
@dreadatour dreadatour merged commit 86fc806 into main Feb 6, 2025
36 of 37 checks passed
@dreadatour dreadatour deleted the video-models-2 branch February 6, 2025 16:32
9 participants