Support Video file and Video clip, Video frame models and operations with them #797

Open
shcheklein opened this issue Jan 7, 2025 · 2 comments · May be fixed by #814

@shcheklein

  • It should include methods like:
    • getting metadata
    • split into subvideos
    • split into frames

We need to enable visualizations - e.g. when we have a frame we want to open the video on that frame and show bounding boxes, etc.

@dmpetrov has some examples in one of the teams, @dreadatour has already done some of this, and we have a video pose detection tutorial in datachain-examples. We need to consolidate these into proper APIs with Studio support.

APIs should work in streaming mode.

@shcheklein shcheklein changed the title Support Video file and Video clip models and operations with them Support Video file and Video clip, Video frame models and operations with them Jan 7, 2025
@shcheklein

The first step in this ticket is to describe the API.

@dreadatour dreadatour self-assigned this Jan 7, 2025
@dreadatour dreadatour linked a pull request Jan 13, 2025 that will close this issue
3 tasks
@dreadatour

API

Already implemented here: #814

I tried to describe the API without an implementation, but that was quite hard without real usage examples.

Video Meta

class VideoFile(File):
    """`DataModel` for reading video files."""


class VideoMeta(DataModel):
    """`DataModel` for video file meta information."""

    width: int
    height: int
    fps: float
    duration: float
    frames_count: int
    codec: str


def video_meta(file: "VideoFile") -> VideoMeta:
    """
    Returns video file meta information.

    Args:
        file (VideoFile): VideoFile object.

    Returns:
        VideoMeta: Video file meta information.
    """

Usage example:

from datachain import DataChain
from datachain.lib.video import video_meta

DataChain.from_dataset("videos").map(meta=video_meta).save("videos-meta")
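
The fields of `VideoMeta` are related: for a constant-frame-rate video, `frames_count` should be close to `fps * duration`. The sketch below is only an illustration of that relationship. `VideoMetaSketch`, `expected_frames_count`, and `is_consistent` are hypothetical names, a plain dataclass stands in for the proposed `DataModel`, and nothing here is part of the implementation in #814.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoMetaSketch:
    """Plain-dataclass stand-in for the proposed VideoMeta (hypothetical)."""

    width: int
    height: int
    fps: float
    duration: float  # assumed to be in seconds
    frames_count: int
    codec: str


def expected_frames_count(fps: float, duration: float) -> int:
    """Estimate the frame count of a constant-frame-rate video."""
    if fps <= 0 or duration < 0:
        raise ValueError("fps must be positive and duration non-negative")
    return round(fps * duration)


def is_consistent(meta: VideoMetaSketch, tolerance: int = 1) -> bool:
    """Check that frames_count roughly matches fps * duration."""
    expected = expected_frames_count(meta.fps, meta.duration)
    return abs(meta.frames_count - expected) <= tolerance


meta = VideoMetaSketch(
    width=1920, height=1080, fps=30.0, duration=10.0, frames_count=300, codec="h264"
)
print(is_consistent(meta))
```

A check like this could be useful when filtering a dataset for truncated or variable-frame-rate files, where the reported values tend to disagree.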

Get video frame

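# Imports needed by the signatures in this and the following sections.
import pathlib
from typing import Optional, Union


def video_frame_np(file: "VideoFile", frame: int) -> "ndarray":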
    """
    Reads video frame from a file.

    Args:
        file (VideoFile): VideoFile object.
        frame (int): Frame number to read.

    Returns:
        ndarray: Video frame.
    """


def video_frame(file: "VideoFile", frame: int, format: str = "jpeg") -> bytes:
    """
    Reads video frame from a file and returns as image bytes.

    Args:
        file (VideoFile): VideoFile object.
        frame (int): Frame number to read.
        format (str): Image format (default: 'jpeg').

    Returns:
        bytes: Video frame image as bytes.
    """


def save_video_frame(
    file: "VideoFile",
    frame: int,
    output_file: Union[str, pathlib.Path],
    format: str = "jpeg",
) -> None:
    """
    Saves video frame as an image file.

    Args:
        file (VideoFile): VideoFile object.
        frame (int): Frame number to read.
        output_file (Union[str, pathlib.Path]): Output file path.
        format (str): Image format (default: 'jpeg').
    """
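
Since these functions address frames by number, two small conversions are likely to come up: mapping a frame number to a timestamp (for example, to open the video at that frame in a viewer) and naming the image file for a saved frame. The sketch below shows one possible approach. Both helper names and the `<stem>_<frame>.<format>` naming scheme are assumptions for illustration, not part of the proposed API.

```python
from pathlib import Path
from typing import Union


def frame_to_seconds(frame: int, fps: float) -> float:
    """Convert a zero-based frame number to a timestamp in seconds."""
    if frame < 0:
        raise ValueError("frame must be non-negative")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame / fps


def frame_output_path(
    video_path: Union[str, Path],
    frame: int,
    output_dir: Union[str, Path],
    format: str = "jpeg",
) -> Path:
    """Build an output path for a frame image (hypothetical naming scheme)."""
    stem = Path(video_path).stem
    # Zero-pad the frame number so that files sort in frame order.
    return Path(output_dir) / f"{stem}_{frame:06d}.{format}"


print(frame_to_seconds(90, fps=30.0))
print(frame_output_path("videos/cat.mp4", 42, "out"))
```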

Get video frames

def video_frames_np(
    file: "VideoFile",
    start_frame: int = 0,
    end_frame: Optional[int] = None,
    step: int = 1,
) -> "Iterator[ndarray]":
    """
    Reads video frames from a file.

    Args:
        file (VideoFile): VideoFile object.
        start_frame (int): Frame number to start reading from (default: 0).
        end_frame (Optional[int]): Frame number to stop reading at (default: None).
        step (int): Step size for reading frames (default: 1).

    Returns:
        Iterator[ndarray]: Iterator of video frames.
    """


def video_frames(
    file: "VideoFile",
    start_frame: int = 0,
    end_frame: Optional[int] = None,
    step: int = 1,
    format: str = "jpeg",
) -> "Iterator[bytes]":
    """
    Reads video frames from a file and returns as bytes.

    Args:
        file (VideoFile): VideoFile object.
        start_frame (int): Frame number to start reading from (default: 0).
        end_frame (Optional[int]): Frame number to stop reading at (default: None).
        step (int): Step size for reading frames (default: 1).
        format (str): Image format (default: 'jpeg').

    Returns:
        Iterator[bytes]: Iterator of video frames.
    """


def save_video_frames(
    file: "VideoFile",
    output_dir: Union[str, pathlib.Path],
    start_frame: int = 0,
    end_frame: Optional[int] = None,
    step: int = 1,
    format: str = "jpeg",
) -> "Iterator[str]":
    """
    Saves video frames as image files.

    Args:
        file (VideoFile): VideoFile object.
        output_dir (Union[str, pathlib.Path]): Output directory path.
        start_frame (int): Frame number to start reading from (default: 0).
        end_frame (Optional[int]): Frame number to stop reading at (default: None).
        step (int): Step size for reading frames (default: 1).
        format (str): Image format (default: 'jpeg').

    Returns:
        Iterator[str]: Iterator of output file paths.
    """
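
The `start_frame`, `end_frame`, and `step` parameters behave much like a slice over a lazy stream of frames, which fits the requirement that the APIs work in streaming mode. Below is a minimal sketch of that idea using `itertools.islice`. It assumes `end_frame` is exclusive (as in `range`), which the docstrings above do not specify, and `select_frames` and `fake_decoder` are hypothetical names. A real implementation would probably seek to `start_frame` instead of decoding and discarding the frames before it.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def select_frames(
    frames: Iterable[T],
    start_frame: int = 0,
    end_frame: Optional[int] = None,
    step: int = 1,
) -> Iterator[T]:
    """Lazily pick frames start_frame, start_frame + step, ... below end_frame."""
    if start_frame < 0:
        raise ValueError("start_frame must be non-negative")
    if end_frame is not None and end_frame < start_frame:
        raise ValueError("end_frame must not be less than start_frame")
    if step < 1:
        raise ValueError("step must be at least 1")
    return islice(frames, start_frame, end_frame, step)


decoded: List[int] = []  # records which frames the fake decoder produced


def fake_decoder(total: int) -> Iterator[str]:
    """Stand-in for a video decoder that yields one frame at a time."""
    for index in range(total):
        decoded.append(index)
        yield f"frame-{index}"


selected = list(select_frames(fake_decoder(1000), start_frame=2, end_frame=9, step=3))
print(selected)
print(len(decoded))  # far fewer than 1000: decoding stops once end_frame is reached
```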

Video clips

def save_video_clip(
    file: "VideoFile",
    start_time: float,
    end_time: float,
    output_file: Union[str, pathlib.Path],
    codec: str = "libx264",
    audio_codec: str = "aac",
) -> None:
    """
    Saves video interval as a new video file.

    Args:
        file (VideoFile): VideoFile object.
        start_time (float): Start time in seconds.
        end_time (float): End time in seconds.
        output_file (Union[str, pathlib.Path]): Output file path.
        codec (str): Video codec for encoding (default: 'libx264').
        audio_codec (str): Audio codec for encoding (default: 'aac').
    """


def save_video_clips(
    file: "VideoFile",
    intervals: list[tuple[float, float]],
    output_dir: Union[str, pathlib.Path],
    codec: str = "libx264",
    audio_codec: str = "aac",
) -> "Iterator[str]":
    """
    Saves video intervals as new video files.

    Args:
        file (VideoFile): VideoFile object.
        intervals (list[tuple[float, float]]): List of start and end times in seconds.
        output_dir (Union[str, pathlib.Path]): Output directory path.
        codec (str): Video codec for encoding (default: 'libx264').
        audio_codec (str): Audio codec for encoding (default: 'aac').

    Returns:
        Iterator[str]: Iterator of output file paths.
    """
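
One way to cover the "split into subvideos" use case with `save_video_clips` is to generate fixed-length intervals from the video duration and pass them as `intervals`. The helper below is a hypothetical sketch of that step; `split_into_intervals` is not part of the proposed API, and the last interval is simply shortened to end at `duration`.

```python
from typing import List, Tuple


def split_into_intervals(duration: float, clip_length: float) -> List[Tuple[float, float]]:
    """Split [0, duration] into consecutive (start, end) intervals in seconds."""
    if duration < 0:
        raise ValueError("duration must be non-negative")
    if clip_length <= 0:
        raise ValueError("clip_length must be positive")
    intervals = []
    index = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while index * clip_length < duration:
        start = index * clip_length
        end = min(start + clip_length, duration)
        intervals.append((start, end))
        index += 1
    return intervals


print(split_into_intervals(25.0, 10.0))
```

The resulting list has the same shape as the `intervals` argument above (`list[tuple[float, float]]`), so the `duration` field of `VideoMeta` could feed this helper directly.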
