-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add video models + functions #814
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #814 +/- ##
==========================================
- Coverage 87.42% 86.84% -0.59%
==========================================
Files 128 129 +1
Lines 11373 11479 +106
Branches 1537 1553 +16
==========================================
+ Hits 9943 9969 +26
- Misses 1049 1128 +79
- Partials 381 382 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing PR!
It would be great to use concise and minimalistic naming and API because we are going to have many file types for multiple domains.
- Naming
Keywords like Meta will make it hard for user to remember and use the classes - user have their own meta 🙂
How about this renaming:
VideoFile -> BaseVideo (I assume people won't use this often)
VideoMeta -> Video (the most used class)
VideoClip -> Clip (also, shouldn't it be based on Video with meta?)
VideoFrame -> FrameBase
VideoFrameMeta -> Frame
start_time --> start
end_time --> end
frames_count --> count
Image -> BaseImage
ImageMeta -> Image
FileTypes can be also extended: image
(read meta), base_image
(do not read meta), video
(read meta), base_video
(do not read meta), video_clip
, base_video_clip
, ...
- Do we need dummy classes?
I assume that people prefer working with meta information while dealing with images and videos. A followup question - do we really need BaseImages
and BaseVideo
without any logic? Why don't we clean up API and keep only Meta-enrich version in the API? User still can work with videos as File if meta is not needed.
- Do we need singular methods?
save_video_clips()
and save_video_clip()
How much extra code user needs to get rid of singular form. If one method - let's avoid the singular version.
The same question for video_frames()
and video_frames_np()
I assume, we can add the method and classes later if there is a need. But I'd not start with such rich API for now and try my best to keep in minimalistic.
WDYT?
src/datachain/lib/file.py
Outdated
|
||
width: int | ||
height: int | ||
format: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about EXIF and XMP? :)
yield img | ||
|
||
|
||
def video_frames( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a lot of these helpers become part of the Video*
classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question 👍 I was thinking about this and tried to implement it this way, but in the end I've checked other types and files in lib
module (images, hf) and make it the same way.
I was also thinking and trying to move all the models to the datachain.model
module, but it turns out it needs more work and may be not backward compatible with File
model. In is a subject for a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we need all of theses to become methods of Video class. Should it be a followup or in this PR?
I'd appreciate more insights on the issues with this approach.
src/datachain/lib/video.py
Outdated
props = iio.improps(file.stream(), plugin="pyav") | ||
frames_count, width, height, _ = props.shape | ||
|
||
meta = iio.immeta(file.stream(), plugin="pyav") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this part, it looks like we are reading video file twice here. Need to check the other way to get video meta information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, also are we reading the whole file to get meta?
Deploying datachain-documentation with Cloudflare Pages
|
for more information, see https://pre-commit.ci
👍
For now we have naming with
Done. Only
We don't have
That's good suggestion, only we use
Good question. I've added def video_meta(file: "VideoFile") -> Video:
"""
Returns video file meta information.
Args:
file (VideoFile): VideoFile object.
Returns:
Video: Video file meta information.
"""
Sounds reasonable to me 👍 Will update the code (not done yet).
Done.
Those are great comments! Love the discussion ❤️ |
"""`DataModel` for reading video files.""" | ||
|
||
|
||
class VideoClip(VideoFile): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, how are these all modes connected with the helpers? how do I instantiate them? do I have to write my own UDFs to do that (just instantiate these classes?)
|
||
def save(self, destination: str): | ||
"""Writes it's content to destination""" | ||
self.read().save(destination) | ||
|
||
|
||
class Image(DataModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this separate model?
timestamp: float = Field(default=0) | ||
|
||
|
||
class Video(DataModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be a subclass of VideoFile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvements.
A few followup questions about moving the methods to Video class and a plural-singular method.
class VideoClip(VideoFile): | ||
"""`DataModel` for reading video clips.""" | ||
|
||
start: float = Field(default=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use some impossible value like -1.0
"""`DataModel` for reading video frames.""" | ||
|
||
frame: int = Field(default=0) | ||
timestamp: float = Field(default=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1 and -1.0 as defaults?
) from exc | ||
|
||
|
||
def video_meta(file: "VideoFile") -> Video: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please avoid using erm meta
? How about file_to_video(file: File)
?
Btw... not just File as input type?
return iio.imread(file.stream(), index=frame, plugin="pyav") | ||
|
||
|
||
def video_frame(file: "VideoFile", frame: int, format: str = "jpeg") -> bytes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we usually use jpg in the codebase, not jpeg
file: "VideoFile", | ||
frame: int, | ||
output_file: Union[str, pathlib.Path], | ||
format: str = "jpeg", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jpg
def save_video_frame( | ||
file: "VideoFile", | ||
frame: int, | ||
output_file: Union[str, pathlib.Path], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we really support Path?
yield img | ||
|
||
|
||
def video_frames( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we need all of theses to become methods of Video class. Should it be a followup or in this PR?
I'd appreciate more insights on the issues with this approach.
start_frame: int = 0, | ||
end_frame: Optional[int] = None, | ||
step: int = 1, | ||
format: str = "jpeg", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jpg
yield output_file | ||
|
||
|
||
def save_video_clip( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it needs to be renamed to save_subvideo()
In the class names, we use term Clip
for virtual videos (start-end) while in this case you are creating just another Video, not clip.
So, it needs to be renamed or we need to avoid this Clip-as-virtual-reference terminology.
output_file: Union[str, pathlib.Path], | ||
codec: str = "libx264", | ||
audio_codec: str = "aac", | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to generalize the single and plural methods. We just need to come up with output format like output="{name}{:06d}.{ext}")
and provide a string in case of a single file.
Also, this method will require generalization for writing to cloud like output={source}/tmp/{name}{:06d}.{ext}
See #797
TODO:
Video models added
Meta models added
Couple usage examples
Listing
Add meta
Split video to virtual frames
Split video into frames and upload to storage