Lazily init filtergraph so it can respect raw decoded resolution #432
Nice catch and great fix, thank you!
Are you able to confirm that calling e.g. torchcodec/src/torchcodec/decoders/_core/VideoDecoder.cpp Lines 905 to 916 in 85c1ac0
I remember now that I had documented that potential workaround in code comments. I opened #433 to keep track of it.
Yup, confirmed.
We've encountered runtime errors with some videos where the resolution in the stream metadata disagrees with the resolution of the raw decoded frame. Because we have been initializing the filtergraph so early, before any frame is decoded, we've had to rely on the stream metadata. This PR does the initialization lazily, so we can use the resolution of the raw decoded frame.
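To illustrate the idea, here is a minimal Python sketch of the lazy-initialization pattern. It is not the code in this PR: the real change lives in the C++ `VideoDecoder` and configures an FFmpeg filtergraph. `DecodedFrame`, `FilterGraph`, and `LazyFrameConverter` are hypothetical stand-ins invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecodedFrame:
    # Hypothetical stand-in for a raw decoded frame.
    width: int
    height: int


@dataclass(frozen=True)
class FilterGraph:
    # Hypothetical stand-in for a configured filtergraph; it only records
    # the input size it was built for.
    input_width: int
    input_height: int


class LazyFrameConverter:
    def __init__(self, metadata_width: int, metadata_height: int) -> None:
        # The stream metadata is kept around, but it is NOT used to size
        # the filtergraph, because it may disagree with the decoded frames.
        self.metadata_width = metadata_width
        self.metadata_height = metadata_height
        self._graph: Optional[FilterGraph] = None
        self.graphs_built = 0

    def convert(self, frame: DecodedFrame) -> FilterGraph:
        # Build the graph on first use, sized from the frame we actually got.
        if self._graph is None:
            self._graph = FilterGraph(frame.width, frame.height)
            self.graphs_built += 1
        return self._graph
```

With this shape, a converter created from metadata claiming 640x480 still ends up with a graph sized 1280x720 if that is what the first decoded frame reports, and later calls reuse the same graph.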
Note that we still rely on the stream metadata for the batch APIs - they can run into the same problem. Dealing with that case will be more involved, since we'll need to do a sacrificial decode to discover what the size of the tensors should be. We should create an issue to track that case.
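As a rough sketch of what that follow-up could look like (not implemented in this PR), the batch path would decode one frame first, read its real size, and only then allocate the output. The function names and the `(height, width)` return convention below are assumptions made for the example, and plain `bytearray` buffers stand in for tensors.

```python
from typing import Callable, List, Tuple


def probe_frame_size(decode_first_frame: Callable[[], Tuple[int, int]]) -> Tuple[int, int]:
    # The "sacrificial decode": decode a single frame only to learn the
    # real (height, width), instead of trusting the stream metadata.
    height, width = decode_first_frame()
    return height, width


def allocate_batch(num_frames: int, height: int, width: int, channels: int = 3) -> List[bytearray]:
    # One zero-filled buffer per frame, sized from the probed dimensions.
    return [bytearray(height * width * channels) for _ in range(num_frames)]
```

The cost is one extra decode per batch call, which is why this case is more involved than the single-frame path.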
Note that I did consider just obeying whatever the stream metadata was. However, for swscale, we're also using the raw decoded frame to determine what the width should be. We should maintain the same behavior no matter which method we're using for the color space conversion.
This change seems to be performance neutral. Running:
I get:
Which compares favorably with what I reported in #431.