
mediaTime : The media presentation timestamp (PTS) in seconds of the frame presented #59

Closed
esukmonica opened this issue Aug 13, 2020 · 20 comments


@esukmonica

I would like to ask whether the mediaTime I receive in the requestVideoFrameCallback callback represents the frame currently being displayed.

I have a video with a known start timecode and a known frame rate, and I am trying to work out the current SMPTE timecode from the mediaTime data.

It works really well until I get roughly halfway through the video (and it happens at that particular point every time). The mediaTime being passed appears to drift relative to the frame being displayed.

Thanks

@dalecurtis
Collaborator

Yes, the mediaTime in the callback should map exactly to the frame's timestamp. There are some rare cases where we do rewrite timestamps, but generally the mediaTime should be exactly what the demuxer emits. If you have a sample we can look at, we can investigate if you think this isn't working right.

@tguilbert-google
Member

tguilbert-google commented Aug 13, 2020

To expand with details and capture more information for future users of the API:

The mediaTime corresponds to the presentation timestamp of the latest frame sent to the browser compositor, and we get that timestamp from the media itself. The rVFC callback can be 1 v-sync late, so there can be a small window where the last mediaTime you received does not correspond exactly to what is on screen. However, drift of more than 1 (maybe 2?) frames is likely to come from the media's encoding itself.

Let's say you have something that looks like this:

```html
<body>
  <video></video>
  <span id="timestamp"></span>
</body>
<script>
  ...
  video.requestVideoFrameCallback((t, metadata) => {
    timestamp.innerText = "Time: " + metadata.mediaTime;
  });
  ...
</script>
```

If the callback is in sync with the frame presented you would have:

| vsync #           | 1                              | 2                   | 3                   |
|-------------------|--------------------------------|---------------------|---------------------|
| Video frame sent  | A (33 ms)                      |                     | B (66 ms)           |
| Callback received | ...expectedDisplayTime == 2... |                     |                     |
| What is on screen |                                | Frame A<br>Time: 33 | Frame A<br>Time: 33 |

If the callback is 1 v-sync behind the frame presented you would have:

| vsync #           | 1         | 2                              | 3                   |
|-------------------|-----------|--------------------------------|---------------------|
| Video frame sent  | A (33 ms) |                                | B (66 ms)           |
| Callback received |           | ...expectedDisplayTime == 2... |                     |
| What is on screen |           | Frame A                        | Frame A<br>Time: 33 |

You can check whether the frame is already on screen, or is about to be on screen, by looking at expectedDisplayTime and seeing whether it is approximately now (within, say, 100 microseconds of performance.now()), or roughly 1 v-sync in the future (~16 ms for a 60 Hz monitor).
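A minimal sketch of that check, assuming both times are DOMHighResTimeStamps in milliseconds on the performance.now() timebase; frameStatus() and its thresholds (0.1 ms for "approximately now", 1.5 v-sync periods for "upcoming") are illustrative names and values, not part of the API:

```javascript
// Hypothetical helper: classify an rVFC callback relative to the display.
// `now` is the callback's first argument; `expectedDisplayTime` comes from
// the metadata. Both are in milliseconds.
function frameStatus(now, expectedDisplayTime, vsyncPeriodMs = 16.7) {
  const delta = expectedDisplayTime - now;
  if (Math.abs(delta) < 0.1) return "on-screen";                   // display time is ~now
  if (delta > 0 && delta < 1.5 * vsyncPeriodMs) return "upcoming"; // ~1 v-sync away
  return "late-or-unknown";                                        // callback arrived late
}

// Browser wiring (sketch):
// video.requestVideoFrameCallback((now, metadata) => {
//   console.log(frameStatus(now, metadata.expectedDisplayTime), metadata.mediaTime);
// });
```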

@esukmonica
Author

Thank you for your responses. I was out by 10 frames, and as you pointed out, it was the encoding. It is working really well now!

@lincolnneu

lincolnneu commented Sep 27, 2020

> roughly 1 v-sync in the future (~16 ms for a 60 Hz monitor)

Does Chrome have a JavaScript API to tell us what refresh rate the monitor is using? Without that information we can't programmatically decide whether the callback was 1 v-sync late.

@lincolnneu

> drift issues of more than 1 (maybe 2?) frames might come from the media's encoding itself.

Curious, what kind of encoding can cause this drift? H.264 with B-frames? How can we check whether the timestamps in the encoded media are correct? Will such a video still play correctly?

@dalecurtis
Collaborator

The video encoding has no impact on this API. The frame delay is related to the compositor painting before the frame callback can be delivered to the main thread. The API tries to tell you which frame will be painted, but sometimes the notification goes out too late, since the operation happens across multiple threads.

@tguilbert-google
Member

> Does chrome have a javascript api to let us know what the refresh rate of monitor is using? Without this information we can't decide if 1 v-sync late happens programmatically.

You do not need to know what the display rate is (although you could estimate it by taking the median of the intervals between window.rAF calls).
Instead, you can check programmatically whether the callback is late by comparing now and metadata.expectedDisplayTime: if they are roughly equal (say, within 100 microseconds), that frame is already on screen. If expectedDisplayTime is ahead of now by milliseconds or tens of milliseconds, the callback is in sync with the frame.
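The refresh-rate estimate mentioned above (median of deltas between window.rAF timestamps) could be sketched like this; medianDelta() is a plain helper of my own naming, and the sample count of 60 is an arbitrary choice:

```javascript
// Median of the intervals between successive timestamps, in ms.
function medianDelta(timestamps) {
  const deltas = timestamps.slice(1).map((t, i) => t - timestamps[i]);
  deltas.sort((a, b) => a - b);
  const mid = Math.floor(deltas.length / 2);
  return deltas.length % 2 ? deltas[mid] : (deltas[mid - 1] + deltas[mid]) / 2;
}

// Browser wiring (sketch): collect ~60 requestAnimationFrame timestamps,
// then estimate the v-sync period from their median spacing.
// const samples = [];
// function tick(t) {
//   samples.push(t);
//   if (samples.length < 60) requestAnimationFrame(tick);
//   else console.log("~v-sync period:", medianDelta(samples), "ms");
// }
// requestAnimationFrame(tick);
```

The median is preferred over the mean because a few janky rAF intervals would otherwise skew the estimate.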

@lincolnneu

> You do not need to know what the display rate is

I'm trying to determine the exact threshold behind "milliseconds or tens of milliseconds". From the doc, I guess threshold = 1 / (monitor refresh rate).

> If expectedDisplayTime is ahead of now by milliseconds or tens of milliseconds, the callback is in sync with the frame.

```javascript
if (metadata.expectedDisplayTime - now > threshold) {
  // Not 1 v-sync late
}
```

Can we use threshold = 16 ms in all cases, no matter what the monitor's refresh rate is (e.g. 30 Hz, 240 Hz)?

```javascript
if (metadata.expectedDisplayTime - now <= 0.100 /* ms, i.e. 100 µs */) {
  // Must be 1 v-sync late
}
```

What is the state between 100 µs and 16 ms? Uncertain?

@tguilbert-google
Member

I would try threshold = 3 ms (a 240 Hz display has a period of about 4 ms), and it should work in all cases. I don't have access to a 240 Hz monitor to test this on, however.

I would consider every other case as late.

@lincolnneu

lincolnneu commented Nov 16, 2020

In #66 and #65 we discussed:

> There is a chance that you will get newer pixel data if the frame is updated on the compositor thread, after acquiring the frame metadata for video.rVFC

Does it conflict with the conclusion made in this thread?

> Yes, the mediaTime in the callback should map exactly to the frame's timestamp

More specifically, is the following case possible? Note that when we get the rVFC callback for frame A, at frame A's expectedDisplayTime, frame C is on screen.

| vsync #           | 1         | 2                              | 3                   |
|-------------------|-----------|--------------------------------|---------------------|
| Video frame sent  | A (33 ms) |                                | B (66 ms)           |
| Callback received |           | ...expectedDisplayTime == 2... |                     |
| What is on screen |           | Frame C                        | Frame A<br>Time: 33 |

@tguilbert-google
Member

> Yes, the mediaTime in the callback should map exactly to the frame's timestamp

Yes, that is a conflict, and you should disregard the statement above. It seems like you are after 100% certainty; the statement should be true most of the time, but not 100% of the time.

As for your example, are you asking whether the 3rd frame can show up on screen before the 1st frame does? Not really.
Or whether the 1st frame can show up later than its expected display time? I think it would be unlikely for it to show up late, but not impossible... Once the callback is fired, the frame has already been sent to the compositor, but something could go wrong, with the whole browser or OS freezing, and the frame could never show up; we would then end up with frame D or later displayed on screen. The browser could also delay or omit a v-sync, but I don't know whether a frame already sent would still make it on screen. I think it's more likely for the callback with a metadata.expectedDisplayTime of T to be fired at T+1.

@lincolnneu

Oh, it seems there's disagreement on this topic. Is this statement from #69 true?

> When paused you're always going to get the right frame callback after a seek.
> Correct, pause is unaffected since only 1 frame is ever rendered in the pause case (until play() is called anyways).

@tguilbert-google
Member

If you are paused, and the pause has completed and the internal state stabilized, and then you seek, that statement is true.

@lincolnneu

lincolnneu commented Nov 18, 2020

In the onpause event handler, I save currentTime as targetTime and seek to an adjacent time. In the next onseeked event, I seek back to targetTime.

In this way, can I guarantee that the PTS I have is the latest?

@tguilbert-google
Member

I'm going to say probably, but you would have to try it yourself to see whether that algorithm works. You have to seek far enough to cause new frames to show up, which might be visible to the user.

@lincolnneu

lincolnneu commented Nov 18, 2020

My videos all have a constant frame rate, so we know how far to seek to trigger an rVFC / make a new frame show up.
To double-check, I'll also add a lastSeekTime < metadata.presentationTime check for time calibration.

(Accuracy is more important. We can sacrifice a little bit of user experience for this.)

@lincolnneu

lincolnneu commented Nov 18, 2020

> the pause has completed and the internal state stabilized

Is there a programmatic way to know this state? Does the pause event not imply it?
According to the doc for the pause event:

> The event is sent once the pause() method returns and after the media element's paused property has been changed to true.

Is it possible that the internal state is not stabilized even at that point?

@lincolnneu

lincolnneu commented Nov 18, 2020

Actually, I'm wondering why we need to care about whether the pause has completed and the internal state has stabilized at all.

The seek is kicked off in the onpause event handler, so by the time we get the onseeked event, the pause must have completed (and whether the internal state was stable for that pause no longer matters, since we are at another stage: the seek after the pause). We then seek back to the time at which we paused, and the rVFC from this second seek is what we care about.

@tguilbert-google
Member

> Is it possible that the internal state is not stabilized even at that point?

I'm not 100% sure, but I assume the pause can complete (e.g. currentTime will no longer advance), and then there is a v-sync that updates the current frame to match the paused timestamp. Perhaps there could be a callback where metadata.presentationTime is newer than when the pause event was fired.

> Actually, I'm wondering why we need to care about ...

You're probably right that the two seeks remove the need to care about the internal state being stable. Just keep in mind that onseeked events are fired as tasks, while rVFC fires during the rendering steps. If you chain onpause → onseeked → onseeked between two v-syncs, you might get 1 rVFC (with a theoretical possibility of a frame/metadata mismatch), but if they are spread out over multiple v-syncs, you might get 3 rVFCs.

@lincolnneu

To avoid a mismatch, we need to make sure the following steps happen sequentially:

onpause → seek → rVFC → onseeked → seek → rVFC → onseeked

This can be checked programmatically by verifying latestSeekTime <= metadata.presentationTime at each onseeked.
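The double-seek calibration discussed in this thread could be wired up roughly as follows. Everything here is an illustration built from the discussion above: frameStep, targetTime, and the one-shot handler chaining are assumptions, and only the pure isCalibrated() check runs outside a browser:

```javascript
// Pure check from the comment above: the rVFC metadata can be trusted once
// the frame's presentation time is not older than the latest seek we issued.
// Both values are on the performance.now() timebase, in milliseconds.
function isCalibrated(latestSeekTime, presentationTime) {
  return latestSeekTime <= presentationTime;
}

// Browser wiring (sketch, names illustrative). Assumes a constant-frame-rate
// video, so the seek distance `frameStep` is known, as discussed above.
// const frameStep = 1 / 30; // seconds per frame (assumed known)
// video.onpause = () => {
//   const targetTime = video.currentTime;
//   let latestSeekTime = performance.now();
//   video.currentTime = targetTime + frameStep;   // first seek: force a new frame
//   video.onseeked = () => {
//     video.onseeked = null;
//     latestSeekTime = performance.now();
//     video.currentTime = targetTime;             // second seek: back to the target
//     video.requestVideoFrameCallback((now, metadata) => {
//       if (isCalibrated(latestSeekTime, metadata.presentationTime)) {
//         // metadata.mediaTime is the PTS of the frame now on screen
//       }
//     });
//   };
// };
```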
