How to interact with a device without a fixed capability? #1027

Open
Tackoil opened this issue Jan 13, 2025 · 7 comments
Comments

@Tackoil

Tackoil commented Jan 13, 2025

Some devices, especially virtual cameras, have variable capabilities. For example, a virtual camera can set its resolution almost arbitrarily, so its capabilities vary as well.

In this situation, how should a page interact with such devices through Web APIs? For example, will the following code match a virtual camera and get a video stream from it?

const stream = await navigator.mediaDevices.getUserMedia({
    video: {
        width: 1920, height: 1080
    }
})

FYI, it seems that some virtual camera software cannot set its device capabilities because of OS limitations. See obsproject/obs-studio#10263 (comment).

@guidou
Contributor

guidou commented Jan 14, 2025

In this case, the capabilities should include every possible potential value.
For range-based capabilities, if the ranges for all potential configurations are known, use the corresponding maximum and minimum values. If they are not known, then theoretical min/max values can be used.
The same applies to discrete capabilities, except that all potentially supported values are enumerated.
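
As a rough illustration of the rule described above, a virtual camera (or a user agent wrapping one) might derive its advertised capabilities along these lines. This is only a sketch: `advertisedRange`, `advertisedValues`, and `THEORETICAL_RANGE` (including its bounds) are hypothetical names and values chosen for the example, not taken from the spec.

```javascript
// Hypothetical fallback used when the possible configurations are unknown.
// The bounds are illustrative, not taken from any specification.
const THEORETICAL_RANGE = { min: 0, max: 65536 };

// Returns the {min, max} range to advertise for one range-based capability
// (for example width), given the value used by every known configuration.
function advertisedRange(knownValues) {
  if (!Array.isArray(knownValues) || knownValues.length === 0) {
    return { ...THEORETICAL_RANGE };
  }
  return { min: Math.min(...knownValues), max: Math.max(...knownValues) };
}

// Discrete capabilities: enumerate every potentially supported value once.
function advertisedValues(knownValues) {
  return [...new Set(knownValues)];
}
```

For instance, `advertisedRange([640, 1920, 1280])` covers 640 through 1920, while `advertisedRange([])` falls back to the theoretical range.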

@Tackoil
Author

Tackoil commented Jan 15, 2025

Thank you for your reply. However, I think one odd case still needs to be discussed.

Virtual cameras usually do not change their output resolution in response to requests from browsers or other software (possibly because of OS limitations). Only the user of the virtual camera can change it as desired.

If a virtual camera declares a huge width/height range, following the rule "If they are not known, then theoretical min/max values can be used", it cannot actually deliver on that promise unless the request happens to match the user's setting. The following figure illustrates this.

Image

At the question mark in the figure, I can think of two things the user agent could do:

  1. Stretch the video stream to fit the initial settings, in which case the web user gets a stream with an unexpected (stretched) image.
  2. Discard the initial settings to match the new resolution, in which case the web user gets a stream with an unexpected resolution.

I'm not sure which is better. Is there a better solution for this situation?

@alvestrand
Contributor

The issues here seem very similar to those faced by tab capture (getDisplayMedia), and should probably be solved in a similar way.
It's been a traditional constraint in webrtc that we never upscale, and never change aspect ratios. Keeping those in mind is probably good.

Behavior should be different if the first constraint is "exact: 1920x1080" or "ideal" (non-labelled) 1920x1080 - in the latter case, the UA is free to choose the "closest available resolution".
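
To make the distinction concrete, here is a sketch of the two constraint shapes, plus a helper that picks a delivery size without upscaling or changing the aspect ratio. The helper `fitWithoutUpscaling` is hypothetical and only illustrates the rule of thumb; a real user agent may choose a different resolution.

```javascript
// A bare value is treated as "ideal": the UA may pick the closest
// available resolution instead of failing.
const idealConstraints = { video: { width: 1920, height: 1080 } };

// "exact" is mandatory: getUserMedia() rejects with OverconstrainedError
// if no device can satisfy it.
const exactConstraints = {
  video: { width: { exact: 1920 }, height: { exact: 1080 } },
};

// Hypothetical helper: the largest size that fits inside the requested box
// while keeping the source aspect ratio and never upscaling.
function fitWithoutUpscaling(source, requested) {
  const scale = Math.min(
    1,
    requested.width / source.width,
    requested.height / source.height
  );
  return {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
}
```

Under these assumptions, a 3840x2160 source requested at 1920x1080 is downscaled to 1920x1080, while a 640x480 source is left at 640x480 rather than upscaled.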

@jan-ivar
Member

Our model says "Source capabilities are effectively constant. Applications should be able to depend on a specific source having the same capabilities for any browsing session."

However, that text may be a bit outdated: As @alvestrand mentions, the adjacent getDisplayMedia (following the same model) says of e.g. width "As a capability, max MUST reflect the display surface's width" which in practical terms when capturing a window in Safari updates live in response to the user resizing the captured window (though not Chrome).

In the example you give, once a user locks their virtual device to 1080x1920 output, I think a case can be made that getCapabilities() should from then on read (0, 1080) for width, not (0, 65536): "Media MUST NOT be upscaled".

But if this requires page reload to take effect, that seems fine to me. After all, this API was designed for cameras; virtual cameras pretend to be cameras, which means adhering to certain camera abstractions and limitations.
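
As a sketch of what this would mean for a page, the check below tests an exact request against the ranges a track reports. `withinRange` and `mightSatisfyExact` are hypothetical helpers, and the `locked` object is an assumed example of what `getCapabilities()` could return after the user locks the device to 1080x1920; being inside the range does not guarantee the value is delivered.

```javascript
// Returns true if `value` lies inside a {min, max} capability range.
function withinRange(range, value) {
  return range !== undefined && value >= range.min && value <= range.max;
}

// Hypothetical helper: could an exact width/height request possibly be met,
// given an object shaped like the result of track.getCapabilities()?
function mightSatisfyExact(capabilities, request) {
  return (
    withinRange(capabilities.width, request.width) &&
    withinRange(capabilities.height, request.height)
  );
}

// Assumed capabilities after the user locks the device to 1080x1920.
const locked = { width: { min: 0, max: 1080 }, height: { min: 0, max: 1920 } };

mightSatisfyExact(locked, { width: 1920, height: 1080 }); // false: width exceeds max
```

In a browser, the `capabilities` argument would come from something like `stream.getVideoTracks()[0].getCapabilities()`.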

@jan-ivar
Member

Also note on rotatable OSes (phones), vertical resolutions are typically expressed in landscape. Might not apply here, just FYI.

@Tackoil
Author

Tackoil commented Jan 21, 2025

Thank you for all your replies. Let me try to summarize the discussion above:

For getCapabilities() on a virtual device, it is better for the max width and height to always follow the actual resolution. Virtual devices are designed to pretend to be real cameras.

In my opinion, this conclusion is reasonable, and I agree with it.

However, developers of virtual camera software still face a difficulty. With the Camera Extension APIs on macOS, there is no way to set the initial resolution (1080x1920 in that example) that user agents enumerate. I think it would be best to send feedback to Apple to resolve this.

@alvestrand
Contributor

Some of this ambiguity is covered in the definition of capability as being "the most optimistic view" - even if a value is within the capabilities' ranges, you're not guaranteed to get it.

However, I agree with the proposal in the January slides to just admit that capabilities can change, and that they reflect current capabilities.
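
If capabilities are treated as a snapshot of the current state, a page would compare what it asked for with what the track actually reports instead of trusting a cached range. A minimal sketch, where `settingsMismatch` is a hypothetical helper and the objects are shaped like the result of `track.getSettings()`:

```javascript
// Hypothetical helper: list the requested properties whose delivered value
// (as reported by something like track.getSettings()) differs.
function settingsMismatch(requested, settings) {
  return Object.keys(requested).filter(
    (key) => settings[key] !== requested[key]
  );
}

// Assumed example: the page asked for 1920x1080, but the virtual camera
// is locked to 1080x1920.
const mismatched = settingsMismatch(
  { width: 1920, height: 1080 },
  { width: 1080, height: 1920, frameRate: 30 }
);
// mismatched is ["width", "height"]
```

A page could run this after `getUserMedia()` resolves and decide whether to adapt its layout or ask the user to change the virtual camera's output.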

4 participants