-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC3819: Allowing widgets to send/receive to-device messages #3819
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,248 @@ | ||
# MSC3819: Allowing widgets to send/receive to-device messages | ||
|
||
Widgets (embedded HTML applications in Matrix) currently have a relatively large surface area | ||
they can use for interacting with their attached client, primarily in the context of a room. They | ||
can [send/receive events with MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762), | ||
[navigate to rooms with MSC2931](https://github.com/matrix-org/matrix-spec-proposals/pull/2931), | ||
and even [open dialogs with MSC2790](https://github.com/matrix-org/matrix-spec-proposals/pull/2790), | ||
but they can't act as a whole other Matrix client just yet. | ||
|
||
This MSC forms part of a larger, ongoing, question about how to embed other Matrix clients into another | ||
client or room for access. An increasingly more popular client development option is to build out an | ||
entirely new Matrix client and want to embed that within another client (as a widget) to avoid the | ||
user needing to switch apps. To support this, we need to consider both long term and short term impact | ||
of the changes we propose. This MSC aims closer to the short term. | ||
|
||
A longer term solution to the problem of clients wanting to be embedded in other clients might still | ||
be widgets, though with a system like [MSC3008](https://github.com/matrix-org/matrix-spec-proposals/pull/3008) | ||
to restrict access to the client-server API more effectively. For this MSC's purpose though, we're | ||
aiming to cover a specific subset of the client-server API: to-device messages. | ||
|
||
While we could expose the entire client-server API over `postMessage` (or similar) for embedded | ||
clients to access, the permissions model gets hairy and difficult to secure on the client side. Instead, | ||
we're exploring what it would look like to special case what is needed for specific applications, as | ||
needed, starting with to-device messages. | ||
|
||
To-device messaging is described [here](https://spec.matrix.org/v1.2/client-server-api/#send-to-device-messaging) | ||
with practical applications for widget-ized clients being implementations of | ||
[MSC3401 - Native group VoIP](https://github.com/matrix-org/matrix-spec-proposals/pull/3401) for now. | ||
|
||
## Prerequisite background | ||
|
||
*Author's note: This is copied from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762).* | ||
|
||
Widgets are relatively new to Matrix and so the terminology and behaviour might not be known to all | ||
readers. This section should clarify the components of widgets that are applicable to this MSC without | ||
going on a deep dive into widgets in general. | ||
|
||
Widgets are embedded HTML/JS/CSS applications in a client which use the `postMessage` API to talk | ||
to the client. This communication allows widgets to provide enhanced functionality such as sticker | ||
pickers (when applied to a user) or performance dashboards (in rooms). | ||
|
||
One of the first things that happens over this communication channel is a "capabilities negotiation" | ||
where the client asks the widget what permissions it wants, and the widget replies with its ideal | ||
set. The client then either decides or asks the user if the permissions requested are okay. | ||
|
||
All communication over the channel is done in a simple request/response flow, using actions to | ||
describe the request. For the capabilities negotiation, this would be the client sending the widget | ||
a request with an `action` of `capabilities`, and the widget would respond to that request with a | ||
response object. | ||
|
||
The channel in which communication occurs is called a "session", where the session is "established" | ||
after the capabilities negotiation. Sessions can only be terminated by the client. | ||
|
||
The Widget API is split into two parts: `toWidget` (client->widget) and `fromWidget` (widget->client). | ||
They are differentiated by where the request originates. | ||
|
||
## Proposal | ||
|
||
Inspired heavily from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762), we | ||
introduce new capabilities for to-device messages: | ||
|
||
* `m.send.to_device:<event type>` (eg: `m.send.to_device:m.call.invite`) - Used for sending to-device | ||
messages of a given type. | ||
* `m.receive.to_device:<event type>` (eg: `m.receive.to_device:m.call.invite`) - Used for receiving | ||
to-device messages of a given type. | ||
|
||
These capabilities open up access to the following respective actions, when approved: | ||
|
||
**`fromWidget` action of `send_to_device`** | ||
|
||
```json5 | ||
{ | ||
// This is a standardized widget API request. | ||
"api": "fromWidget", | ||
"widgetId": "20200827_WidgetExample", | ||
"requestid": "generated-id-1234", | ||
"action": "send_to_device", // value defined by this proposal | ||
"data": { | ||
// Same structure as the `/sendToDevice` HTTP API request body | ||
"@target:example.org": { | ||
"DEVICEID": { // can also be a `*` to denote "all of the user's devices" | ||
"example_content": "put your real message here" | ||
} | ||
} | ||
} | ||
} | ||
``` | ||
|
||
The client upon receipt of this will validate that the widget has an appropriate capability to send | ||
the to-device message. If the widget is approved for such a capability, the client **MUST** encrypt | ||
the message by default unless the event is already encrypted by the widget (this MSC doesn't provide | ||
enough API surface for a widget to do this, but in future it might be possible for the widget to | ||
gain some context of the encryption state for the client and use that to make/manage Olm sessions). | ||
The encrypted message is then sent as requested to the users/devices using | ||
[`/sendToDevice`](https://spec.matrix.org/v1.2/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid). | ||
|
||
If the widget doesn't have appropriate permission, or an error occurs anywhere along the send path, | ||
a standardized widget error response is returned. | ||
|
||
Under the widget API, a response to all actions is required and takes the shape of repeating the | ||
request with an added top-level `response` field. This `response` field is empty for this action, | ||
as shown: | ||
|
||
```json5 | ||
{ | ||
"api": "fromWidget", | ||
"widgetId": "20200827_WidgetExample", | ||
"requestid": "generated-id-1234", | ||
"action": "send_to_device", | ||
"data": { | ||
"@target:example.org": { | ||
"DEVICEID": { | ||
"example_content": "put your real message here" | ||
} | ||
} | ||
}, | ||
"response": {} | ||
} | ||
``` | ||
|
||
The client *should not* send a response to the action until the server has returned 200 OK itself, | ||
which might take longer than the default widget API timeout of 10 seconds. Widgets should raise their | ||
maximum timeout to 60 seconds or more for this action. | ||
|
||
**`toWidget` action of `send_to_device`** | ||
|
||
*Note*: It is common practice to name the action in favour of the direction of travel rather than try | ||
and determine an alternative name. This does mean that there are two `send_to_device` actions: one | ||
for widget->client and one for client->widget. This section is talking about client->widget. | ||
|
||
After the client has decrypted all to-device messages it receives, it determines if any widgets should | ||
be made aware of the contents within. The decrypted event type for the message is used to determine | ||
if the widget has appropriate capability to see the message. | ||
|
||
The client should process all to-device messages it can before sending them off to the widget. Even if | ||
the client does process a message though, it should still send it to the appropriate widgets for | ||
potential re-processing. This is to avoid a scenario where the host client can no longer reliably | ||
function, such as if Olm sessions get corrupted or similar. | ||
|
||
The client should be aware that to-device messages might be seen which the client *could* handle, but | ||
might not have context on, such as VoIP signaling. The client should not error out if it can't locate | ||
a matching call, for example. | ||
|
||
The client SHOULD only send events which were received by the client *after* the session has been | ||
established with the widget (after the widget's capabilities are negotiated). | ||
|
||
The request itself looks as follows: | ||
|
||
```json5 | ||
{ | ||
// This is a standardized widget API request. | ||
"api": "toWidget", // note that we're sending *to* the widget here | ||
"widgetId": "20200827_WidgetExample", | ||
"requestid": "generated-id-1234", | ||
"action": "send_to_device", // value defined by this proposal | ||
"data": { | ||
"type": "m.call.invite", | ||
"sender": "@source:example.org", | ||
"encrypted": true, | ||
"content": { | ||
// ... as required for the event schema | ||
} | ||
} | ||
} | ||
``` | ||
|
||
Note that the action only supports a single to-device message at a time. This is for symmetry with | ||
[MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762). | ||
|
||
Under the widget API, a response is required from the widget. The widget simply acknowledges the request | ||
with an empty response object: | ||
|
||
```json5 | ||
{ | ||
"api": "toWidget", | ||
"widgetId": "20200827_WidgetExample", | ||
"requestid": "generated-id-1234", | ||
"action": "send_to_device", | ||
"data": { | ||
"type": "m.call.invite", | ||
"sender": "@source:example.org", | ||
"content": { | ||
// ... as required for the event schema | ||
} | ||
}, | ||
"response": {} | ||
} | ||
``` | ||
|
||
## Potential issues | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While working with the current support of MSC3819 in Element we noticed a problem: The send endpoint of the toDevice message API allows to target specific device Element Call, which this MSC initially was created for, is cheating a little bit. Therefore I suggest to amend small section here:
And extend the unstable prefix section with:
If this extension is desired, I'm open to add the implementation to the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I started working on implementing this addition over at matrix-org/matrix-widget-api#78 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In matrix-org/matrix-widget-api#78, we decided to slightly rename the parameter. Thus the amendment should be as follows:
And extend the unstable prefix section with:
|
||
|
||
Due to lack of documentation/spec, conventions for the widget API and its security principles could | ||
be misunderstood or confusing. This MSC attempts to overly describe these cases where they are at | ||
risk of being a potential misunderstanding, however readers of the proposal are still encouraged to | ||
gather as much information as they can before reviewing this proposal. | ||
|
||
This MSC further pushes forward an idea that the `postMessage` transport for the widget API is the | ||
way to go, however MSCs like [MSC3009](https://github.com/matrix-org/matrix-spec-proposals/pull/3009) | ||
explore what it could mean to have a different transport mechanism. This MSC is not tied directly | ||
to `postMessage` and is instead describing the request/responses used over the widget API - whatever | ||
transport that might be. | ||
|
||
## Alternatives | ||
|
||
As discussed in the introduction of this proposal and on other MSCs, we could expose the client-server | ||
API more generically to the widget. This causes issues where the client is either forced to parse | ||
requests like a webserver would to validate that the widget is allowed to make the request, or | ||
require such a generic capability that widgets would excessively request full read/write access | ||
from the user without consideration for the impact that might have. As such, we continue to describe | ||
special-cased actions for the widget API on a case-by-case basis. | ||
|
||
On other related proposals there's discussion about how a bot could achieve the same function as | ||
the proposal. While also partially true here, the intent is not to have a game or similar publishing | ||
events into a room but rather to have a second Matrix client (for all intents and purposes) embedded | ||
either as a room widget or account widget. A bot precludes the second client from acting on behalf | ||
of the user who has it open. | ||
|
||
## Security considerations | ||
|
||
Because the widget can implicitly decrypt events, it is absolutely imperative that clients | ||
prompt for permission to use these capabilities even though the capabilities negotiation does not | ||
require this to be done. Strictly speaking, clients which do not prompt for confirmation from the | ||
user are frowned upon, however given the intended usecase of VoIP signaling it is reasonable to | ||
auto-approve some capabilities if the client can verifiably trust the widget is running safe code. | ||
In general, verifiable trust only comes from the client locking widgets down to specific domains | ||
or rewriting the widget URL before rendering to something the client controls. | ||
|
||
This MSC allows widgets to access sensitive parts of the client-server API, and the encryption | ||
module specifically. If granted permission, a widget could feasibly harvest decryption keys *in clear | ||
text*. It is strongly encouraged that clients do not auto-approve capabilities for key exchanges | ||
or similar. In fact, it might even be reasonable for the client to auto-deny instead. | ||
|
||
This MSC allows a room widget to act at the account level rather than the traditional room level. | ||
Normally these events would be scoped to the currently active room, however to-device messages are | ||
not tied to a room. Therefore, the events are exposed as-is to the widget and can be interacted with | ||
as such. | ||
|
||
## Unstable prefix | ||
|
||
While this MSC is not present in the spec, clients and widgets should: | ||
|
||
* Use `org.matrix.msc3819.` in place of `m.` in all new identifiers of this MSC. | ||
* Only call/support the `action`s if a widget API version of `org.matrix.msc3819` is advertised. | ||
|
||
## Dependencies | ||
|
||
None applicable - this MSC's dependencies have either been approved or are used simply as reference | ||
material. In practice, widgets should probably be formally in the spec before this MSC gets included. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to include the event type somewhere in here. I'd suggest an interface of
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to control whether the event is encrypted or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed an issue with the current implementation. The payload (
unknown
above) varies betweenencrypted: true
andencrypted: false
in Element Web right now. For unencrypted messages it is directly the content of the message, like{"key": "value"}
, but for encrypted events it has to be wrapped like{"type": "event type…", "content": {"key": "value"} }
. If that is really desired and not just a mistake, we should make sure to document the behavior.I created an issue for it: element-hq/element-web#24470