Question: Clarification on FDC3 Web Communication Flow and Resilience #1457

Closed
Davidhanson90 opened this issue Dec 4, 2024 · 12 comments
Labels
question Further information is requested

Comments

@Davidhanson90

Davidhanson90 commented Dec 4, 2024

Question Area

  • [X] Other

Question

I am currently exploring the FDC3 Web implementation and have some concerns regarding the communication flow and its resilience, particularly when the root application is closed. Here's a breakdown of the scenario and my understanding:

Current Workflow:

Root Application: Accessed via myrootdomain.com, acts as the primary container and root agent for all communications.
Child Applications: Two applications can be launched from the root:
ChildA: childADomain.com
ChildB: childBDomain.com

Communication Process:

  • Actions performed in ChildA trigger an intent that is managed by the root application. ChildB can respond to this intent, with the communication being routed through the root.

Issue Encountered:

Transient Nature of Web Apps: If the root application (myrootdomain.com) is closed, which is a common scenario given the transient nature of web applications, any subsequent attempts to perform the same action from ChildA fail. This is because the root, which acts as the intermediary, is no longer available to route communications.

User Experience: From a user's perspective, this results in an error or a failure in functionality that previously worked, leading to confusion and a poor user experience.
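
For illustration, the dependency described above can be modelled with a small in-memory sketch. This is not the FDC3 API: `RootAgent` and its methods are hypothetical stand-ins, and the model fails fast with an error where a real app would more likely see its call hang or time out.

```typescript
// Illustrative in-memory model only; not the FDC3 API.
type IntentHandler = (context: { type: string }) => string;

class RootAgent {
  private handlers: Record<string, IntentHandler> = {};
  private available = true;

  // An app (e.g. ChildB) registers to handle an intent.
  addIntentListener(intent: string, handler: IntentHandler): void {
    this.handlers[intent] = handler;
  }

  // An app (e.g. ChildA) raises an intent; the root looks up the handler.
  raiseIntent(intent: string, context: { type: string }): string {
    if (!this.available) throw new Error("root agent unavailable");
    const handler = this.handlers[intent];
    if (!handler) throw new Error("no app handles " + intent);
    return handler(context);
  }

  // Models the user closing the myrootdomain.com window.
  close(): void {
    this.available = false;
  }
}

const rootAgent = new RootAgent();
rootAgent.addIntentListener("ViewChart", (ctx) => "ChildB charted " + ctx.type);

// Works while the root window is open.
const before = rootAgent.raiseIntent("ViewChart", { type: "fdc3.instrument" });

// The same action fails once the root has gone.
rootAgent.close();
let after: string;
try {
  after = rootAgent.raiseIntent("ViewChart", { type: "fdc3.instrument" });
} catch (e) {
  after = (e as Error).message;
}
```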

Questions:

  1. Is my understanding correct that this dependency on the root application is by design for FDC3 Web?
  2. Has there been any consideration for implementing a more peer-to-peer (P2P) based protocol to enhance resilience and ensure functionality remains consistent, even if the root application is closed?
  3. Is there existing guidance on how these UX-related problems should be tackled?

Suggested Enhancement:

Consider P2P Communication: This could potentially allow direct communication between ChildA and ChildB without dependency on the root application, thus improving resilience and user experience.

Service Workers: Possibly leverage service workers for routing?

I have done no research on these suggestions, so take them with a pinch of salt. I am looking to the community for insights on this and any potential roadmap towards enhancing the communication framework within FDC3 Web.

@Davidhanson90 Davidhanson90 added the question Further information is requested label Dec 4, 2024
@Davidhanson90 Davidhanson90 changed the title Question: Clarification and Enhancement Request on FDC3 Web Communication Flow and Resilience Question: Clarification on FDC3 Web Communication Flow and Resilience Dec 4, 2024
@kriswest
Contributor

kriswest commented Dec 4, 2024

Hi @Davidhanson90

> Is my understanding correct that this dependency on the root application is by design for FDC3 Web?

To a certain extent, yes. FDC3 is based on the concept of a Desktop Agent that facilitates communication, rather than peer-to-peer communication between applications. The agent concept is useful for actions like launching and intent resolution (inc. launching apps to resolve intents) as it's a single entity that is aware of what is available. It's also helpful where you have apps from a diverse set of vendors as each only has to deal with connecting to and dealing with an agent rather than many peers, resulting in fewer conversations about who each is willing to interoperate with and (perhaps) better loose coupling.

> Has there been any consideration for implementing a more peer-to-peer (P2P) based protocol to enhance resilience and ensure functionality remains consistent, even if the root application is closed?
> Is there existing guidance on how these UX-related problems should be tackled?

There was indeed consideration of how you might build a more resilient agent - particularly based on a Shared Worker. When connecting to a DA the connection flow allows you to either set up a MessagePort with a parent window or frame OR for the parent to give you a URL for an adaptor to load into an iframe, to be communicated with via a MessagePort again using the same protocol. In the early conversations, we were expecting the iframe approach to be used by implementations with a SharedWorker. The SharedWorker would be the Desktop Agent and hence would persist while any window remained connected to it - ideal! However, during the process, the Chrome Security team and others started preventing cross-domain iframes (iframes embedded into a window from another domain) from sharing workers, throwing a significant spanner into the works.
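
As a rough sketch of the two connection outcomes described above: after an app announces itself to a candidate parent, the reply either carries a MessagePort or points at an adaptor URL to load into an iframe. The message shapes below are simplified assumptions modelled on the Web Connection Protocol (`WCP3Handshake` and the `iframeUrl` field are my reading of the spec rather than something stated in this thread), and `nextStep` is a hypothetical helper.

```typescript
// Simplified message shapes; field names are assumptions for this sketch.
interface Wcp2LoadUrl {
  type: "WCP2LoadUrl";
  payload: { iframeUrl: string };
}
interface Wcp3Handshake {
  type: "WCP3Handshake";
  payload: { fdc3Version: string };
}
type ParentReply = Wcp2LoadUrl | Wcp3Handshake;

type NextStep =
  | { kind: "usePort" }
  | { kind: "loadAdaptor"; url: string };

// Decide what the connecting app should do with the parent's reply.
function nextStep(reply: ParentReply): NextStep {
  switch (reply.type) {
    case "WCP3Handshake":
      // The parent is the agent: talk over the MessagePort sent with this message.
      return { kind: "usePort" };
    case "WCP2LoadUrl":
      // The parent delegates: load the adaptor into an iframe and connect to that instead.
      return { kind: "loadAdaptor", url: reply.payload.iframeUrl };
  }
}

const viaPort = nextStep({ type: "WCP3Handshake", payload: { fdc3Version: "2.2" } });
const viaAdaptor = nextStep({
  type: "WCP2LoadUrl",
  payload: { iframeUrl: "https://myrootdomain.com/adaptor.html" }, // made-up URL
});
```

Either way the app ends up speaking the same protocol over a MessagePort; only the party at the other end of the port differs.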

However, the adaptor approach remains, and it may still be possible to build a more resilient agent using a service worker, albeit not with the Broadcast Channel API, as that too was restricted in cross-domain iframes (see the note on https://developer.mozilla.org/en-US/docs/Web/API/Broadcast_Channel_API). It certainly is possible with a service or websocket that is outside the browser's scope, i.e. some local executable or remote web service (the iframes can communicate with that even if they can't communicate with each other). However, passing information on (for example) trading activity through the web and a third-party web service is not always desirable or allowed, where local interop may be. One other approach I've mused on is using the Page Lifecycle API to catch the parent being closed and to turn over operations (and state) to one of the adaptors loaded into another app and have it act as the DA. This would rely on the ability to re-ship MessagePorts received from other apps using postMessage (something I haven't tested and can't find anything in the HTML spec saying it is or is not supported: https://html.spec.whatwg.org/multipage/web-messaging.html#message-ports).

The fact that we have to handle cross-domain communication is what makes this difficult. window.postMessage and MessageChannels are supposed to be the best way to achieve that, but you still have to handle discovery (of other apps or the agent) and launching. Discovery is easy with a parent window or frame (enumerated frames, parent window.frame references) and launching/intent resolution is easy if one entity knows what can be launched and what is already running, which ties back to the centralized Desktop Agent concept...
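
To make the discovery point concrete, here is a minimal sketch of walking the opener and parent references to find windows that might host the agent. `WindowLike`, `agentCandidates` and `topLevel` are hypothetical names; the interface only mirrors the `parent` and `opener` properties of a real `Window` so the sketch can run outside a browser.

```typescript
// Minimal stand-in for the parts of Window used here.
interface WindowLike {
  name: string;
  parent: WindowLike; // a top-level window is its own parent
  opener: WindowLike | null; // set when the window was opened by another window
}

// Collect windows that might host the Desktop Agent: the opener, the parent,
// and, recursively, their own openers and parents.
function agentCandidates(start: WindowLike): WindowLike[] {
  const found: WindowLike[] = [];
  const visit = (w: WindowLike): void => {
    for (const next of [w.opener, w.parent]) {
      if (next && next !== start && found.indexOf(next) === -1) {
        found.push(next);
        visit(next);
      }
    }
  };
  visit(start);
  return found;
}

// Helper to build a fake top-level window for the example.
function topLevel(name: string, opener: WindowLike | null): WindowLike {
  const w = { name, opener } as WindowLike;
  w.parent = w;
  return w;
}

const rootWindow = topLevel("root", null);
const childA = topLevel("childA", rootWindow);
const widget: WindowLike = { name: "widget", parent: childA, opener: null };

// Nearest candidates come first: ["childA", "root"]
const candidateNames = agentCandidates(widget).map((w) => w.name);
```

Once the root window closes there is nothing left at the end of these references, which is why discovery ties back to a single long-lived agent.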

Happy to pop this on the agenda for the next web browsers meeting for further discussion, or to keep exchanging ideas on how more resilient DAs could be built. It would definitely take some thought to come up with a viable option. On the other hand, you can intercept window closing and warn the user what the effect of closing the parent window will be (and perhaps even identify which apps would be affected). We don't actually have a notification to apps in the connection protocol for when the DA goes away (interop will just stop/hang), although there is one for apps closing and letting the DA know (plus heartbeats).
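
A minimal sketch of the close-warning idea, assuming the agent window tracks which apps are connected. `warnWhileAppsConnected` and the two interfaces are hypothetical; in a browser the target would be `window`. Note that browsers show their own generic confirmation text for `beforeunload`, so naming the affected apps would need in-page UI rather than the native prompt.

```typescript
// Minimal stand-ins so the sketch can run outside a browser.
interface UnloadEventLike {
  preventDefault(): void;
  returnValue: unknown;
}
interface UnloadTarget {
  addEventListener(type: "beforeunload", listener: (e: UnloadEventLike) => void): void;
}

// Ask the browser to confirm closing while any app is still connected to this agent.
function warnWhileAppsConnected(target: UnloadTarget, connectedApps: () => string[]): void {
  target.addEventListener("beforeunload", (e) => {
    if (connectedApps().length === 0) return; // nothing depends on this window
    e.preventDefault(); // requests the confirmation prompt
    e.returnValue = true; // legacy way to request the same prompt
  });
}
```

The callback is evaluated at close time, so the prompt only appears while at least one app still relies on the agent.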

@Davidhanson90
Author

Thank you for your detailed response. I understand there may be several challenges to implementing this, but I'd like to propose a potential solution based on the scenario you described.

Consider the following setup:

Root Application: Accessed via myrootdomain.com, this serves as the primary hub and central coordinator for all communications.

Child Applications: Two applications that can be initiated from the root application.

I'm curious about the implications of having each child application embed an iframe linked to the root application. This approach would allow each child to maintain a local version of the root agent, which remains operational even if the parent application is closed.

Moving forward with this idea, each embedded root agent could synchronize its state across various nodes using the Broadcast Channel API or a similar mechanism. Alternatively, the state could be stored in local storage. The advantage here is that since the embedded root agents are all on the same domain, data transfer between them is simplified.

By continuously spawning nodes in this network, each node would incorporate the original root agent and maintain synchronized state information. This could effectively eliminate the need for a middleman in data synchronization and communication processes.
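
A rough sketch of the synchronization idea, under the assumption that every embedded agent can reach the others over some shared transport. `AgentReplica` and the `AgentState` shape are hypothetical, and the last-writer-wins versioning is a simplification that does not resolve two replicas changing state at the same moment.

```typescript
// Hypothetical state shape: which apps are connected, plus a version for ordering.
interface AgentState {
  version: number;
  connectedApps: string[];
}

class AgentReplica {
  state: AgentState = { version: 0, connectedApps: [] };
  private broadcast: (s: AgentState) => void;

  // `broadcast` is whatever transport links the replicas (an assumption of this sketch).
  constructor(broadcast: (s: AgentState) => void) {
    this.broadcast = broadcast;
  }

  // Local change: bump the version and tell the other replicas.
  connectApp(appId: string): void {
    this.state = {
      version: this.state.version + 1,
      connectedApps: this.state.connectedApps.concat(appId),
    };
    this.broadcast(this.state);
  }

  // Remote change: adopt it only if it is newer than what this replica holds.
  receive(remote: AgentState): void {
    if (remote.version > this.state.version) this.state = remote;
  }
}

// In a browser, replicas that can see each other might be wired up like this
// (the channel name is made up):
//   const bc = new BroadcastChannel("fdc3-agent-state");
//   const replica = new AgentReplica((s) => bc.postMessage(s));
//   bc.onmessage = (e) => replica.receive(e.data);
```

The transport is injected so the same logic could sit on top of BroadcastChannel, storage events, or a relay, whichever turns out to be reachable from every embedded agent.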

@kriswest
Contributor

kriswest commented Dec 4, 2024

Hi @Davidhanson90,

> Root Application: Accessed via myrootdomain.com, this serves as the primary hub and central coordinator for all communications.

i.e. the DesktopAgent

> I'm curious about the implications of having each child application embed an iframe linked to the root application. This approach would allow each child to maintain a local version of the root agent, which remains operational even if the parent application is closed.

This is facilitated by the adopted proposal via the WCP2LoadUrl message in the Web Connection Protocol.

> Moving forward with this idea, each embedded root agent could synchronize its state across various nodes using the Broadcast Channel API or a similar mechanism. Alternatively, the state could be stored in local storage. The advantage here is that since the embedded root agents are all on the same domain, data transfer between them is simplified.

Yes, this is exactly what we wanted to achieve. However, there are issues where the child applications are not on the same domain as the iframe. In that situation, the iframes are prevented from using BroadcastChannel to reach iframes pointing at the same URL but embedded in an app from a different domain.

> Note: To be exact, communication is allowed between browsing contexts using the same storage partition. Storage is first partitioned according to top-level sites—so for example, if you have one opened page at a.com that embeds an iframe from b.com, and another page opened to b.com, then the iframe cannot communicate with the second page despite them being technically same-origin. However, if the first page is also on b.com, then the iframe can communicate with the second page.

MDN: https://developer.mozilla.org/en-US/docs/Web/API/Broadcast_Channel_API
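
The rule in that note can be written down as a tiny model. This is a simplification (real partition keys carry more detail than a site string), and `BrowsingContext` and `canShareChannel` are hypothetical names, but it reproduces the two cases MDN describes.

```typescript
interface BrowsingContext {
  topLevelSite: string; // site of the page in the address bar, e.g. "a.com"
  origin: string; // origin of the document itself, e.g. "https://b.com"
}

// Two contexts can share a BroadcastChannel only if both the origin and the
// top-level site (the storage partition) match.
function canShareChannel(x: BrowsingContext, y: BrowsingContext): boolean {
  return x.origin === y.origin && x.topLevelSite === y.topLevelSite;
}

const iframeInA = { topLevelSite: "a.com", origin: "https://b.com" };
const iframeInB = { topLevelSite: "b.com", origin: "https://b.com" };
const pageOnB = { topLevelSite: "b.com", origin: "https://b.com" };

canShareChannel(iframeInA, pageOnB); // false: same origin, different partition
canShareChannel(iframeInB, pageOnB); // true: same origin, same partition
```

Two adaptor iframes served from myrootdomain.com but embedded in childADomain.com and childBDomain.com fall into the first case, which is why they cannot see each other.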

I'm not currently aware of another way for the two iframes (pointing to the same URL) embedded in pages from different domains to communicate with each other, other than via a server of some sort outside the browser (whether running locally or remote). AFAIK the relevant HTML Specification sections (e.g. https://html.spec.whatwg.org/multipage/web-messaging.html#broadcasting-to-other-browsing-contexts) don't mention the restriction on cross-domain iframes - but it is implemented by Chrome and Firefox... This was relatively recent (last year or two) and was introduced for security reasons (such communication is probably used to coordinate attacks of some sort).

Here is the announcement/work item: https://developers.google.com/privacy-sandbox/cookies/storage-partitioning?_gl=1*115ffup*_up*MQ..*_ga*MzczNDI1ODkyLjE3MzMzMjUwNzc.*_ga_JPRHSQDH0G*MTczMzMyNTA3Ny4xLjAuMTczMzMyNTA3Ny4wLjAuMA..
They ran an origin trial on this that you could opt out of - but I think it ended in September and storage partitioning is now rolled out in current releases.

@Davidhanson90
Author

This is interesting. I am amazed this hasn't broken a whole range of applications with a change like this.

@kriswest
Contributor

kriswest commented Dec 4, 2024

> This is interesting. I am amazed this hasn't broken a whole range of applications with a change like this.

I suspect it did and the 'Origin trial' where you could fill in a form and get it disabled for your site, while you went about a redesign, was a clever bit of handling for the deprecation of unpartitioned storage...

@Roaders
Contributor

Roaders commented Dec 6, 2024

So in this case, @kriswest, I can't see any way that iframes embedded in a proxy window will be able to communicate with each other if the DA returns an iframe URL rather than a message channel. Unless they are all on the same domain, of course.

@kriswest
Contributor

kriswest commented Dec 6, 2024

@Roaders yes I believe that is the case (unless they conduct that communication over a channel outside the browser scope, e.g. over a websocket server).

However, if your DA always loads apps into iframes (which comes with its own CSP-based challenges) then the parent windows can of course communicate with each other via BroadcastChannel. If any window is able to take over as the agent, or the agent is distributed or implemented via a SharedWorker in the parent window, then it would stay alive until the last one closes. That's not a perfect solution (due to the app's CSP needing to allow it to be embedded in an iframe), but it might be a more resilient approach.
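
One way to picture the "any window can take over" variant is a shared view of which wrapper windows are open, with the oldest survivor hosting the agent. `WrapperRegistry` is a hypothetical sketch of that bookkeeping only; how windows learn about each other and how agent state is handed over are left out.

```typescript
// Hypothetical bookkeeping for same-origin wrapper windows; ids follow opening order.
class WrapperRegistry {
  private openIds: number[] = [];

  opened(id: number): void {
    this.openIds.push(id);
    this.openIds.sort((a, b) => a - b);
  }

  closed(id: number): void {
    this.openIds = this.openIds.filter((x) => x !== id);
  }

  // The oldest surviving window hosts the agent; undefined once every window has closed.
  agentHost(): number | undefined {
    return this.openIds[0];
  }
}

const registry = new WrapperRegistry();
registry.opened(1);
registry.opened(2);
registry.opened(3);
const initialHost = registry.agentHost(); // 1

registry.closed(1);
const hostAfterFirstClose = registry.agentHost(); // 2: the agent moves, interop continues

registry.closed(2);
registry.closed(3);
const hostAfterAllClosed = registry.agentHost(); // undefined: the last window has gone
```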

@kriswest
Contributor

@Davidhanson90 @Roaders should we close this issue? If you see another way forward I'd be very happy to discuss it, but based on my current understanding of the browser APIs available the existing FDC3 for Web proposal/PR supports the best available solutions.

@Davidhanson90
Author

Funnily enough, I was thinking about this over the weekend. I find this gap problematic for wide-scale browser adoption.

I haven't read too deeply into it yet, but wondered if you had seen this: https://developers.google.com/privacy-sandbox/private-advertising/shared-storage

@kriswest
Contributor

I have not read up on Shared Storage extensively yet, but based on a quick scan I don't think what we want to do fits within its proposed use cases or capabilities. One immediate problem I see is that there are no events proposed, making it hard to use for communication (rather than storage) - although they may extend the proposal later to include these (see https://github.com/WICG/shared-storage?tab=readme-ov-file#possibilities-for-extension). However, using shared storage for communication between an undefined number of apps would still be complex to achieve in a robust way. Shared storage might be useful for persisting connection details; however, we have a workable solution for that already.

I may be missing another way that this can be used to solve the issue - if so please do point me in the right direction as I'd love to find a more robust solution within the browser.

At the moment I still only see two ways to make FDC3 for the Web robust to a parent window being closed:

  1. Wrap all applications (as iframes) in a parent window. The parent windows will all be on the same origin and can therefore use communication APIs/shared workers (that we can't use cross-origin/in iframes hosted in cross-origin windows) - but this requires that the app's CSP allows it to be embedded in an iframe.
  2. Use an iframe adaptor (loaded into an iframe via WCP2LoadUrl) and a server (either local or remote - just needs to be outside the scope of the browser) to relay communications and/or act as the DA.

I/we are aware of Desktop Agents (and similar solutions) that use the above approaches.
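
Since the first approach depends on each app permitting framing, a wrapper could check an app's `Content-Security-Policy` header before trying to embed it. The sketch below is deliberately simplified: `allowsEmbeddingBy` is a hypothetical helper that only understands `'none'`, `*` and exact origin matches in the `frame-ancestors` directive, and it ignores `'self'`, scheme and wildcard-host sources as well as the separate `X-Frame-Options` header.

```typescript
// Simplified check of a Content-Security-Policy header value against a wrapper origin.
function allowsEmbeddingBy(csp: string, wrapperOrigin: string): boolean {
  for (const directive of csp.split(";")) {
    const parts = directive.trim().split(/\s+/);
    if (parts[0].toLowerCase() !== "frame-ancestors") continue;
    const sources = parts.slice(1);
    if (sources.indexOf("'none'") !== -1) return false;
    return sources.indexOf("*") !== -1 || sources.indexOf(wrapperOrigin) !== -1;
  }
  return true; // no frame-ancestors directive: this header does not restrict framing
}

const wrapper = "https://myrootdomain.com";

allowsEmbeddingBy("default-src 'self'; frame-ancestors https://myrootdomain.com", wrapper); // true
allowsEmbeddingBy("frame-ancestors 'none'", wrapper); // false
allowsEmbeddingBy("default-src 'self'", wrapper); // true
```

An app that fails this check would have to be opened in its own window, which puts it back on the iframe adaptor route in the second approach.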

@Davidhanson90
Author

OK, let's close for now and we can re-evaluate if we find alternatives.

@kriswest
Contributor

Ok - definitely keen to discuss if you or other participants come up with something else! It is unfortunate that cross-origin comms/coordination has to be so limited due to its use in coordinating attacks 😞


3 participants