From 8606e859b7422f574cdb1b4a8ec5fdf3bbec23e8 Mon Sep 17 00:00:00 2001
From: Ming-Ying Chung
Date: Mon, 29 Aug 2022 17:15:15 +0900
Subject: [PATCH 1/3] Clean up main branch

* Removed `index.html` from `main`. Now it's auto-generated from `index.bs` and put into `gh-pages` branch. Available at https://wicg.github.io/unload-beacon/
* Fixed broken super-linter badge.
Related: #22
---
 README.md | 2 +-
 index.html | 1884 ----------------------------------------------------
 2 files changed, 1 insertion(+), 1885 deletions(-)
 delete mode 100644 index.html

diff --git a/README.md b/README.md
index 0709476..c39f48b 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 Authors: [Darren Willis](https://github.com/darrenw), [Fergal Daly](https://github.com/fergald), [Ming-Ying Chung](https://github.com/mingyc) - Google
-[![Super-Linter](https://github.com/WICG/unload-beacon/workflows/linter.yml/badge.svg)](https://github.com/WICG/unload-beacon/actions/workflows/super-linter.yml)
+[![Super-Linter](https://github.com/WICG/unload-beacon/workflows/Lint%20Code%20Base/badge.svg)](https://github.com/WICG/unload-beacon/actions/workflows/super-linter.yml)
 [![Spec Prod](https://github.com/WICG/unload-beacon/actions/workflows/auto-publish.yml/badge.svg)](https://github.com/WICG/unload-beacon/actions/workflows/auto-publish.yml)
diff --git a/index.html b/index.html
deleted file mode 100644
index ba2a807..0000000
--- a/index.html
+++ /dev/null
@@ -1,1884 +0,0 @@
- - - - - Page Unload Beacon - - - - - - - - - - - - - - - - - -
-

-

Page Unload Beacon

-

Unofficial Proposal Draft,

-
- More details about this document -
-
-
This version: -
http://wicg.github.io/unload-beacon/ -
Issue Tracking: -
GitHub -
Inline In Spec -
Editor: -
Ian Clelland (Google) -
-
-
-
- -
-
-
-

Abstract

-

This document introduces an API for registering data to be sent to a predetermined server - - at the point that a page is unloaded.

-
-

Status of this document

-
-

This section describes the status of this document at the time of its publication. - A list of current W3C publications - and the latest revision of this technical report - can be found in the W3C technical reports index at https://www.w3.org/TR/.

-

GitHub Issues are preferred for discussion of this specification.

-

This document is governed by the 2 November 2021 W3C Process Document.

-

This document was produced by a group operating under the W3C Patent Policy. - W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; - that page also includes instructions for disclosing a patent. - An individual who has actual knowledge of a patent which the individual believes - contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

-

-
-
- -
-

1. Introduction

-

This is an introduction.

-

This introduction needs to be more of an introduction.

-

2. Pending Beacon Framework

-

2.1. Concepts

-

A pending beacon represents a piece of data which has been -registered with the user agent for later sending to an origin server.

-

A pending beacon has a url, which is a URL.

-

A pending beacon has a method, which is a -string, which is initially "POST".

-

A pending beacon has a foreground timeout, which is -either null or an integer, and which is initially null.

-

A pending beacon has a background timeout, which is -either null or an integer, and which is initially null.

-

A pending beacon has an is_pending flag, -which is a boolean, which is initially true.

-

A pending beacon has a payload, which is a byte sequence. It is initially empty.

-

A Document has a pending beacon set, which is an ordered set of pending beacons.

-

Add worker beacons as well?

-

Note: In this spec, the pending beacon set is associated with a Document. -In an actual implementation, this set will likely need to be stored in the user -agent, separate from the document itself, in order to be able to send beacons -when the document is destroyed (either by being unloaded, or because of a crash).

-

Define these to be part of the user agent formally.

-

2.2. Updating beacons

-
- To set the url of a pending beacon beacon to a URL url: -
    -
  1. -

    If beacon’s is_pending is false, return false.

    -
  2. -

    If url is not a valid URL, return false.

    -
  3. -

    If url is not a potentially trustworthy URL, return false.

    -
  4. -

    Set beacon’s url to url.

    -
  5. -

    Return true.

    -
-
-
- To set the foreground timeout of a pending beacon beacon to an integer timeout: -
    -
  1. -

    If beacon’s is_pending is false, return false.

    -
  2. -

    If timeout is negative, return false.

    -
  3. -

    Set beacon’s foreground timeout to timeout.

    -
  4. -

    Return true.

    -
-

This algorithm should also synchronously set or clear a timer to send the beacon.

-
-
- To set the background timeout of a pending beacon beacon to an integer timeout: -
    -
  1. -

    If beacon’s is_pending is false, return false.

    -
  2. -

    If timeout is negative, return false.

    -
  3. -

    Set beacon’s background timeout to timeout.

    -
  4. -

    Return true.

    -
-
-
- To set the payload of a pending beacon beacon to a byte sequence payload: -
    -
  1. -

    If beacon’s is_pending is false, return false.

    -
  2. -

    Set beacon’s payload to payload.

    -
  3. -

    Return true.

    -
-
-
- To cancel a pending beacon beacon, set beacon’s is_pending to false. -

Note: Once canceled, a pending beacon's payload will no longer be used, - and it is safe for a user agent to discard that, and to cancel any associated - timers. However, other attributes may still be read, and so this algorithm - does not destroy the beacon itself.

-
-
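The update algorithms above can be read as guarded setters on a small record. The following TypeScript is a minimal sketch under stated assumptions, not part of the proposal: `PendingBeaconRecord`, `createBeacon`, and the function names are hypothetical, and the `https:` check is a deliberate simplification of "potentially trustworthy URL".

```typescript
// Hypothetical in-memory model of a pending beacon; names are illustrative only.
interface PendingBeaconRecord {
  url: URL | null;
  method: "GET" | "POST";
  foregroundTimeout: number | null;
  backgroundTimeout: number | null;
  isPending: boolean;
  payload: Uint8Array;
}

// Initial values follow section 2.1: method "POST", null timeouts, pending, empty payload.
function createBeacon(): PendingBeaconRecord {
  return {
    url: null,
    method: "POST",
    foregroundTimeout: null,
    backgroundTimeout: null,
    isPending: true,
    payload: new Uint8Array(0),
  };
}

// Returns null instead of throwing when the string is not a valid absolute URL.
function parseUrlOrNull(input: string): URL | null {
  try {
    return new URL(input);
  } catch {
    return null;
  }
}

function setBeaconUrl(beacon: PendingBeaconRecord, url: string): boolean {
  if (!beacon.isPending) return false;
  const parsed = parseUrlOrNull(url);
  if (parsed === null) return false;
  // Assumption: only https: counts as trustworthy in this sketch.
  if (parsed.protocol !== "https:") return false;
  beacon.url = parsed;
  return true;
}

function setForegroundTimeout(beacon: PendingBeaconRecord, timeout: number): boolean {
  if (!beacon.isPending) return false;
  if (timeout < 0) return false;
  beacon.foregroundTimeout = timeout;
  return true;
}

function setBackgroundTimeout(beacon: PendingBeaconRecord, timeout: number): boolean {
  if (!beacon.isPending) return false;
  if (timeout < 0) return false;
  beacon.backgroundTimeout = timeout;
  return true;
}

function setBeaconPayload(beacon: PendingBeaconRecord, payload: Uint8Array): boolean {
  if (!beacon.isPending) return false;
  beacon.payload = payload;
  return true;
}

// Cancelling only clears the flag; the other fields stay readable.
function cancelBeacon(beacon: PendingBeaconRecord): void {
  beacon.isPending = false;
}
```

Every setter checks `isPending` first, so a cancelled or already-sent beacon rejects all further updates by returning false.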

2.3. Sending beacons

-

Note: This is written as though Fetch were used as the underlying mechanism. -However, since these are sent out-of-band, an implementation might not use the -actual web-exposed Fetch API, and may instead use the underlying HTTP primitives -directly.

-
- To send a document’s beacons, given a Document document, run these steps: -
    -
  1. -

    For each pending beacon beacon in document’s pending beacon set,

    -
      -
    1. -

      Call send a queued pending beacon with beacon.

      -
    -
-
-
- To send a queued pending beacon beacon, run these steps: -
    -
  1. -

    If beacon’s is_pending flag is false, then return.

    -
  2. -

    Set beacon’s is_pending flag to false.

    -
  3. -

    Check permission.

    -
  4. -

    If beacon’s method is "GET", then call send a pending beacon over GET with beacon.

    -
  5. -

    Else call send a pending beacon over POST with beacon.

    -
-

"Check permission" is not defined. A specific permission should be used -here, and this should integrate with the permissions API.

-
-
- To send a pending beacon over GET, given a pending beacon beacon: -
    -
  1. -

    Let pairs be the list « ("data", beacon’s payload) ».

    -
  2. -

    Let query be the result of running the urlencoded serializer with pairs.

    -
  3. -

    Let url be a clone of beacon’s url.

    -
  4. -

    Set url’s query component to query.

    -
  5. -

    Let req be a new request initialized as follows:

    -
    -
    method -
    -

    GET

    -
    client -
    -

    The entry settings object

    -
    url -
    -

    url

    -
    credentials mode -
    -

    same-origin

    -
    -
  6. -

    Fetch req.

    -
-
-
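The URL construction in the GET steps above can be sketched as follows. This is an illustrative sketch only: `buildGetBeaconUrl` is a hypothetical helper, and it assumes the payload has already been decoded to a string, whereas the spec text models the payload as a byte sequence.

```typescript
// Hypothetical helper mirroring steps 1 to 4 of "send a pending beacon over GET".
function buildGetBeaconUrl(beaconUrl: string, payload: string): URL {
  // Step 1: the single name/value pair ("data", payload).
  const pairs: [string, string][] = [["data", payload]];
  // Step 2: application/x-www-form-urlencoded serialization.
  const query = new URLSearchParams(pairs).toString();
  // Step 3: clone the beacon's URL so the stored one is not mutated.
  const url = new URL(beaconUrl);
  // Step 4: replace any existing query with the serialized pairs.
  url.search = query;
  return url;
}
```

Note that step 4 replaces the query rather than appending to it, so any query string already present on the beacon's URL is dropped in this reading of the steps.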
- To send a pending beacon over POST, given a pending beacon beacon: -
    -
  1. -

    Let transmittedData be the result of serializing beacon’s payload.

    -
  2. -

    Let req be a new request initialized as follows:

    -
    -
    method -
    -

    POST

    -
    client -
    -

    The entry settings object

    -
    url -
    -

    beacon’s url

    -
    header list -
    -

    headerList

    -
    origin -
    -

    The entry settings object’s origin

    -
    keep-alive flag -
    -

    true

    -
    body -
    -

    transmittedData

    -
    mode -
    -

    cors

    -
    credentials mode -
    -

    same-origin

    -
    -
      -
    1. -

      Fetch req.

      -
    -
-

headerList is not defined.

-
-

3. Integration with HTML

-

Note: The following sections modify the [HTML] standard to enable sending of -beacons automatically by the user agent. These should be removed from this spec -as appropriate changes are made to [HTML].

-

When a document with a non-empty pending beacon set is to be discarded, send the document’s pending beacons.

-

"discarded" is not well defined.

-

When a process hosting a document with a non-empty pending beacon set crashes, send the document’s pending beacons.

-

The concepts of "process" and "crashes" are not well defined.

-
- When a Document document is to become hidden (visibility state change), run these steps: -
    -
  1. -

    For each pending beacon beacon in document’s pending beacon set,

    -
  2. -

    Let timeout be beacon’s background timeout.

    -
  3. -

    If timeout is not null, start a timer to run a task in timeout ms.

    -

    Note: The user agent may choose to coalesce multiple timers in order to send -multiple beacons at the same time.

    -
  4. -

    When the timer expires, call send a queued pending beacon with beacon.

    -

    Note: The pending beacons may have been sent before this time, in cases -where the document is unloaded, or its hosting process crashes before the -timer fires. In that case, if the user agent still reaches this step, then -the beacons will not be sent again, as their is_pending flag will be false.

    -
-

"visibility state change" should be more specific here, and should refer -to specific steps in either [PAGE-VISIBILITY] or [HTML]

-

This should also disable any foreground timers for the document’s beacons, -and there should be a step to reinstate them if the document becomes visible -again before they are sent.

-
-
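The hidden-document steps above can be sketched as one timer per beacon. This is a rough sketch under assumptions: it reads steps 2 to 4 as the body of the "for each" loop, folds the is_pending check of "send a queued pending beacon" into the timer callback, and all names (`HiddenBeacon`, `onDocumentHidden`, the `schedule` parameter) are hypothetical.

```typescript
// Minimal shape needed for this sketch; not the full pending beacon.
interface HiddenBeacon {
  backgroundTimeout: number | null;
  isPending: boolean;
}

// Injected scheduler so the behaviour can be exercised without real timers.
type Schedule = (task: () => void, delayMs: number) => void;

function onDocumentHidden(
  beacons: HiddenBeacon[],
  send: (beacon: HiddenBeacon) => void,
  schedule: Schedule = (task, delayMs) => {
    setTimeout(task, delayMs);
  },
): void {
  for (const beacon of beacons) {
    const timeout = beacon.backgroundTimeout;
    // A null background timeout means no timer is started for this beacon.
    if (timeout === null) continue;
    schedule(() => {
      // The beacon may already have been sent or cancelled; never send twice.
      if (!beacon.isPending) return;
      beacon.isPending = false;
      send(beacon);
    }, timeout);
  }
}
```

Clearing `isPending` before sending is what makes a late-firing timer harmless, which matches the note that beacons already sent at unload or crash time are not sent again.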

4. The PendingBeacon interface

-
enum BeaconMethod {
-    "POST",
-    "GET"
-};
-
-dictionary PendingBeaconOptions {
-    unsigned long timeout;
-    unsigned long backgroundTimeout;
-};
-
-[Exposed=(Window, Worker)]
-interface PendingBeacon {
-    readonly attribute USVString url;
-    readonly attribute BeaconMethod method;
-    attribute unsigned long timeout;
-    attribute unsigned long backgroundTimeout;
-    readonly attribute boolean pending;
-
-    undefined deactivate();
-    undefined sendNow();
-};
-
-[Exposed=(Window, Worker)]
-interface PendingGetBeacon : PendingBeacon {
-    constructor(USVString url, optional PendingBeaconOptions options = {});
-
-    undefined setUrl(USVString url);
-};
-
-[Exposed=(Window, Worker)]
-interface PendingPostBeacon : PendingBeacon {
-    constructor(USVString url, optional PendingBeaconOptions options = {});
-
-    undefined setData(object data);
-};
-
-

A PendingBeacon object has an associated beacon, which is a pending beacon.

-
- The new PendingGetBeacon(url, options) constructor steps are: -
    -
  1. -

    Let beacon be a new pending beacon.

    -
  2. -

    Set this's beacon to beacon.

    -
  3. -

    Call the common beacon initialization steps with this, "GET", url and options.

    -
  4. -

    Insert beacon into the user agent’s pending beacon set.

    -
-
-
- The new PendingPostBeacon(url, options) constructor steps are: -
    -
  1. -

    Let beacon be a new pending beacon.

    -
  2. -

    Set this's beacon to beacon.

    -
  3. -

    Call the common beacon initialization steps with this, "POST", url and options.

    -
  4. -

    Insert beacon into the user agent’s pending beacon set.

    -
-
-
- The common beacon initialization steps, given a PendingBeacon pendingBeacon, a string method, a USVString url, and a PendingBeaconOptions options, are: -
    -
  1. -

    Let beacon be pendingBeacon’s beacon.

    -
  2. -

    If url is not a valid URL string, throw a TypeError.

    -
  3. -

    Let base be the entry settings object’s API base URL.

    -
  4. -

    Let parsedUrl be the result of running the URL parser on url and base.

    -
  5. -

    If parsedUrl is failure, throw a TypeError.

    -
  6. -

    If the result of setting beacon’s url to parsedUrl is false, throw a TypeError.

    -
  7. -

    Set beacon’s method to method.

    -
  8. -

    If options has a timeout member, then set pendingBeacon’s timeout to options’s timeout.

    -
  9. -

    If options has a backgroundTimeout member, then set pendingBeacon’s backgroundTimeout to options’s backgroundTimeout.

    -
-
-
The url getter steps are to return this's beacon's url.
-
The method getter steps are to return this's beacon's method.
-
The timeout getter steps are to return this's beacon's foreground timeout.
-
- The timeout setter steps are: -
    -
  1. -

    Let beacon be this's beacon.

    -
  2. -

    If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

    -
  3. -

    Let timeout be the argument to the setter.

    -
  4. -

    If timeout is not a non-negative integer, throw a TypeError.

    -
  5. -

    If the result of setting beacon’s foreground timeout to timeout is false, throw a TypeError.

    -
-
-
The backgroundTimeout getter steps are to return this's beacon's background timeout.
-
- The backgroundTimeout setter steps are: -
    -
  1. -

    Let beacon be this's beacon.

    -
  2. -

    If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

    -
  3. -

    Let timeout be the argument to the setter.

    -
  4. -

    If timeout is not a non-negative integer, throw a TypeError.

    -
  5. -

    If the result of setting beacon’s background timeout to timeout is false, throw a TypeError.

    -
-
-
The pending getter steps are to return this's beacon's is_pending flag.
-
- The deactivate() steps are: -
    -
  1. -

    Let beacon be this's beacon.

    -
  2. -

    If beacon’s is_pending is not true, throw an "InvalidStateError" DOMException.

    -
  3. -

    Cancel beacon.

    -
-
-
- The sendNow() steps are: -
    -
  1. -

    Let beacon be this's beacon.

    -
  2. -

    If beacon’s is_pending is not true, throw an "InvalidStateError" DOMException.

    -
  3. -

    Call send a queued pending beacon with beacon.

    -
-
-
- The setUrl(url) steps are: -
    -
  1. -

    Let beacon be this's beacon.

    -
  2. -

    If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

    -
  3. -

    If url is not a valid URL string, throw a TypeError.

    -
  4. -

    Let base be the entry settings object’s API base URL.

    -
  5. -

    Let parsedUrl be the result of running the URL parser on url and base.

    -
  6. -

    If parsedUrl is failure, throw a TypeError.

    -
  7. -

    If the result of setting beacon’s url to parsedUrl is false, throw a TypeError.

    -
-
-
- The setData(data) steps are: -
    -
  1. -

    Let beacon be this's beacon.

    -
  2. -

    If beacon’s is_pending is not true, throw a "NoModificationAllowedError" DOMException.

    -
  3. -

    Let (body, contentType) be the result of extracting a body with type from data with keepalive set to true.

    -
  4. -

    Let bytes be the byte sequence obtained by reading body’s stream.

    -
  5. -

    If the result of setting beacon’s payload to bytes is false, throw a TypeError.

    -
-
-
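The error behaviour of the interface steps above can be illustrated with a small mock. This is a sketch under assumptions rather than an implementation of the proposal: `SketchPendingBeacon` is hypothetical, `BeaconStateError` is a stand-in for `DOMException` so the code also runs outside a browser, and no network request is made.

```typescript
// Stand-in for DOMException (assumption); carries the error name used by the spec text.
class BeaconStateError extends Error {
  constructor(name: string, message: string) {
    super(message);
    this.name = name;
  }
}

// Hypothetical mock covering only `pending`, `timeout`, deactivate() and sendNow().
class SketchPendingBeacon {
  private isPending = true;
  private foregroundTimeout: number | null = null;

  get pending(): boolean {
    return this.isPending;
  }

  get timeout(): number | null {
    return this.foregroundTimeout;
  }

  set timeout(value: number | null) {
    // Setter step 2: a beacon that is no longer pending cannot be modified.
    if (!this.isPending) {
      throw new BeaconStateError("NoModificationAllowedError", "beacon is no longer pending");
    }
    // Setter step 4: only non-negative integers are accepted.
    if (value === null || !isFinite(value) || Math.floor(value) !== value || value < 0) {
      throw new TypeError("timeout must be a non-negative integer");
    }
    this.foregroundTimeout = value;
  }

  deactivate(): void {
    if (!this.isPending) {
      throw new BeaconStateError("InvalidStateError", "beacon is no longer pending");
    }
    this.isPending = false;
  }

  sendNow(): void {
    if (!this.isPending) {
      throw new BeaconStateError("InvalidStateError", "beacon is no longer pending");
    }
    // A real user agent would dispatch the request here; the mock only flips the flag.
    this.isPending = false;
  }
}
```

The mock makes the asymmetry in the steps visible: setters on a non-pending beacon raise "NoModificationAllowedError", while `deactivate()` and `sendNow()` raise "InvalidStateError".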

5. Privacy

-

This section is woefully incomplete. These all need to be fleshed out in -enough detail to accurately describe the privacy issues and suggested or -prescribed mitigations.

- -
-
-

Conformance

-

Document conventions

-

Conformance requirements are expressed - with a combination of descriptive assertions - and RFC 2119 terminology. - The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” - in the normative parts of this document - are to be interpreted as described in RFC 2119. - However, for readability, - these words do not appear in all uppercase letters in this specification.

-

All of the text of this specification is normative - except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

-

Examples in this specification are introduced with the words “for example” - or are set apart from the normative text - with class="example", - like this:

-
- -

This is an example of an informative example.

-
-

Informative notes begin with the word “Note” - and are set apart from the normative text - with class="note", - like this:

-

Note, this is an informative note.

-

Conformant Algorithms

-

Requirements phrased in the imperative as part of algorithms - (such as "strip any leading space characters" - or "return false and abort these steps") - are to be interpreted with the meaning of the key word - ("must", "should", "may", etc) - used in introducing the algorithm.

-

Conformance requirements phrased as algorithms or specific steps - can be implemented in any manner, - so long as the end result is equivalent. - In particular, the algorithms defined in this specification - are intended to be easy to understand - and are not intended to be performant. - Implementers are encouraged to optimize.

-
- -

Index

-

Terms defined by this specification

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Terms defined by reference

- -

References

-

Normative References

-
-
[DOM] -
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/ -
[FETCH] -
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/ -
[HTML] -
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/ -
[INFRA] -
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/ -
[RFC2119] -
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119 -
[SECURE-CONTEXTS] -
Mike West. Secure Contexts. 18 September 2021. CR. URL: https://www.w3.org/TR/secure-contexts/ -
[URL] -
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/ -
[WEBIDL] -
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/ -
-

Informative References

-
-
[PAGE-VISIBILITY] -
Jatinder Mann; Arvind Jain. Page Visibility (Second Edition). 29 October 2013. REC. URL: https://www.w3.org/TR/page-visibility/ -
-

IDL Index

-
enum BeaconMethod {
-    "POST",
-    "GET"
-};
-
-dictionary PendingBeaconOptions {
-    unsigned long timeout;
-    unsigned long backgroundTimeout;
-};
-
-[Exposed=(Window, Worker)]
-interface PendingBeacon {
-    readonly attribute USVString url;
-    readonly attribute BeaconMethod method;
-    attribute unsigned long timeout;
-    attribute unsigned long backgroundTimeout;
-    readonly attribute boolean pending;
-
-    undefined deactivate();
-    undefined sendNow();
-};
-
-[Exposed=(Window, Worker)]
-interface PendingGetBeacon : PendingBeacon {
-    constructor(USVString url, optional PendingBeaconOptions options = {});
-
-    undefined setUrl(USVString url);
-};
-
-[Exposed=(Window, Worker)]
-interface PendingPostBeacon : PendingBeacon {
-    constructor(USVString url, optional PendingBeaconOptions options = {});
-
-    undefined setData(object data);
-};
-
-
-

Issues Index

-
-
This introduction needs to be more of an introduction.
-
Add worker beacons as well?
-
Define these to be part of the user agent formally.
-
This algorithm should also synchronously set or clear a timer to send the beacon.
-
"Check permission" is not defined. A specific permission should be used -here, and this should integrate with the permissions API.
-
headerList is not defined.
-
"discarded" is not well defined.
-
The concepts of "process" and "crashes" are not well defined.
-
"visibility state change" should be more specific here, and should refer -to specific steps in either [PAGE-VISIBILITY] or [HTML]
-
This should also disable any foreground timers for the document’s beacons, -and there should be a step to reinstate them if the document becomes visible -again before they are sent.
-
This section is woefully incomplete. These all need to be fleshed out in -enough detail to accurately describe the privacy issues and suggested or -prescribed mitigations.
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
\ No newline at end of file
From ebe91cc9f4d4c527754c98b82d1916b45ee1cf9d Mon Sep 17 00:00:00 2001
From: Ming-Ying Chung
Date: Mon, 5 Sep 2022 17:59:54 +0900
Subject: [PATCH 2/3] Update Privacy section to clarify why beacon needs to be stopped when network changes

Specifically, beacons should not leak information that it should not know, which only happens when a page goes into bfcache
Fix #28
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index c39f48b..e978282 100644
--- a/README.md
+++ b/README.md
@@ -305,8 +305,8 @@ special support may be needed to allow extension authors to block the sending (o
 Specifically, beacons will have the following privacy requirements:
 
 * Beacons must be sent over HTTPS.
-* Beacons are only sent over the same network that was active when the beacon was registered
-  (e.g. if the user goes offline and moves to a new network, discard pending beacons).
+* Beacons must not leak information that a page should not know when it is in bfcache,
+  e.g. if network changes after a page goes into bfcache, the beacon should not be sent; if the page then goes out of bfcache, the beacon can be sent.
 * Delete pending beacons for a site if a user clears site data.
 * Beacons registered in an incognito session do not persist to disk.
 * Follow third-party cookie rules for beacons.
From 5d0e7c918f7dbc0ad81bd1297a5d037db79421e0 Mon Sep 17 00:00:00 2001
From: Ming-Ying Chung
Date: Wed, 14 Sep 2022 13:51:23 +0900
Subject: [PATCH 3/3] Addressed comments and update other items in privacy section

---
 README.md | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index e978282..14de45f 100644
--- a/README.md
+++ b/README.md
@@ -304,16 +304,26 @@ special support may be needed to allow extension authors to block the sending (o
 
 Specifically, beacons will have the following privacy requirements:
 
-* Beacons must be sent over HTTPS.
-* Beacons must not leak information that a page should not know when it is in bfcache,
-  e.g. if network changes after a page goes into bfcache, the beacon should not be sent; if the page then goes out of bfcache, the beacon can be sent.
-* Delete pending beacons for a site if a user clears site data.
-* Beacons registered in an incognito session do not persist to disk.
 * Follow third-party cookie rules for beacons.
 * Post-unload beacons are not sent if background sync is disabled for a site.
-* If a page is suspended (for instance, as part of a [bfcache](https://web.dev/bfcache/)),
-  beacons should be sent within 10 minutes or less of suspension,
+* [#30] Beacons must not leak navigation history to the network provider that it should not know.
+  * If network changes after a page is navigated away, i.e. put into bfcache, the beacon should not be sent through the new network;
+    If the page is then restored from bfcache, the beacon can be sent.
+  * If this is difficult to achieve, consider just force sending out all beacons on navigating away.
+* [#27]\[TBD\] Beacons must be sent over HTTPS.
+* [#34]\[TBD\] Crash Recovery related (if implemented):
+  * Delete pending beacons for a site if a user clears site data.
+  * Beacons registered in an incognito session do not persist to disk.
+* [#3] If a page is suspended (for instance, as part of a [bfcache]),
+  beacons should be sent within 30 minutes or less of suspension,
   to keep the beacon send temporally close to the user's page visit.
+  Note that beacons lifetime is also capped by the browser's bfcache implementation.
+
+[#3]: https://github.com/WICG/unload-beacon/issues/3
+[#27]: https://github.com/WICG/unload-beacon/issues/27
+[#30]: https://github.com/WICG/unload-beacon/issues/30
+[#34]: https://github.com/WICG/unload-beacon/issues/34
+[bfcache]: https://web.dev/bfcache/
 
 ## Alternatives Considered