Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retransmission is not specified #34

Open
karmanyaahm opened this issue Jan 6, 2023 · 8 comments
Open

Retransmission is not specified #34

karmanyaahm opened this issue Jan 6, 2023 · 8 comments

Comments

@karmanyaahm
Copy link
Member

karmanyaahm commented Jan 6, 2023

See: https://codeberg.org/iNPUTmice/up/issues/1

How are we handling retries? Should we implement RFC8030's TTL? If so, should it be a MUST not deliver after time or SHOULD not deliver after time (i.e. is TTL a required feature)?

One possible solution from @iNPUTmice:

A push server that implements storage and multiple delivery attempts SHOULD implement the TTL and Topic headers from RFC 8030. According to RFC 8030 §5.2 a push service MAY retain a push message for a shorter duration than requested; since this value can be 0 a push server that does not store messages is still a valid WebPush service. Push servers that do not implement storage MUST include a TTL: 0 header in the response.

@karmanyaahm
Copy link
Member Author

karmanyaahm commented Jan 22, 2023

Adding on to this, what does the push server tell the app server when the distributor/end-user device is temporarily gone. binwiederhier suggested 412, IIRC currently iNPUTmice uses a forced TTL: 0.

@karmanyaahm
Copy link
Member Author

507 Insufficient Storage is also great, because by not implementing/allowing caching, the push server has no storage.

@karmanyaahm
Copy link
Member Author

Branching off binwiederhier/ntfy#609 (comment):
binwiederhier:

"Insuffiencient storage" implies that it's stored on the server, and a "5xx" error implies that it's the server's fault that the request was denied. Also, 507 seems WebDAV specific.

Since there are two clients involved (sender (= traditional client) ---> ntfy server --> subscriber), it's hard to find a good HTTP code, but I'd still say it's a 4xx, no?

Maybe a 409?

409 Conflict - Indicates that the request could not be processed because of conflict in the current state of the resource, such as an [edit conflict](https://en.wikipedia.org/wiki/Edit_conflict) between multiple simultaneous updates.

Or a 412? Probably less so..

412 Precondition Failed - The server does not meet one of the preconditions that the requester put on the request header fields.

Or 422?

422 Unprocessable Entity - The request was well-formed but was unable to be followed due to semantic errors.

I think 409 and 422 work well.

me:

I agree that it's hard to find a good HTTP code, but I don't think it's a 4xx since there's nothing the app server (client) can do about this. It's the push server's "fault" (design limitation) that it doesn't know whether the subscription is valid or not, thus 5xx server error.

@binwiederhier @iNPUTmice @p1gp1g @MayeulC What do y'all think? In addition to TTL: 0?

@binwiederhier
Copy link

I have a very narrow scope for my use case (subscriber-based rate limiting in ntfy, and rejecting messages without subscribers), much narrower than the one in this ticket. So I fear my user case may not be the same as the other one.

That said, for my use case, I think I now believe HTTP 409 is the correct code:

The 409 (Conflict) status code indicates that the request could not be completed due to a conflict with the current state of the target resource. This code is used in situations where the user might be able to resolve the conflict and resubmit the request. The server SHOULD generate content that includes enough information for a user to recognize the source of the conflict.

Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the representation being PUT included changes to a resource that conflict with those made by an earlier (third-party) request, the origin server might use a 409 response to indicate that it can't complete the request. In this case, the response representation would likely contain information useful for merging the differences based on the revision history.

(emphasis mine)

In the case of ntfy, the "target resource" is the topic, and the "conflict with the current state" is that there is no subscriber connected, and we cannot forward the message.

--

As for the general "how do you handle retries" question, the HTTP spec has a Retry-After header the server can respond with that may define retries from the app server.

(It is worth noting that ntfy caches messages for 12h and that they will be transmitted to a subscriber if no-one is connected. Not sure if that solves the problem)

--

(Timeliness note: UP is currently one of the blockers to release my ntfy SaaS offering, so I am eagerly waiting for the implementation of binwiederhier/ntfy#609. I am willing to wait a week or so, but not forever :-) I am well aware that I've been slacking with the review before @karmanyaahm. My apologies for that, and for the rush. It's just been so long and I wanna finally go live.)

@iNPUTmice
Copy link

iNPUTmice commented Feb 21, 2023

My XMPP based distributor doesn’t know if the recipient is temporarily offline. Messages can be on their way for ~5 minutes.
HTTP 200 OK in WebPush doesn’t say anything about end-to-end delivery. It only means the push server took it on.
I’m responding with HTTP 200 OK + TTL: 0 when I don’t have storage (because I tried to send the message to the device.

And that’s valid (and probably expected?) according to WebPush as far as I understand the RFC.

Rate limiting or "client used up all the storage" or something like that can be handled with 429 + Retry-After or similar.

Edit: 201 rather…

@p1gp1g
Copy link
Member

p1gp1g commented Feb 21, 2023

TTL/Status code

when the push server can forward the message

If we do not use 201 + TTL: 0 then compatibility with webpush can be broken, unless the server handle other webpush spec, at least:

  • Delivery of the push message works like the end-user app acknowledging the message (DELETE /message/push_message_resource)
  • The push server must:
    • Handle message update
    • Handle message deletion
    • Check TTL header

So I see 2 solutions for UnifiedPush:

  • The push server MUST respond with 201 + TTL: 0
  • The push server MUST:
    • respond with 201 + TTL: 0
    • OR Handle message update/deletion and check TTL header (and return TTL <= request TTL )

IMO: 201 + TTL: 0 make the spec easier, and we probably reduce side effect when an application server does not use webpush, like matrix.

Rate limit

I think the 429 behavior is adapted

A push service MAY return a 429 (Too Many Requests) status code
[RFC6585] when an application server has exceeded its rate limit for
push message delivery to a push resource. The push service SHOULD
also include a Retry-After header [RFC7231] to indicate how long the
application server is requested to wait before it makes another
request to the push resource.

Other questions

I can't see 2 things in the spec:

  • What the push server must do when the app server sends a message to an unsubscribed endpoint.
  • What the behavior needed when the server app send a push message with respond-async and the push server respond with TTL: 0 ? Should it still answer with 202 ?

-> Maybe we should look at the existing webpush server what their behavior for this 2 questions.

IMO: 201 + TTL: 0 for both.


UnifiedPush Webpush
End-user app user agent
push server push service
endpoint push resources

@binwiederhier
Copy link

(Sidebar: I feel like my ntfy problem is almost entirely distinct, @karmanyaahm; maybe this is the wrong place for the discussion after all?)

@MayeulC
Copy link

MayeulC commented Feb 22, 2023

What the push server must do when the app server sends a message to an unsubscribed endpoint.

404 Not Found seems appropriate? However in ntfy's case it doesn't mean a lot, so maybe that shouldn't be mandated?

I'm a bit out-of-the-loop on WebPush, as I haven't had time to read the specs, so I apologize if my answer is useless. From what I understand from your messages here, 201 is interpreted as a "best-effort"? IMO that's the baseline we should put in the spec. The message may have been sent, no guarantees on retransmission or delivery notification. Since I haven't read the spec, I can't really comment on Retry-After or TTL, but it's probably best to put the duration the message is cached for here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants