[ML] Server-Sent Events for Inference response
Initial implementation of streaming inference responses as Server-Sent
Events for the `POST /_inference` API.

Bytes are requested and read from a Flow.Publisher and encoded in a
ChunkedRestResponseBodyPart before being sent to the REST channel. The
channel requests more bytes via the `getNextPart` API.
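
A minimal sketch of the demand-driven read described above, using only
`java.util.concurrent.Flow` from the JDK. The class names
(`PullOnDemandSketch`, `ListPublisher`, `PartReader`) and the `nextPart`
method are hypothetical stand-ins, not the classes in this commit, and
the sketch is synchronous for brevity where the real REST channel would
be asynchronous. It only illustrates that the consumer calls
`request(1)` each time it wants another part.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class PullOnDemandSketch {

    /** Hypothetical publisher: emits one chunk per unit of demand, then completes. */
    static final class ListPublisher implements Flow.Publisher<byte[]> {
        private final List<byte[]> chunks;

        ListPublisher(List<byte[]> chunks) {
            this.chunks = chunks;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super byte[]> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 0;
                private boolean done = false;

                @Override
                public void request(long n) {
                    for (long i = 0; i < n && !done; i++) {
                        if (next < chunks.size()) {
                            subscriber.onNext(chunks.get(next++));
                        }
                        if (next >= chunks.size()) {
                            done = true;
                            subscriber.onComplete();
                        }
                    }
                }

                @Override
                public void cancel() {
                    done = true;
                }
            });
        }
    }

    /** Hypothetical reader: each nextPart() call requests exactly one more item. */
    static final class PartReader implements Flow.Subscriber<byte[]> {
        private Flow.Subscription subscription;
        private byte[] latest;
        private boolean finished;

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
        @Override public void onNext(byte[] item) { latest = item; }
        @Override public void onError(Throwable t) { finished = true; }
        @Override public void onComplete() { finished = true; }

        /** Returns the next chunk as text, or null once the stream has ended. */
        String nextPart() {
            if (finished) {
                return null;
            }
            latest = null;
            subscription.request(1); // synchronous in this sketch only
            return latest == null ? null : new String(latest, StandardCharsets.UTF_8);
        }
    }

    /** Pulls every part, one request at a time, and collects them in order. */
    static List<String> drain(List<String> chunks) {
        List<byte[]> bytes = new ArrayList<>();
        for (String c : chunks) {
            bytes.add(c.getBytes(StandardCharsets.UTF_8));
        }
        PartReader reader = new PartReader();
        new ListPublisher(bytes).subscribe(reader);
        List<String> parts = new ArrayList<>();
        for (String p = reader.nextPart(); p != null; p = reader.nextPart()) {
            parts.add(p);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(drain(List.of("hello", ", world")));
    }
}
```

Requesting one item per call keeps at most one chunk buffered at a
time, which is the point of letting the channel drive demand.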

Encoding is done in two parts:
1. A wrapper encoding to format the messages as a Server-Sent Event
   stream.
2. The existing JSON (or requested) encoding for the data payload using
   XContent.

Example messages:
```
event: message
data: { "completion": [{"delta": "hello, world"}] }

```
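
The wrapper encoding (part 1) can be sketched as below. This is an
illustration of the Server-Sent Events wire format, not the code in
this commit: `ServerSentEventSketch`, `format`, and `encode` are
hypothetical names, and the JSON payload is a hand-written string
standing in for XContent output. Each event is an `event:` line, one
`data:` line per payload line, and a blank line as terminator, encoded
as UTF-8.

```java
import java.nio.charset.StandardCharsets;

final class ServerSentEventSketch {

    /** Formats one event; a multi-line payload becomes one data: field per line. */
    static String format(String event, String data) {
        StringBuilder sb = new StringBuilder();
        sb.append("event: ").append(event).append('\n');
        for (String line : data.split("\\r\\n|\\r|\\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        sb.append('\n'); // blank line ends the event
        return sb.toString();
    }

    /** Event streams are always UTF-8 on the wire. */
    static byte[] encode(String event, String data) {
        return format(event, data).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Payload string is an assumption; the commit produces it via XContent.
        String json = "{ \"completion\": [{\"delta\": \"hello, world\"}] }";
        System.out.print(format("message", json));
    }
}
```

Splitting the payload on line breaks matters because a raw newline
inside a `data:` field would otherwise end the field early and corrupt
the event.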
prwhelan committed Sep 5, 2024
1 parent c805f90 commit 428b53d
Showing 2 changed files with 901 additions and 0 deletions.