[ML] Server-Sent Events for Inference response
Initial implementation of streaming inference responses as Server-Sent Events for the `POST /_inference` API. Bytes are requested and read from a Flow.Publisher and encoded in a ChunkedRestResponseBodyPart before being sent to the REST channel. The channel requests more bytes via the `getNextPart` API.

Encoding is done in two parts:
1. A wrapper encoding that formats the messages as a Server-Sent Event stream.
2. The existing JSON (or requested) encoding for the data payload using XContent.

Example messages:
```
event: message
data: { "completion": [{"delta": "hello, world"}] }
```
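The sketch below illustrates the pull-based flow described above using only the standard `java.util.concurrent.Flow` API: a subscriber requests one chunk at a time (mirroring how the channel asks for the next part via `getNextPart`) and wraps each already-encoded JSON payload in an SSE `event:`/`data:` frame. The class names and sample payloads are hypothetical; the actual change wires a Flow.Publisher into a ChunkedRestResponseBodyPart inside the REST layer rather than printing to stdout.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class SseStreamingSketch {

    /** Frames each JSON payload as a Server-Sent Event, then requests the next chunk. */
    static final class SseFramingSubscriber implements Flow.Subscriber<String> {
        private final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // pull one chunk at a time, like getNextPart
        }

        @Override
        public void onNext(String jsonPayload) {
            // SSE framing: an event-name line, a data line, and a blank-line terminator.
            System.out.print("event: message\ndata: " + jsonPayload + "\n\n");
            subscription.request(1); // ask for the next part only after this one is written
        }

        @Override
        public void onError(Throwable t) {
            System.out.print("event: error\ndata: {\"error\": \"" + t.getMessage() + "\"}\n\n");
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }

        void awaitCompletion() throws InterruptedException {
            done.await();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the inference service: publishes already-encoded JSON chunks.
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            SseFramingSubscriber subscriber = new SseFramingSubscriber();
            publisher.subscribe(subscriber);

            for (String delta : List.of("hello,", " world")) {
                publisher.submit("{\"completion\": [{\"delta\": \"" + delta + "\"}]}");
            }
            publisher.close(); // signals onComplete to the subscriber
            subscriber.awaitCompletion();
        }
    }
}
```

Requesting a single item per `onNext` keeps back-pressure aligned with the chunked response: the next payload is only encoded once the previous SSE frame has been flushed to the channel.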