diff --git a/docs/content/3.middleware/2.guides/9.kubernetes-probes.md b/docs/content/3.middleware/2.guides/9.kubernetes-probes.md new file mode 100644 index 0000000000..d7285d4f9d --- /dev/null +++ b/docs/content/3.middleware/2.guides/9.kubernetes-probes.md @@ -0,0 +1,130 @@ +# Kubernetes Probes + +Alokai Cloud customers' middleware and frontend apps are deployed in Kubernetes. This document explains the Kubernetes mechanisms used by Alokai Cloud to ensure that customers' applications are starting, running and exiting correctly. Implementing your application so that it works in accordance with those mechanisms ensures better detectability of issues and lower error rates of your deployment. + +In order for some of those mechanisms to work, you - as the application developer - need to ensure certain REST endpoints exist in your application and that they respond with the correct status code. The sections below explain how to implement those endpoints. + +## Liveness probes + +### The purpose of liveness probes + +Without any kind of probes set up for your application, the only failure scenario where your app will be considered as not working correctly is if the process exits on startup or any time during the application runtime. For example, if you run `npm start` but the application secrets are missing from the environment, most applications' processes will immediately exit. The operating system could also kill the process due to lack of memory. + +Liveness probes can help you recover the application from a broken state when only restart can solve the problem. The application gets probed every few seconds to determine if the server can respond. If a timeout or error response is received instead of a success status code, the app is considered to be in a dead state (e.g. caught in an infinite loop due to an edge-case and developer error) and gets restarted. + +This helps your application keep handling traffic despite an issue that causes it to lock, which gives you time to fix the underlying issue without impacting company operations (as much). + +### Implementation of liveness probes in your application + +In order for your application to be covered by liveness probes, you need to ensure that it responds with a `HTTP 200 OK` status code to a GET request on a `[your app URL]/healthz` endpoint. + +In the case of the Alokai middleware - as of version 3.0.0 of the `@vue-storefront/middleware` package, a liveness probe is enabled by default. You can launch the middleware locally and send a GET request to `http://localhost:4000/healthz` endpoint. The response will be a HTTP 200 OK containing the body "`ok`". + +In the case of the frontend apps, `/healthz` endpoints are also present in Nuxt and Next templates generated from the Alokai CLI. If you will be deploying your own custom fronted app to Alokai Cloud, and your app is not generated from Alokai CLI, you will need to add the `/healthz` endpoint manually. + + + +::danger +The below instructions help you create a simple `/healthz` endpoint that responds with a HTTP 200 OK status code and the text `ok` in the response body. You may be tempted to instead make `/healthz` a real application route in your app, which if queried will respond with the HTML of your actual application. At first glance it may seem more robust, as in theory it tests a larger part of your application stack. +
+In reality, requesting a full app route as part of a liveness check - especially during periods of heavy traffic - can lead to new connection being opened that will never be resolved. A full page app can take hundreds of miliseconds to respond and contain a few kilobytes of payload. The simple route described below will respond in a few miliseconds with a two byte body size. +
+Do not make `/healthz` or `/readyz` a full app route. Instead keep it as simple as possible, by following the instructions below. Do not consider liveness probes as a fully fledged application smoke test, but as a simple check if the app can serve a basic HTTP request. +:: + +#### Nuxt + +In Nuxt, to create a `/healthz` endpoint, use [server endpoints](https://nuxt.com/docs/getting-started/server#server-endpoints-middleware). Create a `[Nuxt fronted app folder]/server/routes/healthz.ts` file with the following contents: +```ts[server/routes/healthz.ts] +export default defineEventHandler(() => 'ok'); +``` + + +#### Next: App router +If using Next.js with app router, use [route handlers](https://nextjs.org/docs/app/api-reference/file-conventions/route): +::steps +#step-1 +Create a `[Next app directory]/app/healthz/route.ts` file + +#step-2 +Paste the following content inside +```ts[app/healthz/route.ts] +import { NextResponse } from 'next/server'; + +export function GET() { + return NextResponse.json({ status: 'ok' }, { status: 200 }); +} +``` +:: + +#### Next: Pages router +If using Next.js with pages router, use [rewrites](https://nextjs.org/docs/pages/api-reference/next-config-js/rewrites) together with [API routes](https://nextjs.org/docs/pages/building-your-application/routing/api-routes): + +::steps +#step-1 +Add the following code to your `next.config.js`: +```ts{3-10}[next.config.js] +module.exports = { + // ... + async rewrites() { + return [ + { + source: '/healthz', + destination: '/api/healthz', + }, + ]; + }, +} +``` +This is necessary because by default Next's API routes have an `/api` subpath, but we need it to be `/healthz` and not `/api/healthz`. + +#step-2 +Create a `[Next app directory]/pages/api/healthz.ts` file + +#step-3 +Paste the following content inside: +```ts[pages/api/healthz.ts] +import { NextApiRequest, NextApiResponse } from 'next'; + +export default function handler(_req: NextApiRequest, res: NextApiResponse) { + res.status(200).send('ok'); +} +``` +:: + +## Readiness probes + +### The purpose of readiness probes + +Applications in Kubernetes are often hosted in such a way that multiple duplicate instances of an application exist side-by-side simultaneously. Such a duplicate instance is called a replica. Readiness probes allow an application replica to temporarily mark itself as unable to serve requests in a Kubernetes cluster. A *liveness* probe can pass while a *readiness* probe fails - meaning that in general, the application is up, but is still waiting for something to happen so that it can serve requests (e.g. waiting for some secondary, dependent service to become online, like Redis cache). + +The `/readyz` endpoint of your application is queried automatically by Alokai Cloud every few seconds to check whether requests should be routed to the queried application replica. One such case - where traffic will should stop directed to a replica - is if an application instance is being killed (if it receives a `SIGTERM` signal). + +You can read more about Kubernetes readiness probes in the [official documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes). + +### Built-in middleware readiness probes + +As of version 5.0.0 of the `@vue-storefront/middleware` package, you can launch the middleware locally and send a GET request to the `http://localhost:4000/readyz` endpoint. The response will contain either a success message or a list of errors describing why the readiness probe failed. + +To add custom readiness probes to the built-in `@vue-storefront/middleware` readiness probe feature, pass them to the `readinessProbes` property when calling `createServer`. + +```ts +const customReadinessProbe = async () => { + const dependentServiceRunning = await axios.get('http://someservice:3000/healthz'); + if(dependentServiceRunning.status !== 200) { + throw new Error('Service that the middleware depends on is offline. The middleware is temporarily not ready to accept connections.') + } +} +const app = await createServer(config, { readinessProbes: [customReadinessProbe]}); +``` + +In order for custom readiness probes to be implemented correctly, they need to do two things: +1. they must all be async or return a promise (the return value is not checked, it's expected to be void/undefined) +2. they must all throw an exception when you want a readiness probe to fail + +### Implementation of readiness probes in your own application + +Readiness probes are more difficult to implement than liveness probes. In liveness probes, a simple stateless REST endpoint handler was sufficient. Readiness probes, on the other hand, need to monitor the signals that the application process received. In addition to that, you can write your own readiness conditions, such as checking if an external service that your application depends on is online. + +If your application uses Node's `http` module to serve HTTP requests, you can use the [`@godaddy/terminus`](https://www.npmjs.com/package/@godaddy/terminus) NPM package to implement readiness checks. + diff --git a/docs/content/3.middleware/2.guides/9.readiness-probes.md b/docs/content/3.middleware/2.guides/9.readiness-probes.md deleted file mode 100644 index 3de19ecd26..0000000000 --- a/docs/content/3.middleware/2.guides/9.readiness-probes.md +++ /dev/null @@ -1,29 +0,0 @@ -## Readiness probes - -Users of Alokai Cloud have their middleware deployed in Kubernetes. Readiness probes allow an application to temporarily mark itself as unable to serve requests in a Kubernetes cluster. - -As of version 5.0.0 of the `@vue-storefront/middleware` package, you can launch the middleware and send a GET request to the `http://localhost:4000/readyz` endpoint. The response will contain either a success message or a list of errors describing why the readiness probe failed. - -The same endpoint is queried automatically by Alokai Cloud every few seconds to check whether requests should be routed to a particular middleware replica. One such case is if a middleware instance is being killed (if it receives a `SIGTERM` signal), it should stop accepting traffic, but wait a few seconds before shutting down to handle any requests that are still being handledby that instance. - -You can read more about Kubernetes readiness probes in the [official documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes). - - - -### Adding custom readiness probes - -To add custom readiness probes, pass them to the `readinessProbes` property when calling `createServer`. - -```ts -const customReadinessProbe = async () => { - const dependentServiceRunning = await axios.get('http://someservice:3000/healthz'); - if(dependentServiceRunning.status !== 200) { - throw new Error('Service that the middleware depends on is offline. The middleware is temporarily not ready to accept connections.') - } -} -const app = await createServer(config, { readinessProbes: [customReadinessProbe]}); -``` - -In order for custom readiness probes to be implemented correctly, they need to do two things: -1. they must all be async or return a promise (the return value is not checked, it's expected to be void/undefined) -2. they must all throw an exception when you want a readiness probe to fail