Merge pull request gruntwork-io#12 from gruntwork-io/add_script_check
Add script checks
autero1 authored Jan 11, 2019
2 parents 7d3c2e1 + 02fe94c commit 780e701
Showing 11 changed files with 556 additions and 49 deletions.
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -30,7 +30,7 @@ jobs:
- checkout
- attach_workspace:
at: /go/src/github.com/gruntwork-io/health-checker
- run: run-go-tests --circle-ci-2 --path test
- run: run-go-tests --circle-ci-2

build:
<<: *defaults
56 changes: 53 additions & 3 deletions Gopkg.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

43 changes: 38 additions & 5 deletions README.md
@@ -1,6 +1,7 @@
# health-checker

A simple HTTP server that will return `200 OK` if the given TCP ports are all successfully accepting connections.
A simple HTTP server that will return `200 OK` if the configured checks are all successful. If any of the checks fail,
it will return `HTTP 504 Gateway Timeout`.

## Motivation

@@ -14,15 +15,23 @@ a single TCP port, or an HTTP(S) endpoint. As a result, our use case just isn't
We wrote health-checker so that we could run a daemon on the server that reports the true health of the server by
attempting to open a TCP connection to more than one port when it receives an inbound HTTP request on the given listener.

Using the `--script` option, `health-checker` can be extended to check many other targets. One concrete example is monitoring
`ZooKeeper` node status during a rolling deployment. Polling the `ZooKeeper` TCP client port alone doesn't guarantee
that the node has (re-)joined the cluster. Using `health-checker` with a custom script target, we can
[monitor ZooKeeper](https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_monitoring) using the
[four-letter words](https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands), ensuring we report health back to the
[Load Balancer](https://aws.amazon.com/documentation/elastic-load-balancing/) correctly.

## How It Works

When health-checker is started, it will listen for inbound HTTP requests for any URL on the IP address and port specified
by `--listener`. When it receives a request, it will attempt to open TCP connections to each of the ports specified by
an instance of `--port`. If all TCP connections succeed, it will return `HTTP 200 OK`. If any TCP connection fails, it
will return `HTTP 504 Gateway Timeout`.
an instance of `--port` and/or execute each script target specified by `--script`. If all configured checks succeed
(every TCP connection opens and every script exits with status zero), it will return `HTTP 200 OK`. If any of the checks fail,
it will return `HTTP 504 Gateway Timeout`.

Configure your AWS Health Check to only pass the Health Check on `HTTP 200 OK`. Now when an HTTP Health Check request
comes in, all desired TCP ports will be checked.
comes in, all desired TCP ports will be checked and the script target executed.

For stability, we recommend running health-checker under a process supervisor such as [supervisord](http://supervisord.org/)
or [systemd](https://www.freedesktop.org/wiki/Software/systemd/) to automatically restart health-checker in the unlikely
@@ -46,9 +55,13 @@ health-checker [options]
| `--listener` | The IP address and port on which inbound HTTP connections will be accepted. | `0.0.0.0:5000` |
| `--log-level` | Set the log level to LEVEL. Must be one of: `panic`, `fatal`, `error`, `warning`, `info`, or `debug` | `info` |
| `--help` | Show the help screen | |
| `--script` | Path to a script to run. The check passes if the script completes within the configured timeout with a zero exit status. Specify one or more times. | |
| `--script-timeout` | Timeout, in seconds, to wait for the scripts to exit. Applies to all configured script targets. | `5` |
| `--version` | Show the program's version | |

#### Example
If you execute a shell script, ensure it starts with a `shebang` line. Otherwise the script will fail with an `exec format error`.

#### Example 1

Run a listener on port 6000 that accepts all inbound HTTP connections for any URL. When the request is received,
attempt to open TCP connections to port 5432 and 3306. If both succeed, return `HTTP 200 OK`. If any fails, return `HTTP
@@ -58,3 +71,23 @@ attempt to open TCP connections to port 5432 and 3306. If both succeed, return `
health-checker --listener "0.0.0.0:6000" --port 5432 --port 3306
```

#### Example 2

Run a listener on port 6000 that accepts all inbound HTTP connections for any URL. When the request is received,
attempt to open a TCP connection to port 5432 and run the script with a 10-second timeout. If the TCP connection succeeds and the script exits with code zero, return `HTTP 200 OK`. If the TCP connection fails or the script exits with a non-zero code, return `HTTP
504 Gateway Timeout`.

```
health-checker --listener "0.0.0.0:6000" --port 5432 --script /path/to/script.sh --script-timeout 10
```

#### Example 3

Run a listener on port 6000 that accepts all inbound HTTP connections for any URL. When the request is received,
attempt to run the configured scripts. If both exit with code zero, return `HTTP 200 OK`. If either exits with a non-zero code, return `HTTP
504 Gateway Timeout`.

```
health-checker --listener "0.0.0.0:6000" --script "/usr/local/bin/exhibitor-health-check.sh --exhibitor-port 8080" --script "/usr/local/bin/zk-health-check.sh --zk-port 2191"
```

12 changes: 8 additions & 4 deletions commands/cli.go
@@ -34,7 +34,7 @@ func CreateCli(version string) *cli.App {
app.HelpName = app.Name
app.Author = "Gruntwork, Inc. <www.gruntwork.io> | https://github.com/gruntwork-io/health-checker"
app.Version = version
app.Usage = "A simple HTTP server that returns a 200 OK when all given TCP ports accept inbound connections."
app.Usage = "A simple HTTP server that will return 200 OK if the configured checks are all successful."
app.Commands = nil
app.Flags = defaultFlags
app.Action = runHealthChecker
@@ -52,11 +52,15 @@ func runHealthChecker(cliContext *cli.Context) error {
opts.Logger.Infof("Note: To enable debug mode, set %s to \"true\"", ENV_VAR_NAME_DEBUG_MODE)
return err
}
if err != nil {
if err != nil {
return errors.WithStackTrace(err)
}

opts.Logger.Infof("The Health Check will attempt to connect to the following ports via TCP: %v", opts.Ports)
if len(opts.Ports) > 0 {
opts.Logger.Infof("The Health Check will attempt to connect to the following ports via TCP: %v", opts.Ports)
}
if len(opts.Scripts) > 0 {
opts.Logger.Infof("The Health Check will attempt to run the following scripts: %v", opts.Scripts)
}
opts.Logger.Infof("Listening on Port %s...", opts.Listener)
err = server.StartHttpServer(opts)
if err != nil {
51 changes: 41 additions & 10 deletions commands/flags.go
@@ -2,21 +2,33 @@ package commands

import (
"fmt"
"github.com/gruntwork-io/health-checker/options"
"github.com/gruntwork-io/gruntwork-cli/logging"
"github.com/urfave/cli"
"github.com/gruntwork-io/health-checker/options"
"github.com/sirupsen/logrus"
"github.com/urfave/cli"
"os"
"strings"
)

const DEFAULT_LISTENER_IP_ADDRESS = "0.0.0.0"
const DEFAULT_LISTENER_PORT = 5500
const DEFAULT_SCRIPT_TIMEOUT_SEC = 5
const ENV_VAR_NAME_DEBUG_MODE = "HEALTH_CHECKER_DEBUG"

var portFlag = cli.IntSliceFlag{
Name: "port",
Usage: fmt.Sprintf("[Required] The port number on which a TCP connection will be attempted. Specify one or more times. Example: 8000"),
Name: "port",
Usage: fmt.Sprintf("[One of port/script Required] The port number on which a TCP connection will be attempted. Specify one or more times. Example: 8000"),
}

var scriptFlag = cli.StringSliceFlag{
Name: "script",
Usage: fmt.Sprintf("[One of port/script Required] The path to script that will be run. Specify one or more times. Example: \"/usr/local/bin/health-check.sh --http-port 8000\""),
}

var scriptTimeoutFlag = cli.IntFlag{
Name: "script-timeout",
Usage: fmt.Sprintf("[Optional] Timeout, in seconds, to wait for the scripts to complete. Example: 10"),
Value: DEFAULT_SCRIPT_TIMEOUT_SEC,
}

var listenerFlag = cli.StringFlag{
@@ -33,6 +45,8 @@ var logLevelFlag = cli.StringFlag{

var defaultFlags = []cli.Flag{
portFlag,
scriptFlag,
scriptTimeoutFlag,
listenerFlag,
logLevelFlag,
}
@@ -58,19 +72,27 @@ func parseOptions(cliContext *cli.Context) (*options.Options, error) {
logger.SetLevel(level)

ports := cliContext.IntSlice("port")
if len(ports) == 0 {
return nil, MissingParam(portFlag.Name)

scriptArr := cliContext.StringSlice("script")
scripts := options.ParseScripts(scriptArr)

if len(ports) == 0 && len(scripts) == 0 {
return nil, OneOfParamsRequired{portFlag.Name, scriptFlag.Name}
}

scriptTimeout := cliContext.Int("script-timeout")

listener := cliContext.String("listener")
if listener == "" {
return nil, MissingParam(listenerFlag.Name)
}

return &options.Options{
Ports: ports,
Listener: listener,
Logger: logger,
Ports: ports,
Scripts: scripts,
ScriptTimeout: scriptTimeout,
Listener: listener,
Logger: logger,
}, nil
}
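
The body of `options.ParseScripts` is not shown in this diff. Since a `--script` value such as `"/usr/local/bin/health-check.sh --http-port 8000"` carries both the executable and its arguments, one plausible approach is to split each value on whitespace. The sketch below is an assumption for illustration only: the `Script` type, its fields, and the lowercase `parseScripts` helper are hypothetical and may differ from the real implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Script is a hypothetical parsed form of one --script value.
type Script struct {
	Name string   // path of the executable
	Args []string // remaining whitespace-separated arguments
}

// String renders the script back into a single command line.
func (s Script) String() string {
	return strings.TrimSpace(fmt.Sprintf("%s %s", s.Name, strings.Join(s.Args, " ")))
}

// parseScripts splits each raw flag value on whitespace and skips blank values.
// Quoted arguments that contain spaces are not handled in this sketch.
func parseScripts(raw []string) []Script {
	scripts := make([]Script, 0, len(raw))
	for _, r := range raw {
		fields := strings.Fields(r)
		if len(fields) == 0 {
			continue
		}
		scripts = append(scripts, Script{Name: fields[0], Args: fields[1:]})
	}
	return scripts
}

func main() {
	raw := []string{"/usr/local/bin/health-check.sh --http-port 8000", "  "}
	for _, s := range parseScripts(raw) {
		fmt.Printf("name=%q args=%q\n", s.Name, s.Args)
	}
}
```

Skipping blank values means an empty `--script ""` counts as no script at all, so the "one of port/script" validation would still reject a configuration with nothing to check.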

@@ -95,4 +117,13 @@ type MissingParam string

func (paramName MissingParam) Error() string {
return fmt.Sprintf("Missing required parameter --%s", string(paramName))
}
}

type OneOfParamsRequired struct {
param1 string
param2 string
}

func (paramNames OneOfParamsRequired) Error() string {
return fmt.Sprintf("Missing required parameter, one of --%s / --%s required", string(paramNames.param1), string(paramNames.param2))
}