First draft of helm chart with prometheus and alertmanager as optionals #79

Open
wants to merge 1 commit into
base: k8s-influxdb
8 changes: 8 additions & 0 deletions charts/Chart.yaml
@@ -0,0 +1,8 @@
apiVersion: v2
name: indy-monitoring-stack
description: A Helm chart for the indy-node-monitoring stack
type: application
version: 0.0.2

# This is the version number of the indy-node-monitor release version (https://github.com/hyperledger/indy-node-monitor/releases/tag/v0.4.0).
appVersion: "v0.4.0"
89 changes: 89 additions & 0 deletions charts/README.md
@@ -0,0 +1,89 @@
# Helm chart for the Indy-Monitoring-Stack

_**Work in progress, for development use only.**_

## Pre-requisites

* K8s or minikube cluster
* Helm v3+ binaries
* Registered Indy network monitor seed

## Components

Prometheus and Alertmanager are now optional and disabled by default. This simplifies the solution and avoids storing the same data in both InfluxDB and Prometheus. Both are toggled in the values.yaml file.
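
As a rough sketch, the optional components could also be switched on at install time with Helm's `--set` flag instead of editing values.yaml. `alertmanager.enable` is the key the Alertmanager templates in this chart test; `prometheus.enable` is assumed to follow the same naming, and the wrapper function name is hypothetical.

```shell
# Sketch: opt in to the optional components when deploying.
# `alertmanager.enable` is checked by the chart's Alertmanager templates;
# `prometheus.enable` is an ASSUMED key that mirrors it.
enable_optional_components() {
  helm upgrade indy-monitoring-stack . \
    --namespace inc-indy-node-monitor \
    --values ./extra_vars.yaml \
    --set prometheus.enable=true \
    --set alertmanager.enable=true \
    --install
}
```

Call `enable_optional_components` from the chart directory once **extra_vars.yaml** exists. The Alertmanager templates also require `alertmanager.port` to be set, so keep the port defined in values.yaml.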

## Quickstart

### Configuring the deployment

1. Copy the [**extra_vars.template**](./extra_vars.template) to a file called **extra_vars.yaml**.

cp extra_vars.template extra_vars.yaml

2. Edit the inputs. For reference, see the [**config/indy_node_monitor/networks.json**](./config/indy_node_monitor/networks.json) file. You can add as many networks as you want, but each must include a registered network monitor seed.

Here is an example for the Sovrin networks:
```plaintext
inputs:
  - name: Sovrin Builder Net
    short_name: sbn
    genesis_url: https://raw.githubusercontent.com/sovrin-foundation/sovrin/stable/sovrin/pool_transactions_builder_genesis
    network_monitor_seed: INSERT_REGISTERED_NETWORK_MONITOR_SEED_HERE
  - name: Sovrin Staging Net
    short_name: ssn
    genesis_url: https://raw.githubusercontent.com/sovrin-foundation/sovrin/stable/sovrin/pool_transactions_sandbox_genesis
    network_monitor_seed: INSERT_REGISTERED_NETWORK_MONITOR_SEED_HERE
  - name: Sovrin Main Net
    short_name: smn
    genesis_url: https://raw.githubusercontent.com/sovrin-foundation/sovrin/stable/sovrin/pool_transactions_live_genesis
    network_monitor_seed: INSERT_REGISTERED_NETWORK_MONITOR_SEED_HERE
```
3. The secrets are generated by Helm and stored, base64-encoded, in a Kubernetes Secret.

4. (optional) If you want to expose services, set **ingress.active** or **route.active** to `true`, then enter your **domain** and **endpoints** (use whichever of ingress or route is more convenient).

5. To get alerts for a specific network, edit the provided alert file **grafana/provisioning/alerting/indy-node-monitor.yaml** and replace "Sovrin Builder Net" with your own network name.
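
The replacement in step 5 can be scripted with `sed`. The sketch below runs against a scratch file containing a made-up line, so it is safe to try anywhere; `My Indy Network` is a placeholder for the `name` you set under `inputs`, and for a real run `alert_file` would point at the alert file named in step 5.

```shell
# Placeholder network name; use the `name` value from your extra_vars.yaml.
network_name="My Indy Network"

# Work on a scratch file with a made-up line so nothing real is modified.
alert_file=$(mktemp)
printf '%s\n' 'title: Sovrin Builder Net node down' > "$alert_file"

# Replace every occurrence of the default network name.
# Writing to a second file avoids the non-portable `sed -i`.
sed "s/Sovrin Builder Net/${network_name}/g" "$alert_file" > "${alert_file}.new"
mv "${alert_file}.new" "$alert_file"

cat "$alert_file"
```

If the network name contains a `/`, pick a different `sed` delimiter, for example `s|old|new|g`.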

### Deployment

Once you're happy with the configuration, create the project (for OpenShift) or namespace (for k8s) and deploy the stack. Here's an OpenShift one-liner that creates the project. Make sure that the namespace name matches the one in the **extra_vars.yaml** file.

```plaintext
oc new-project inc-indy-node-monitor
```

Here is the Helm one-liner to deploy the stack:

```plaintext
helm upgrade indy-monitoring-stack . \
--namespace inc-indy-node-monitor \
--values ./extra_vars.yaml \
--create-namespace --install
```

By default, this chart deploys four components: **indy-node-monitor**, **telegraf**, **influxdb**, and **grafana**.

After the deployment, you can get the Grafana admin password from this secret:

```plaintext
kubectl get secret credentials -n inc-indy-node-monitor -o jsonpath="{.data.GF_SECURITY_ADMIN_PASSWORD}" | base64 --decode; echo
```

Note: In some Linux environments, a trailing **%** may appear after the password; it is not part of the password.
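
The stray `%` most likely comes from the shell rather than from the secret: zsh, for example, marks output that does not end in a newline with a `%`. A minimal sketch, using a hypothetical password instead of a real secret, shows that the base64 round trip returns exactly the original value and why the command ends with `; echo`:

```shell
# Hypothetical stand-in for the base64 value stored in the secret.
encoded=$(printf '%s' 'example-password' | base64)

# Decode it the same way the kubectl pipeline does.
decoded=$(printf '%s' "$encoded" | base64 --decode)

# Print with an explicit newline so the prompt starts on its own line.
printf '%s\n' "$decoded"
```

Because base64 is an encoding and not encryption, anyone who can read the secret can recover the password this way.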

## Advanced configuration

You can edit the ports for the applications, but this is not recommended. Some ports are statically set in the configuration files, and changing them may break the deployment. It is recommended to keep the ports as they are defined in the [**values.yaml**](./values.yaml) file.

You can enable CloudWatch metrics, which configures Telegraf to send all metrics to both the Indy time-series backend (InfluxDB or Prometheus) and AWS CloudWatch. For this to work, all containers must be set to UTC. Otherwise, if your timezone is < GMT-2, AWS rejects the metrics with an error because their timestamps appear to be in the future.

### Service configuration

All service configurations are located in the [**config/**](./config/) folder under their respective application directory. These configurations are loaded as configmaps during deployment and injected into the pods.

You can apply a new configuration by editing these files and redeploying the stack.

### Dashboard development

After customizing a dashboard, you can export it from Grafana and add the generated ***.json** file under [**config/grafana/dashboards/**](./config/grafana/dashboards/).

All dashboards from that directory are automatically loaded when redeploying the stack.
1 change: 1 addition & 0 deletions charts/config/alertmanager/config.yml
1 change: 1 addition & 0 deletions charts/config/grafana/alerting/indy-monitor.yaml
1 change: 1 addition & 0 deletions charts/config/grafana/dashboard.yml
1 change: 1 addition & 0 deletions charts/config/grafana/dashboards/AlertsDashboard.json
1 change: 1 addition & 0 deletions charts/config/grafana/dashboards/NodeDetailDashboard.json
1 change: 1 addition & 0 deletions charts/config/grafana/dashboards/PublicDashboard.json
1 change: 1 addition & 0 deletions charts/config/grafana/dashboards/TestDashboard.json
1 change: 1 addition & 0 deletions charts/config/grafana/datasource.yml
1 change: 1 addition & 0 deletions charts/config/influxdb/influxdb.conf
1 change: 1 addition & 0 deletions charts/config/prometheus/alert.rules
1 change: 1 addition & 0 deletions charts/config/prometheus/prometheus.yml
30 changes: 30 additions & 0 deletions charts/extra_vars.template
@@ -0,0 +1,30 @@
---
namespace: "indy-monitoring-stack"
ingress:
  active: false
  domain: ""
  cert_manager:
    active: false
    issuer: letsencrypt-staging
  endpoints:
    grafana: ""
    influxdb: ""
    prometheus: ""
    alertmanager: ""
    indy_node_monitor: ""

# Here's an example for inputs, replace with the networks you want to monitor.
inputs:
  # - name: Sovrin Builder Net
  #   short_name: sbn
  #   genesis_url: https://raw.githubusercontent.com/sovrin-foundation/sovrin/stable/sovrin/pool_transactions_builder_genesis
  #   network_monitor_seed: INSERT_REGISTERED_NETWORK_MONITOR_SEED_HERE
  # - name: Sovrin Staging Net
  #   short_name: ssn
  #   genesis_url: https://raw.githubusercontent.com/sovrin-foundation/sovrin/stable/sovrin/pool_transactions_sandbox_genesis
  #   network_monitor_seed: INSERT_REGISTERED_NETWORK_MONITOR_SEED_HERE
  # - name: Sovrin Main Net
  #   short_name: smn
  #   genesis_url: https://raw.githubusercontent.com/sovrin-foundation/sovrin/stable/sovrin/pool_transactions_live_genesis
  #   network_monitor_seed: INSERT_REGISTERED_NETWORK_MONITOR_SEED_HERE

43 changes: 43 additions & 0 deletions charts/extra_vars.yaml
@@ -0,0 +1,43 @@
---
namespace: "inc-indy-node-monitor"
ingress:
  active: false
  domain: "dev.identite.pes.qc"
  cert_manager:
    active: false
    issuer: letsencrypt-staging
  endpoints:
    grafana: "grafana"
    influxdb: ""
    prometheus: ""
    alertmanager: ""
    indy_node_monitor: ""

route:
  active: true
  domain: "dev.identite.pes.qc"
  endpoints:
    grafana: "grafana"
    influxdb: ""
    prometheus: ""
    alertmanager: ""
    indy_node_monitor: ""

# Here's an example for inputs, replace with the networks you want to monitor.
inputs:
  - name: QC-Net-beta
    short_name: qcb
    genesisUrl: https://raw.githubusercontent.com/MCN-ING/candy-registre/main/env/beta/pool_transactions_genesis
    network_monitor_seed: MBkoqteRjceo4rxaBkqCKoc3MGpERKZs
  - name: CANdy Dev Network (CANdy-dev)
    short_name: cdn
    genesisUrl: https://raw.githubusercontent.com/ICCS-ISAC/dtrust-reconu/main/CANdy/dev/pool_transactions_genesis
    network_monitor_seed: 01234567890123456789012345678901
  - name: CANdy Test Network (CANdy-test)
    short_name: ctn
    genesisUrl: https://raw.githubusercontent.com/ICCS-ISAC/dtrust-reconu/main/CANdy/test/pool_transactions_genesis
    network_monitor_seed: aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpP
  - name: CANdy Production Network (CANdy-prod)
    short_name: cpn
    genesisUrl: https://raw.githubusercontent.com/ICCS-ISAC/dtrust-reconu/main/CANdy/prod/pool_transactions_genesis
    network_monitor_seed:
8 changes: 8 additions & 0 deletions charts/templates/NOTES.txt
@@ -0,0 +1,8 @@
*********************************************************
The deployment has completed successfully!

To retrieve the Grafana generated password, run the following command:

kubectl get secret credentials -n inc-indy-node-monitor -o jsonpath="{.data.GF_SECURITY_ADMIN_PASSWORD}" | base64 --decode; echo

*********************************************************
13 changes: 13 additions & 0 deletions charts/templates/alertmanager/configmap.yaml
@@ -0,0 +1,13 @@
{{- if and (eq .Values.alertmanager.enable true) (.Values.alertmanager.port) }}
---
{{- with .Values.alertmanager }}
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: {{ $.Values.namespace }}
  name: "{{ .name }}-configmap"
data:
{{ ($.Files.Glob "config/alertmanager/config.yml").AsConfig | indent 2 }}

{{- end }}
{{- end }}
51 changes: 51 additions & 0 deletions charts/templates/alertmanager/deployment.yaml
@@ -0,0 +1,51 @@
{{- if and (eq .Values.alertmanager.enable true) (.Values.alertmanager.port) }}
---
{{- with .Values.alertmanager }}
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: {{ $.Values.namespace }}
  name: {{ .name }}
  labels:
    app: {{ .name }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: {{ .name }}
  template:
    metadata:
      labels:
        app: {{ .name }}
    spec:
      containers:
        - name: {{ .name }}
          image: "{{ $.Values.ecr.registry }}/{{ .repository }}:{{ .tag }}"
          resources:
            requests:
              memory: {{ .requests_mem }}
              cpu: {{ .requests_cpu }}
            limits:
              memory: {{ .limits_mem }}
              cpu: {{ .limits_cpu }}
          ports:
            - containerPort: {{ .port }}
          command:
            - "alertmanager"
            - "--config.file=/etc/alertmanager/config.yml"
            - "--storage.path=/alertmanager"
          volumeMounts:
            - mountPath: /etc/alertmanager/config.yml
              name: "{{ .name }}-config"
              subPath: config.yml
      volumes:
        - name: "{{ .name }}-config"
          configMap:
            name: "{{ .name }}-configmap"
            items:
              - key: config.yml
                path: config.yml
      imagePullSecrets:
        - name: ecr-docker-secret
{{- end }}
{{- end }}
34 changes: 34 additions & 0 deletions charts/templates/alertmanager/ingress.yml
@@ -0,0 +1,34 @@
{{- if and (eq .Values.alertmanager.enable true) (eq .Values.ingress.active true) (.Values.ingress.endpoints.alertmanager) }}
---
{{- with .Values.alertmanager }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: {{ $.Values.namespace }}
  name: {{ .name }}-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    {{- if and (eq $.Values.ingress.cert_manager.active true) ($.Values.ingress.endpoints.alertmanager) }}
    cert-manager.io/cluster-issuer: "{{ $.Values.ingress.cert_manager.issuer }}"
    {{- end }}
spec:
  rules:
    - host: "{{ $.Values.ingress.endpoints.alertmanager }}.{{ $.Values.ingress.domain }}"
      http:
        paths:
          - backend:
              service:
                name: "{{ .name }}"
                port:
                  number: {{ .port }}
            path: /
            pathType: ImplementationSpecific
  {{- if and (eq $.Values.ingress.cert_manager.active true) ($.Values.ingress.endpoints.alertmanager) }}
  tls:
    - hosts:
        - {{ $.Values.ingress.endpoints.alertmanager }}.{{ $.Values.ingress.domain }}
      secretName: {{ $.Values.namespace }}-tls
  {{- end }}
{{- end }}
{{- end }}
17 changes: 17 additions & 0 deletions charts/templates/alertmanager/service.yaml
@@ -0,0 +1,17 @@
{{- if and (eq .Values.alertmanager.enable true) (.Values.alertmanager.port) }}
---
{{- with .Values.alertmanager }}
apiVersion: v1
kind: Service
metadata:
  namespace: {{ $.Values.namespace }}
  name: "{{ .name }}"
spec:
  selector:
    app: {{ .name }}
  ports:
    - protocol: TCP
      port: {{ .port }}
      targetPort: {{ .port }}
{{- end }}
{{- end }}
12 changes: 12 additions & 0 deletions charts/templates/grafana/configmap.yaml
@@ -0,0 +1,12 @@
---
{{- with .Values.grafana }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: "{{ .name }}-configmap"
data:
{{ ($.Files.Glob "config/grafana/datasource.yml").AsConfig | indent 2 }}
{{ ($.Files.Glob "config/grafana/dashboard.yml").AsConfig | indent 2 }}
{{ ($.Files.Glob "config/grafana/dashboards/*.json").AsConfig | indent 2 }}
{{ ($.Files.Glob "config/grafana/alerting/*.yaml").AsConfig | indent 2 }}
{{- end }}