Update of the docs (#409)
* first iteration of the docs
* fix styling
* fix spelling
* add missing values from the common library
* Take suggestions from `nri-kube-events` PR
* address suggestions from the PR
* more nit picking
kang-makes authored May 4, 2022
1 parent 0799639 commit aa3e8c0
Showing 6 changed files with 310 additions and 62 deletions.
12 changes: 12 additions & 0 deletions README.md
@@ -42,6 +42,18 @@ For troubleshooting, see
[Not seeing data](https://docs.newrelic.com/docs/integrations/host-integrations/troubleshooting/kubernetes-integration-troubleshooting-not-seeing-data)
or [Error messages](https://docs.newrelic.com/docs/integrations/host-integrations/troubleshooting/kubernetes-integration-troubleshooting-error-messages).

## Helm chart

You can install this chart using [`nri-bundle`](https://github.com/newrelic/helm-charts/tree/master/charts/nri-bundle) located in the
[helm-charts repository](https://github.com/newrelic/helm-charts) or directly from this repository by adding this Helm repository:

```shell
helm repo add nri-kubernetes https://newrelic.github.io/nri-kubernetes
# "newrelic-infrastructure" is an example release name; pick your own.
helm upgrade --install newrelic-infrastructure nri-kubernetes/newrelic-infrastructure -f your-custom-values.yaml
```

For more information about configuring the chart, see the [chart's README](/charts/newrelic-infrastructure/README.md).

## Usage

Learn how to
10 changes: 7 additions & 3 deletions charts/newrelic-infrastructure/Chart.yaml
@@ -5,11 +5,11 @@ home: https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/get
icon: https://newrelic.com/themes/custom/curio/assets/mediakit/NR_logo_Horizontal.svg
sources:
- https://github.com/newrelic/nri-kubernetes/
- https://github.com/newrelic/helm-charts/tree/master/charts/newrelic-infrastructure
- https://github.com/newrelic/nri-kubernetes/tree/master/charts/newrelic-infrastructure
- https://github.com/newrelic/infrastructure-agent/

version: 3.3.1
version: 3.3.2
appVersion: 3.1.1
kubeVersion: ">=1.16.0-0"

dependencies:
- name: common-library
@@ -21,10 +21,14 @@ maintainers:
  url: https://github.com/alvarocabanas
- name: carlossscastro
  url: https://github.com/carlossscastro
- name: sigilioso
  url: https://github.com/sigilioso
- name: gsanchezgavier
  url: https://github.com/gsanchezgavier
- name: kang-makes
  url: https://github.com/kang-makes
- name: marcsanmi
  url: https://github.com/marcsanmi
- name: paologallinaharbur
  url: https://github.com/paologallinaharbur
- name: roobre
149 changes: 128 additions & 21 deletions charts/newrelic-infrastructure/README.md
@@ -1,29 +1,120 @@
[![New Relic Experimental header](https://github.com/newrelic/opensource-website/raw/master/src/images/categories/Experimental.png)](https://opensource.newrelic.com/oss-category/#new-relic-experimental)

# newrelic-infrastructure

![Version: 3.2.0](https://img.shields.io/badge/Version-3.2.0-informational?style=flat-square) ![AppVersion: 3.2.0](https://img.shields.io/badge/AppVersion-3.2.0-informational?style=flat-square)
![Version: 3.3.2](https://img.shields.io/badge/Version-3.3.2-informational?style=flat-square) ![AppVersion: 3.1.1](https://img.shields.io/badge/AppVersion-3.1.1-informational?style=flat-square)

A Helm chart to deploy the New Relic Kubernetes monitoring solution

**Homepage:** <https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/get-started/introduction-kubernetes-integration/>

# Helm installation

You can install this chart using [`nri-bundle`](https://github.com/newrelic/helm-charts/tree/master/charts/nri-bundle) located in the
[helm-charts repository](https://github.com/newrelic/helm-charts) or directly from this repository by adding this Helm repository:

```shell
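helm repo add nri-kubernetes https://newrelic.github.io/nri-kubernetes
# "newrelic-infrastructure" is an example release name; pick your own.
helm upgrade --install newrelic-infrastructure nri-kubernetes/newrelic-infrastructure -f your-custom-values.yaml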
```

## Source Code

* <https://github.com/newrelic/nri-kubernetes/>
* <https://github.com/newrelic/helm-charts/tree/master/charts/newrelic-infrastructure>
* <https://github.com/newrelic/nri-kubernetes/tree/master/charts/newrelic-infrastructure>
* <https://github.com/newrelic/infrastructure-agent/>

## Values managed globally

This chart implements [New Relic's common Helm library](https://github.com/newrelic/helm-charts/tree/master/library/common-library), which
means that it honors a wide range of defaults and globals common to most New Relic Helm charts.

Options that can be defined globally include `affinity`, `nodeSelector`, `tolerations`, `proxy`, and others. The full list can be found in the
[user's guide of the common library](https://github.com/newrelic/helm-charts/blob/master/library/common-library/README.md).

## Chart particularities

### Low data mode
There are two mechanisms to reduce the amount of data that this integration sends to New Relic. See this snippet from the `values.yaml` file:
```yaml
common:
  config:
    interval: 15s

lowDataMode: false
```
The `lowDataMode` toggle is the simplest way to reduce the data sent to New Relic. Setting it to `true` changes the scrape interval from 15 seconds
(the default) to 30 seconds.

If you need to fine-tune the interval, you can set `common.config.interval` directly. If you take a look at the `values.yaml`
file, the value there is `nil`. If any value is set there, the `lowDataMode` toggle is ignored because this value takes precedence.

Setting this interval above 40 seconds can cause issues with the Kubernetes Cluster Explorer, so this chart restricts the interval
to the range of 10 to 40 seconds.
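
As a minimal sketch of the two mechanisms, a custom values file could use either of the options below. The `25s` figure is an arbitrary illustration, not a recommended value; it was only chosen because it sits inside the 10 to 40 second range described above.

```yaml
# Option 1: leave `common.config.interval` unset and flip the toggle,
# which moves the scrape interval from 15s to 30s.
lowDataMode: true

# Option 2 (alternative to option 1): set the interval explicitly.
# Any value set here takes precedence, so `lowDataMode` is ignored.
# common:
#   config:
#     interval: 25s   # illustrative value
```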

### Affinities and tolerations

The New Relic common library allows you to set affinities, tolerations, and node selectors globally using e.g. `.global.affinity` to ease the configuration
when you install this chart through `nri-bundle`. This chart adds an extra level of granularity for the components that it deploys:
control plane, ksm, and kubelet.

Take this snippet as an example:
```yaml
global:
  affinity: {}
affinity: {}
kubelet:
  affinity: {}
ksm:
  affinity: {}
controlPlane:
  affinity: {}
```

The chart resolves an affinity in this order: it first looks at `kubelet.affinity`, `ksm.affinity`, or `controlPlane.affinity`. If these values are empty, the chart
falls back to `affinity` (at root level), and if that value is also empty, the chart falls back to `global.affinity`.

The same procedure applies to `nodeSelector` and `tolerations`.
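
The fallback order can be illustrated with `nodeSelector`, assuming that none of the components involved has a predefined value in `values.yaml`. The `node-pool: monitoring` label is hypothetical:

```yaml
global:
  nodeSelector:
    kubernetes.io/os: linux   # lowest precedence
nodeSelector: {}              # empty at root level, so the chart keeps falling back
ksm:
  nodeSelector:
    node-pool: monitoring     # hypothetical label; highest precedence for the KSM scraper
```

With these values, the KSM scraper would use `node-pool: monitoring`, while the kubelet and control plane components, having nothing set at component or root level, would fall back to `global.nodeSelector`.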

On the other hand, some components have predefined affinities and tolerations, e.g. to be able to run kubelet pods on nodes that are tainted as master
nodes or to schedule the KSM scraper on the same node as KSM to reduce inter-node traffic.

If you are having problems assigning pods to nodes, it may be because of this. Take a look at the [`values.yaml`](values.yaml) to see if the pod that is
not behaving as expected has any predefined value.

### `hostNetwork` toggle

In versions below v3, changing the `privileged` mode also affected `hostNetwork`. We changed this behavior, and now you can set pods to use `hostNetwork`
using the corresponding [flags from the common library](https://github.com/newrelic/helm-charts/blob/master/library/common-library/README.md)
(`.global.hostNetwork` and `.hostNetwork`). The component that scrapes data from the control plane, however, has `hostNetwork` enabled by default
(look in the [`values.yaml`](values.yaml) for `controlPlane.hostNetwork: true`).

This is because control plane components are most commonly configured to listen only on `localhost`.

If your cluster security policy does not allow `hostNetwork`, you can disable control plane monitoring by setting `controlPlane.enabled` to
`false`.
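
For such a cluster, a values file might contain the following minimal sketch. It assumes you accept losing control plane metrics:

```yaml
controlPlane:
  enabled: false   # do not deploy the control plane scraper
```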

### `privileged` toggle

The default value for `privileged` [from the common library](https://github.com/newrelic/helm-charts/blob/master/library/common-library/README.md) is
`false`, but in this particular chart it is set to `true` (look in the [`values.yaml`](values.yaml) for `privileged: true`).

## Requirements
This is because `kubelet` pods need to run in privileged mode to fetch cpu, memory, process, and network metrics of your nodes.

Kubernetes: `>=1.16.0-0`
If your cluster security policy does not allow `privileged` in your pod's security context, you can disable it by setting `privileged` to
`false`. Keep in mind that you will lose all the metrics from the host, as well as some host metadata that is added to the metrics of the
integrations that you have configured.
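
A minimal sketch of that override in a custom values file, with the trade-off described above noted as a comment:

```yaml
# Overrides this chart's default of `true`.
# Host metrics and some host metadata will no longer be reported.
privileged: false
```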

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| affinity | object | `{}` | Sets pod/node affinities almost globally. (See [Affinities and tolerations](README.md#affinities-and-tolerations)) |
| cluster | string | `""` | Name of the Kubernetes cluster monitored. Can be configured also with `global.cluster` |
| common | object | See `values.yaml` | Config that applies to all instances of the solution: kubelet, ksm, control plane and sidecars. |
| common.agentConfig | object | `{}` | Config for the Infrastructure agent. Will be used by the forwarder sidecars and the agent running integrations. See: https://docs.newrelic.com/docs/infrastructure/install-infrastructure-agent/configuration/infrastructure-agent-configuration-settings/ |
| common.config.interval | duration | `15s` if `lowDataMode == false`, `30s` otherwise. | Intervals larger than 40s are not supported and will cause the NR UI to not behave properly. Any non-nil value will override the `lowDataMode` default. |
| common.config.interval | duration | `15s` (See [Low data mode](README.md#low-data-mode)) | Intervals larger than 40s are not supported and will cause the NR UI to not behave properly. Any non-nil value will override the `lowDataMode` default. |
| containerSecurityContext | object | `{}` | Sets security context (at container level). Can be configured also with `global.containerSecurityContext` |
| controlPlane | object | See `values.yaml` | Configuration for the control plane scraper. |
| controlPlane.affinity | object | Deployed only in master nodes. | Affinity for the control plane DaemonSet. |
| controlPlane.config.apiServer | object | Common settings for most K8s distributions. | API Server monitoring configuration |
@@ -37,46 +128,62 @@ Kubernetes: `>=1.16.0-0`
| controlPlane.config.scheduler.enabled | bool | `true` | Enable scheduler monitoring. |
| controlPlane.config.timeout | string | `"10s"` | Timeout for the Kubernetes APIs contacted by the integration |
| controlPlane.enabled | bool | `true` | Deploy control plane monitoring component. |
| controlPlane.hostNetwork | bool | `true` | Run Control Plane scraper with `hostNetwork`. `hostNetwork` is required for most control plane configurations, as they only accept connections from localhost. |
| controlPlane.kind | string | `"DaemonSet"` | How to deploy the control plane scraper. If autodiscovery is in use, it should be `DaemonSet`. Advanced users using static endpoints set this to `Deployment` to avoid reporting metrics twice. |
| controlPlane.unprivilegedHostNetwork | bool | `false` | Run Control Plane scraper with `hostNetwork` even if `privileged` is set to false. `hostNetwork` is required for most control plane configurations, as they only accept connections from localhost. |
| customAttributes | object | `{}` | Custom attributes to be added to the data reported by all integrations reporting in the cluster. |
| customAttributes | object | `{}` | Adds extra attributes to the cluster and all the metrics emitted to the backend. Can be configured also with `global.customAttributes` |
| customSecretLicenseKey | string | `""` | In case you don't want to have the license key in your values, this lets you specify which key of the secret holds the license key. Can be configured also with `global.customSecretLicenseKey` |
| customSecretName | string | `""` | In case you don't want to have the license key in your values, this lets you point to a user-created secret to get the key from there. Can be configured also with `global.customSecretName` |
| dnsConfig | object | `{}` | Sets pod's dnsConfig. Can be configured also with `global.dnsConfig` |
| fedramp.enabled | bool | `false` | Enables FedRAMP. Can be configured also with `global.fedramp.enabled` |
| fullnameOverride | string | `""` | Override the full name of the release |
| hostNetwork | bool | `false` | Sets pod's hostNetwork. Can be configured also with `global.hostNetwork` |
| images | object | See `values.yaml` | Images used by the chart for the integration and agents. |
| images.agent.repository | string | `"newrelic/infrastructure-bundle"` | Image for the agent and integrations bundle. |
| images.agent.tag | string | `"2.8.8"` | Tag for the agent and integrations bundle. |
| images.forwarder.repository | string | `"newrelic/k8s-events-forwarder"` | Image for the agent sidecar. |
| images.forwarder.tag | string | `"1.24.1"` | Tag for the agent sidecar. |
| images.integration.repository | string | `"newrelic/nri-kubernetes"` | Image for the kubernetes integration. |
| images.integration.tag | string | `"3.1.0"` | Tag for the kubernetes integration. |
| images.agent | object | See `values.yaml` | Image for the New Relic Infrastructure Agent plus integrations. |
| images.forwarder | object | See `values.yaml` | Image for the New Relic Infrastructure Agent sidecar. |
| images.integration | object | See `values.yaml` | Image for the New Relic Kubernetes integration. |
| images.pullSecrets | list | `[]` | The secrets that are needed to pull images from a custom registry. |
| integrations | object | `{}` | Config files for other New Relic integrations that should run in this cluster. |
| ksm | object | See `values.yaml` | Configuration for the Deployment that collects state metrics from KSM (kube-state-metrics). |
| ksm.affinity | object | Deployed in the same node as KSM | Affinity for the KSM scraper Deployment. |
| ksm.config.retries | int | `3` | Number of retries after timeout expired |
| ksm.config.timeout | string | `"10s"` | Timeout for the ksm API contacted by the integration |
| ksm.enabled | bool | `true` | Enable cluster state monitoring. Advanced users only. Setting this to `false` is not supported and will break the New Relic experience. |
| ksm.resources | object | 100m/150M -/850M | Resources for the KSM scraper pod. Keep in mind that sharding is not supported at the moment, so memory usage for this component ramps up quickly on large clusters. |
| ksm.tolerations | list | Schedules in all tainted nodes | Tolerations for the KSM scraper Deployment. |
| kubelet | object | See `values.yaml` | Configuration for the DaemonSet that collects metrics from the Kubelet. |
| kubelet.config.retries | int | `3` | Number of retries after timeout expired |
| kubelet.config.timeout | string | `"10s"` | Timeout for the kubelet APIs contacted by the integration |
| kubelet.enabled | bool | `true` | Enable kubelet monitoring. Advanced users only. Setting this to `false` is not supported and will break the New Relic experience. |
| lowDataMode | bool | `false` | Send less data by incrementing the interval from `15s` (the default when `lowDataMode` is `false` or `nil`) to `30s`. Non-nil values of `common.config.interval` will override this value. |
| kubelet.tolerations | list | Schedules in all tainted nodes | Tolerations for the kubelet DaemonSet. |
| labels | object | `{}` | Additional labels for chart objects. Can be configured also with `global.labels` |
| licenseKey | string | `""` | Sets the license key to use. Can be configured also with `global.licenseKey` |
| lowDataMode | bool | `false` (See [Low data mode](README.md#low-data-mode)) | Send less data by increasing the interval from `15s` (the default when `lowDataMode` is `false` or `nil`) to `30s`. Non-nil values of `common.config.interval` will override this value. |
| nameOverride | string | `""` | Override the name of the chart |
| nodeSelector | object | `{}` | Sets pod's node selector almost globally. (See [Affinities and tolerations](README.md#affinities-and-tolerations)) |
| nrStaging | bool | `false` | Send the metrics to the staging backend. Requires a valid staging license key. Can be configured also with `global.nrStaging` |
| podAnnotations | object | `{}` | Annotations to be added to all pods created by the integration. |
| podLabels | object | `{}` | Labels to be added to all pods created by the integration. |
| priorityClassName | string | `""` | Pod scheduling priority Ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/ |
| privileged | bool | `true` | Run the integration with full access to the host filesystem and network. Running in this mode allows reporting fine-grained cpu, memory, process and network metrics for your nodes. Additionally, it allows control plane monitoring, which requires hostNetwork to work. |
| podLabels | object | `{}` | Additional labels for chart pods. Can be configured also with `global.podLabels` |
| podSecurityContext | object | `{}` | Sets security context (at pod level). Can be configured also with `global.podSecurityContext` |
| priorityClassName | string | `""` | Sets pod's priorityClassName. Can be configured also with `global.priorityClassName` |
| privileged | bool | `true` | Run the integration with full access to the host filesystem and network. Running in this mode allows reporting fine-grained cpu, memory, process and network metrics for your nodes. |
| proxy | string | `""` | Configures the integration to send all HTTP/HTTPS requests through the proxy at that URL. The URL should have a standard format like `https://user:password@hostname:port`. Can be configured also with `global.proxy` |
| rbac | object | `{"create":true,"pspEnabled":false}` | Settings controlling RBAC objects creation. |
| rbac.create | bool | `true` | Whether the chart should automatically create the RBAC objects required to run. |
| rbac.pspEnabled | bool | `false` | Whether the chart should create Pod Security Policy objects. |
| securityContext | object | See `values.yaml` | Security context used in all the containers of the pods When `privileged == true`, the Kubelet scraper will run as root and ignore these settings. |
| serviceAccount | object | See `values.yaml` | Settings controlling ServiceAccount creation. |
| serviceAccount.create | bool | `true` | Whether the chart should automatically create the ServiceAccount objects required to run. |
| tolerations | list | `[]` | Sets pod's tolerations to node taints almost globally. (See [Affinities and tolerations](README.md#affinities-and-tolerations)) |
| updateStrategy | object | See `values.yaml` | Update strategy for the DaemonSets deployed. |
| verboseLog | bool | `false` | Enable verbose logging for all components. |
| verboseLog | bool | `false` | Enables debug logs for this integration, or for all integrations if it is set globally. Can be configured also with `global.verboseLog` |

## Maintainers

* [alvarocabanas](https://github.com/alvarocabanas)
* [carlossscastro](https://github.com/carlossscastro)
* [sigilioso](https://github.com/sigilioso)
* [gsanchezgavier](https://github.com/gsanchezgavier)
* [kang-makes](https://github.com/kang-makes)
* [marcsanmi](https://github.com/marcsanmi)
* [paologallinaharbur](https://github.com/paologallinaharbur)
* [roobre](https://github.com/roobre)
