Fix minor issues, apply autoformat
Signed-off-by: MOZGIII <[email protected]>
MOZGIII committed Apr 6, 2020
1 parent 14083b2 commit 90ba69f
Showing 1 changed file with 36 additions and 35 deletions.
71 changes: 36 additions & 35 deletions rfcs/2020-04-04-2221-kubernetes-integration.md
@@ -2,10 +2,10 @@

This RFC outlines how Vector will integrate with Kubernetes (k8s).

**Note: This RFC is retroactive and meant to seve as an audit to complete our
**Note: This RFC is retroactive and meant to serve as an audit to complete our
Kubernetes integration. At the time of writing this RFC, Vector has already made
considerable progress on its Kubernetes integration. It has a `kubernetes`
source, `kubernetes_pod_metadata` transform, an example daemonset file, and the
source, `kubernetes_pod_metadata` transform, an example `DaemonSet` file, and the
ability to automatically reload configuration when it changes. The fundamental
pieces are mostly in place to complete this integration, but as we approach
the finish line we're being faced with deeper questions that heavily affect the
@@ -20,7 +20,7 @@ the time of writing this RFC; many large companies, with large production
deployments, depend heavily on Kubernetes. Kubernetes handles log collection
but does not facilitate shipping. Shipping is meant to be delegated to tools
like Vector. This is precisely the use case that Vector was built for. So,
the motivation is three-fold:
motivation is three-fold:

1. A Kubernetes integration is essential to achieving Vector's vision of being
the dominant, single collector for observability data.
@@ -32,7 +32,7 @@ the motivation is three-fold:

**Note: This guide largely follows the format of our existing guides
([example][guide_example]). There are two perspectives to our guides: 1) A new
user coming from Google 2) A user that is familar with Vector. This guide is
user coming from Google 2) A user that is familiar with Vector. This guide is
from perspective 2.**

This guide covers integrating Vector with Kubernetes. We'll touch on the basic
@@ -47,35 +47,35 @@ Kubernetes logs and metrics to any destination you please.

Our recommended strategy deploys Vector as a Kubernetes [daemonset]. This is
the most efficient means of collecting Kubernetes observability data since
Vector is guaranteed to deploy _once_ on each of your Pods. In addition,
Vector is guaranteed to deploy _once_ on each of your Nodes. In addition,
we'll use the [`kubernetes_pod_metadata` transform][kubernetes_pod_metadata_transform]
to enrich your logs with Kubernetes context. This transform interacts with
to enrich your logs with the Kubernetes context. This transform interacts with
the Kubernetes watch API to collect cluster metadata and update in real-time
when things change. The following diagram demonstrates how this works:

TODO: insert diagram
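As a minimal sketch of the source-plus-transform pairing described above, written in the heredoc style used later in this guide (the option names are illustrative assumptions; consult the component docs for the real ones):

```shell
# Hypothetical minimal Vector config pairing the `kubernetes` source with
# the `kubernetes_pod_metadata` transform; names here are illustrative only.
cat <<-VECTORCFG > vector-sketch.toml
[sources.k8s]
  type = "kubernetes"

[transforms.pod_metadata]
  type = "kubernetes_pod_metadata"
  inputs = ["k8s"]
VECTORCFG
```

A sink (such as the `aws_s3` sink shown in the tutorial below) would then take `pod_metadata` as its input.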

### What We'll Accomplish

* Collect data from each of your Kubernetes Pods
* Ability to filter by container name, Pod IDs, and namespaces.
* Automatically merge logs that Kubernetes splits.
* Enrich your logs with useful Kubernetes context.
* Send your logs to one or more destinations.
- Collect data from each of your Kubernetes Pods
- Ability to filter by container names, Pod IDs, and namespaces.
- Automatically merge logs that Kubernetes splits.
- Enrich your logs with useful Kubernetes context.
- Send your logs to one or more destinations.
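To illustrate what "merge logs that Kubernetes splits" means: container runtimes split long log lines across multiple file entries. A sketch using the CRI log format (`timestamp stream P|F message`, where `P` marks a partial line and `F` its final piece); the log contents below are invented for the example:

```shell
# Write two CRI-format entries that together form one logical log line.
printf '%s\n' \
  '2020-04-06T00:00:00.000000000Z stdout P first half of a long line...' \
  '2020-04-06T00:00:01.000000000Z stdout F ...second half' > split.log

# A merger buffers P (partial) entries and emits on the next F (final) entry.
awk '{tag=$3; $1=$2=$3=""; sub(/^ +/,""); buf = buf $0; if (tag=="F") {print buf; buf=""}}' split.log
```

This is what the integration should do transparently, so downstream sinks see one event per logical line.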

### Tutorial

#### Kubectl Interface

1. Configure Vector:

Before we can deplo Vector we must configure. This is done by creating
Before we can deploy Vector we must configure it. This is done by creating
a Kubernetes `ConfigMap`:

...insert selector to select any of Vector's sinks...

```bash
echo '
cat <<-CONFIG > vector-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
@@ -107,17 +107,17 @@ TODO: insert diagram
compression = "gzip"
region = "us-east-1"
key_prefix = "date=%F/"
' > vector-configmap.toml
CONFIG
```
2. Deploy Vector!
Now that you have your custom `ConfigMap` ready, it's time to deploy
Vector. To ensure Vector is isolated and has the necessary permissions
we must create a `namespace`, `ServiceAccount`, `ClusterRole`, and
we must create a `Namespace`, `ServiceAccount`, `ClusterRole`, and
`ClusterRoleBinding`:
```bash
```shell
kubectl create namespace logging
kubectl create -f vector-service-account.yaml
kubectl create -f vector-role.yaml
@@ -126,7 +126,7 @@ TODO: insert diagram
kubectl create -f vector-daemonset.yaml
```
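The contents of `vector-role.yaml` are not shown in this commit. As a rough sketch only (the resource and verb lists are assumptions, not taken from the repository), a `ClusterRole` granting the watch-API access that the `kubernetes_pod_metadata` transform implies might look like:

```shell
# Hypothetical ClusterRole sketch; resources and verbs are assumptions
# based on the watch-API usage described in this RFC.
cat <<-ROLE > vector-role-sketch.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vector
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
ROLE
```

Whether `get` and `list` are strictly required alongside `watch` is one of the outstanding questions below.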
* *See [outstanding questions 3, 4, 5, 6, and 7](#outstanding-questions).*
- _See [outstanding questions 3, 4, 5, 6, and 7](#outstanding-questions)._
That's it!
@@ -138,14 +138,14 @@ TODO: fill in
1. [Filebeat k8s integration]
1. [Fluentbit k8s integration]
2. [Fluentd k8s integration]
3. [LogDNA k8s integration]
4. [Honeycomb integration]
3. [Bonzai logging operator] - This is approach is likely outside of the scope
1. [Fluentd k8s integration]
1. [LogDNA k8s integration]
1. [Honeycomb integration]
1. [Bonzai logging operator] - This approach is likely outside of the scope
of Vector's initial Kubernetes integration because it focuses more on
deployment strategies and topologies. There are likely some very useful
and interesting tactics in their approach though.
4. [Influx Helm charts]
1. [Influx Helm charts]
## Sales Pitch
@@ -157,23 +157,23 @@ See [motivation](#motivation).
## Alternatives
1. Not do this integration and rely solely on external community driven
1. Not do this integration and rely solely on external community-driven
integrations.
## Outstanding Questions
1. What is the minimal Kubernetes version that we want to support? See
[this comment][kubernetes_version_comment].
1. What is the best way to prevent Vector from ingesting its own logs? I'm assuming
that my [`kubectl` tutoria](#kubectl-interface) handles this with namespaces?
We'd just need to configure Vector to excluse this namespace?
that my [`kubectl` tutorial](#kubectl-interface) handles this with namespaces?
We'd just need to configure Vector to exclude this namespace?
1. I've seen two different installation strategies. For example, Fluentd offers
a [single daemonset configuration file][fluentd_daemonset] while Fluentbit
offers [four separate configuration files][fluentbit_installation]
(`service-account.yaml`, `role.yaml`, `role-binding.yaml`, `configmap.yaml`).
Which approach is better? Why are they different?
1. Should we prefer `kubectl create ...` or `kubectl apply ...`? The examples
in the [prior art](#prior-art) section use both.
in the [prior art](#prior-art) section use both.
1. From what I understand, Vector requires the Kubernetes `watch` verb in order
to receive updates to k8s cluster changes. This is required for the
`kubernetes_pod_metadata` transform. Yet, Fluentbit [requires the `get`,
@@ -186,7 +186,7 @@ See [motivation](#motivation).
For example [LogDNA's daemonset][logdna_daemonset]. I assume this is limiting
resources. Do we want to consider this?
1. What the hell is going on with [Honeycomb's integration
strategy][Hoenycomb integration]? :) It seems like the whole "Heapster"
strategy][honeycomb integration]? :) It seems like the whole "Heapster"
pipeline is specifically for system events, but Heapster is deprecated?
This leads me to my next question...
1. How are we collecting Kubernetes system events? Is that outside of the
@@ -202,7 +202,7 @@ See [motivation](#motivation).
- [ ] Support for customizable k8s clusters. See [issue#2170].
- [ ] Look into [issue#2225] and see if we can include it as part of this
work.
- [ ] Stabilize k8s integration tests. See [isue#2193], [issue#2216],
- [ ] Stabilize k8s integration tests. See [issue#2193], [issue#2216],
and [issue#1635].
- [ ] Ensure we are testing all supported minor versions. See
[issue#2223].
@@ -220,33 +220,34 @@ See [motivation](#motivation).
- [ ] Add Kubernetes setup/integration guide.
- [ ] Release `0.10.0` and announce.
[Bonzai logging operator]: https://github.com/banzaicloud/logging-operator
[bonzai logging operator]: https://github.com/banzaicloud/logging-operator
[daemonset]: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
[Filebeat k8s integration]: https://www.elastic.co/guide/en/beats/filebeat/master/running-on-kubernetes.html
[Fluentbit k8s integration]: https://docs.fluentbit.io/manual/installation/kubernetes
[filebeat k8s integration]: https://www.elastic.co/guide/en/beats/filebeat/master/running-on-kubernetes.html
[fluentbit k8s integration]: https://docs.fluentbit.io/manual/installation/kubernetes
[fluentbit_daemonset]: https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/elasticsearch/fluent-bit-ds.yaml
[fluentbit_installation]: https://docs.fluentbit.io/manual/installation/kubernetes#installation
[fluentbit_role]: https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role.yaml
[Fluentd k8s integration]: https://docs.fluentd.org/v/0.12/articles/kubernetes-fluentd
[fluentd k8s integration]: https://docs.fluentd.org/v/0.12/articles/kubernetes-fluentd
[fluentd_daemonset]: https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-papertrail.yaml
[guide_example]: https://vector.dev/guides/integrate/sources/syslog/aws_kinesis_firehose/
[Honeycomb integration]: https://docs.honeycomb.io/getting-data-in/integrations/kubernetes/
[Influx Helm charts]: https://github.com/influxdata/helm-charts
[honeycomb integration]: https://docs.honeycomb.io/getting-data-in/integrations/kubernetes/
[influx helm charts]: https://github.com/influxdata/helm-charts
[issue#1293]: https://github.com/timberio/vector/issues/1293
[issue#1635]: https://github.com/timberio/vector/issues/1635
[issue#1816]: https://github.com/timberio/vector/issues/1816
[issue#1867]: https://github.com/timberio/vector/issues/1867
[issue#1910]: https://github.com/timberio/vector/issues/1910
[issue#2170]: https://github.com/timberio/vector/issues/2170
[issue#2171]: https://github.com/timberio/vector/issues/2171
[issue#2193]: https://github.com/timberio/vector/issues/2193
[issue#2199]: https://github.com/timberio/vector/issues/2199
[issue#2216]: https://github.com/timberio/vector/issues/2216
[issue#2218]: https://github.com/timberio/vector/issues/2218
[issue#2223]: https://github.com/timberio/vector/issues/2223
[issue#2224]: https://github.com/timberio/vector/issues/2224
[issue#2225]: https://github.com/timberio/vector/issues/2225
[kubernetes_version_comment]: https://github.com/timberio/vector/pull/2188#discussion_r403120481
[LogDNA k8s integration]: https://docs.logdna.com/docs/kubernetes
[logdna k8s integration]: https://docs.logdna.com/docs/kubernetes
[logdna_daemonset]: https://raw.githubusercontent.com/logdna/logdna-agent/master/logdna-agent-ds.yaml
[pr#2134]: https://github.com/timberio/vector/pull/2134
[pr#2188]: https://github.com/timberio/vector/pull/2188
