Merge pull request #3738 from telepresenceio/thallgren/client-only-nat
Introduce Virtual Network Address Translation (VNAT)
thallgren authored Dec 6, 2024
2 parents a7b9dd2 + 00d3b1e commit 45d59af
Showing 40 changed files with 1,142 additions and 349 deletions.
26 changes: 26 additions & 0 deletions CHANGELOG.yml
@@ -36,6 +36,21 @@ items:
- version: 2.21.0
date: TBD
notes:
- type: feature
title: Automatic subnet conflict avoidance
body: >-
Telepresence not only detects when the cluster's subnets conflict with subnets on the workstation, but
also avoids such conflicts by performing network address translation, placing a conflicting subnet in a
virtual subnet.
docs: https://telepresence.io/docs/reference/vpn
- type: feature
title: Virtual Network Address Translation (VNAT)
body: >-
It is now possible to use a virtual subnet without routing the affected IPs to a specific workload. A new
`telepresence connect --vnat CIDR` flag was added that will perform virtual network address translation of
cluster IPs. This flag is very similar to the `--proxy-via CIDR=WORKLOAD` introduced in 2.19, but without
the need to specify a workload.
docs: https://telepresence.io/docs/reference/vpn
- type: feature
title: Intercepts targeting a specific container
body: >-
@@ -131,6 +146,17 @@ items:
To achieve this, Telepresence temporarily adds the necessary network to the containerized daemon. This allows the new
container to join the same network. Additionally, Telepresence starts extra socat containers to handle port mapping,
ensuring that the desired ports are exposed to the local environment.
- type: feature
title: Prevent recursion in the Telepresence Virtual Network Interface (VIF)
body: >-
Network problems may arise when running Kubernetes locally (e.g., Docker Desktop, Kind, Minikube, k3s),
because the VIF on the host is also accessible from the cluster's nodes. A request that isn't handled by a
cluster resource might be routed back into the VIF and cause a recursion.
These recursions can now be prevented by setting the client configuration property
`routing.recursionBlockDuration` so that new connection attempts are temporarily blocked for a specific
IP:PORT pair immediately after an initial attempt, thereby effectively ending the recursion.
docs: https://telepresence.io/docs/howtos/cluster-in-vm
- type: feature
title: Allow Helm chart to be included as a sub-chart
body: >-
2 changes: 1 addition & 1 deletion cmd/traffic/cmd/manager/config/config.go
@@ -45,7 +45,7 @@ func (c *config) Run(ctx context.Context) error {
dlog.Infof(ctx, "Started watcher for ConfigMap %s", cfgConfigMapName)
defer dlog.Infof(ctx, "Ended watcher for ConfigMap %s", cfgConfigMapName)

// The Watch will perform a http GET call to the kubernetes API server, and that connection will not remain open forever
// The WatchConfig will perform a http GET call to the kubernetes API server, and that connection will not remain open forever
// so when it closes, the watch must start over. This goes on until the context is cancelled.
api := k8sapi.GetK8sInterface(ctx).CoreV1()
for ctx.Err() == nil {
Binary file removed docs/images/vpn-proxy-via.jpg
Binary file not shown.
Binary file added docs/images/vpn-vnat.jpg
15 changes: 8 additions & 7 deletions docs/reference/config.md
@@ -51,7 +51,6 @@ Values for `client.cluster` controls aspects on how client's connection to the t
| `mappedNamespaces` | Namespaces that will be mapped by default. | [sequence][yaml-seq] of [strings][yaml-str] | `[]` |
| `connectFromRootDaeamon` | Make connections to the cluster directly from the root daemon. | [boolean][yaml-bool] | `true` |
| `agentPortForward` | Let telepresence-client use port-forwards directly to agents | [boolean][yaml-bool] | `true` |
| `virtualIPSubnet` | The CIDR to use when generating virtual IPs | [CIDR][cidr] | platform dependent |

### DNS

@@ -208,12 +207,14 @@ Then all of the `alsoProxySubnets` of `10.0.0.0/16` will be proxied, with the ex

These are the valid fields for the `client.routing` key:

| Field | Description | Type |
|---------------------------|----------------------------------------------------------------------------------------|-------------------------|
| `alsoProxySubnets` | Proxy these subnets in addition to the service and pod subnets | [CIDR][cidr] |
| `neverProxySubnets` | Do not proxy these subnets | [CIDR][cidr] |
| `allowConflictingSubnets` | Give Telepresence precedence when these subnets conflict with other network interfaces | [CIDR][cidr] |
| `recursionBlockDuration` | Prevent recursion in VIF for this duration after a connect | [duration][go-duration] |
| Field | Description | Type | Default |
|---------------------------|----------------------------------------------------------------------------------------|-------------------------|--------------------|
| `alsoProxySubnets` | Proxy these subnets in addition to the service and pod subnets | [CIDR][cidr] | |
| `neverProxySubnets` | Do not proxy these subnets | [CIDR][cidr] | |
| `allowConflictingSubnets` | Give Telepresence precedence when these subnets conflict with other network interfaces | [CIDR][cidr] | |
| `recursionBlockDuration` | Prevent recursion in VIF for this duration after a connect | [duration][go-duration] | |
| `virtualSubnet` | The CIDR to use when generating virtual IPs | [CIDR][cidr] | platform dependent |
| `autoResolveConflicts` | Auto resolve conflicts using a virtual subnet | [bool][yaml-bool] | true |


### Timeouts
185 changes: 129 additions & 56 deletions docs/reference/vpn.md
@@ -4,10 +4,15 @@ title: Telepresence and VPNs

# Telepresence and VPNs

It is often important to set up Kubernetes API server endpoints to be only accessible via a VPN.
In setups like these, users need to connect first to their VPN, and then use Telepresence to connect
to their cluster. As Telepresence uses many of the same underlying technologies that VPNs use,
the two can sometimes conflict. This page will help you identify and resolve such VPN conflicts.
Telepresence creates a virtual network interface (VIF) when it connects. This VIF is configured to route the cluster's
service subnet and pod subnets so that the user can access resources in the cluster. It's not uncommon for the
workstation where Telepresence runs to already have network interfaces that route overlapping subnets. Such
conflicts must be resolved deterministically.

Unless configured otherwise, Telepresence will resolve subnet conflicts by simply moving the cluster's subnet using
network address translation. For a majority of use-cases, this will be enough.

For more info, see the section on how to [avoid the conflict](#avoiding-the-conflict) below.

## VPN Configuration

@@ -39,7 +44,7 @@ cluster will place resources in. Let's imagine your cluster is configured to pla

![VPN Kubernetes config](../images/vpn-k8s-config.jpg)

## Telepresence conflicts
# Telepresence conflicts

When you run `telepresence connect` to connect to a cluster, it talks to the API server
to figure out what pod and service CIDRs it needs to map in your machine. If it detects
@@ -53,53 +58,31 @@ telepresence connect: error: connector.Connect: failed to connect to root daemon

Telepresence offers three different ways to resolve this:

- [Allow the conflict](#allowing-the-conflict) in a controlled manner
- [Avoid the conflict](#avoiding-the-conflict) using the `--proxy-via` connect flag
- [Allow the conflict](#allowing-the-conflict) in a controlled manner
- [Use docker](#using-docker) to make telepresence run in a container with its own network config

### Allowing the conflict

One way to resolve this, is to carefully consider what your network layout looks like, and
then allow Telepresence to override the conflicting subnets.
Telepresence is refusing to map them, because mapping them could render certain hosts that
are inside the VPN completely unreachable. However, you (or your network admin) know better
than anyone how hosts are spread out inside your VPN.
Even if the private route routes ALL of `10.0.0.0/8`, it's possible that hosts are only
being spun up in one of the subblocks of the `/8` space. Let's say, for example,
that you happen to know that all your hosts in the VPN are bunched up in the first
half of the space -- `10.0.0.0/9` (and that you know that any new hosts will
only be assigned IP addresses from the `/9` block). In this case you
can configure Telepresence to override the other half of this CIDR block, which is where the
services and pods happen to be.
To do this, all you have to do is configure the `client.routing.allowConflictingSubnets` flag
in the Telepresence helm chart. You can do this directly via `telepresence helm upgrade`:

```console
$ telepresence helm upgrade --set client.routing.allowConflictingSubnets="{10.128.0.0/9}"
```
## Avoiding the conflict

You can also choose to be more specific about this, and only allow the CIDRs that you KNOW
are in use by the cluster:

```console
$ telepresence helm upgrade --set client.routing.allowConflictingSubnets="{10.130.0.0/16,10.132.0.0/16}"
```

The end result of this (assuming an allow list of `/9`) will be a configuration like this:

![VPN Telepresence](../images/vpn-with-tele.jpg)
Telepresence can perform Virtual Network Address Translation (henceforth referred to as VNAT) of the cluster's subnets
when routing them from the workstation, thus moving those subnets so that conflicts are avoided. Unless configured not
to, Telepresence will use VNAT by default when it detects conflicts.

### Avoiding the conflict
VNAT is enabled by passing the `--vnat` flag (introduced in Telepresence 2.21) to `telepresence connect`. When using this
flag, Telepresence will take the following actions:

An alternative to allowing the conflict is to remap the cluster's CIDRs to virtual CIDRs
on the workstation by passing a `--proxy-via` flag to `telepresence connect`.
- The local DNS-server will translate any IP contained in a VNAT subnet to a virtual IP.
- All access to a virtual IP will be translated back to its original when routed to the cluster.
- The container environment retrieved when using `ingest` or `intercept` will be mangled, so that all IPs contained
in VNAT subnets are replaced with corresponding virtual IPs.
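
Conceptually, the translation works like the short sketch below. This is an illustrative Go sketch, not Telepresence's
actual implementation: each cluster IP seen in a DNS answer is given a stable virtual IP inside the configured virtual
subnet, and the reverse mapping is applied when traffic leaves the workstation. The subnet and addresses used here are
hypothetical.

```go
// Illustrative VNAT mapping sketch (assumptions only; not Telepresence code).
package main

import (
	"fmt"
	"net/netip"
)

type vnatTable struct {
	subnet  netip.Prefix              // the virtual subnet, e.g. routing.virtualSubnet
	next    netip.Addr                // next free virtual IP
	forward map[netip.Addr]netip.Addr // cluster IP -> virtual IP
	reverse map[netip.Addr]netip.Addr // virtual IP -> cluster IP
}

func newVnatTable(subnet netip.Prefix) *vnatTable {
	return &vnatTable{
		subnet:  subnet,
		next:    subnet.Addr().Next(), // skip the network address itself
		forward: map[netip.Addr]netip.Addr{},
		reverse: map[netip.Addr]netip.Addr{},
	}
}

// translate returns the virtual IP used locally for a cluster IP, allocating a new
// one on first use. A real implementation would also handle exhaustion and expiry.
func (t *vnatTable) translate(clusterIP netip.Addr) netip.Addr {
	if v, ok := t.forward[clusterIP]; ok {
		return v
	}
	v := t.next
	t.next = v.Next()
	t.forward[clusterIP] = v
	t.reverse[v] = clusterIP
	return v
}

func main() {
	t := newVnatTable(netip.MustParsePrefix("100.10.20.0/24")) // hypothetical virtual subnet
	clusterIP := netip.MustParseAddr("10.130.4.17")            // IP from a cluster DNS answer
	fmt.Println(clusterIP, "->", t.translate(clusterIP))       // prints 10.130.4.17 -> 100.10.20.1
}
```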

The `telepresence connect` flag `--proxy-via`, introduced in Telepresence 2.19, will allow the local DNS-server to translate cluster subnets to virtual subnets on the workstation, and the VIF to do the reverse translation. The syntax for this new flag, which can be repeated, is:
The `--vnat` flag can be repeated to make Telepresence translate more than one subnet.

```console
$ telepresence connect --proxy-via CIDR=WORKLOAD
$ telepresence connect --vnat CIDR
```
Cluster DNS responses matching CIDR to virtual IPs that are routed (with reverse translation) via WORKLOAD. The CIDR can also be a symbolic name that identifies a subnet or list of subnets:
The CIDR can also be a symbolic name that identifies a well-known subnet or list of subnets:

| Symbol | Meaning |
|-----------|-------------------------------------|
@@ -108,38 +91,128 @@ Cluster DNS responses matching CIDR to virtual IPs that are routed (with reverse
| `pods` | The cluster's pod subnets. |
| `all` | All of the above. |

The WORKLOAD is the deployment, replicaset, statefulset, or argo-rollout in the cluster whose agent will be used for targeting the routed subnets.

This is useful in two situations:
### Virtual Subnet Configuration

1. The cluster's subnets collide with subnets otherwise available on the workstation. This is common when using a VPN, in particular if the VPN has a small subnet mask, making the subnet itself very large. The new `--proxy-via` flag can be used as an alternative to [allowing the conflict](#allowing-the-conflict) to take place, give Telepresence precedence, and thus hide the corresponding subnets from the conflicting subnet. The `--proxy-via` will instead reroute the cluster's subnet and hence, avoid the conflict.
2. The cluster's DNS is configured with domains that resolve to loop-back addresses (this is sometimes the case when the cluster uses a mesh configured to listen to a loopback address and then reroute from there). A loop-back address is not useful on the client, but the `--proxy-via` can reroute the loop-back address to a virtual IP that the client can use.
Telepresence will use a special subnet when it generates the virtual IPs that are used locally. On a Linux or macOS
workstation, this subnet will be a class E subnet (not normally used for any other purposes). On Windows, class E is
not routed, so Telepresence will instead default to `211.55.48.0/20`.

Subnet proxying is done by the client's DNS-resolver which translates the IPs returned by the cluster's DNS resolver to a virtual IP (VIP) to use on the client. Telepresence's VIF will detect when the VIP is used, and translate it back to the loop-back address on the pod.
The default subnet used can be overridden in the client configuration.

In `config.yml` on the workstation:
```yaml
routing:
  virtualSubnet: 100.10.20.0/24
```
Or as a Helm chart value to be applied on all clients:
```yaml
client:
  routing:
    virtualSubnet: 100.10.20.0/24
```
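
If you override the default, the subnet you pick should not collide with anything already routed on the workstation. The
following is a minimal sketch, not part of Telepresence, that checks a hypothetical candidate `virtualSubnet` against the
addresses assigned to local interfaces:

```go
// Minimal overlap check for a candidate virtual subnet (illustrative only).
package main

import (
	"fmt"
	"net"
	"net/netip"
)

func main() {
	candidate := netip.MustParsePrefix("100.10.20.0/24") // hypothetical virtualSubnet override

	addrs, err := net.InterfaceAddrs()
	if err != nil {
		panic(err)
	}
	for _, a := range addrs {
		ipNet, ok := a.(*net.IPNet)
		if !ok {
			continue
		}
		p, err := netip.ParsePrefix(ipNet.String())
		if err != nil {
			continue
		}
		if p.Masked().Overlaps(candidate) {
			fmt.Printf("conflict: %s overlaps local route %s\n", candidate, p.Masked())
		}
	}
	fmt.Println("check complete")
}
```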
#### Example
Let's assume that we have a conflict between the cluster's subnets, all covered by the CIDR `10.124.0.0/9` and a VPN
using `10.0.0.0/9`. We avoid the conflict using:

```console
$ telepresence connect --vnat all
```

The cluster's subnets are now hidden behind a virtual subnet, and the resulting configuration will look like this:

#### Proxy-via and using IP-addresses directly
![VPN Telepresence](../images/vpn-vnat.jpg)

If the service is using IP-addresses instead of domain-names when connecting to other cluster resources, then such connections will fail when running locally. The `--proxy-via` relies on the local DNS-server to translate the cluster's DNS responses, so that the IP of an `A` or `AAAA` response is replaced with a virtual IP from the configured subnet. If connections are made using an IP instead of a domain-name, then no such lookup is made. Telepresence has no way of detecting the direct use of IP-addresses.
### Proxying via a specific workload

#### Virtual IP Configuration
Telepresence is capable of routing all traffic destined for a VNAT subnet via a specific workload. This is particularly
useful when the cluster's DNS is configured with domains that resolve to loopback addresses. This is sometimes the case
when the cluster uses a mesh configured to listen on a loopback address and then reroute from there.

Telepresence will use a special subnet when it generates the virtual IPs that are used locally. On a Linux or macOS workstation, this subnet will be
a class E subnet (not normally used for any other purposes). On Windows, the class E is not routed, and Telepresence will instead default to `211.55.48.0/20`.
The `--proxy-via` flag (introduced in Telepresence 2.19) is similar to `--vnat`, but the argument must be in the form
`CIDR=WORKLOAD`. When using this flag, all traffic to the given CIDR will be routed via the given workload.

The default can be changed using the configuration `cluster.virtualIPSubnet`.
The WORKLOAD is the deployment, replicaset, statefulset, or argo-rollout in the cluster whose traffic-agent will be used
for targeting the routed subnets.

#### Example

Let's assume that we have a conflict between the cluster's subnets, all covered by the CIDR `10.124.0.0/9` and a VPN using `10.0.0.0/9`. We avoid the conflict using:
Let's assume that we have a conflict between the cluster's subnets, all covered by the CIDR `10.124.0.0/9` and a VPN
using `10.0.0.0/9`. We avoid the conflict using:

```console
$ telepresence connect --proxy-via all=echo
```

The cluster's subnets are now hidden behind a virtual subnet, and the resulting configuration will look like this:
The cluster's subnets are now hidden behind a virtual subnet, and all traffic is routed to the echo workload.

### Caveats when using VNAT

Telepresence may not accurately detect cluster-side IP addresses being used by services running locally on a workstation
in certain scenarios. This limitation arises when local services obtain IP addresses from remote sources such as
databases or configmaps, or when IP addresses are sent to them in API calls.
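
The snippet below illustrates the caveat with hypothetical service code (the service name and IP are made up, and this is
not Telepresence itself): a connection made by hostname goes through the local DNS server and therefore receives a
translatable virtual IP, while a raw cluster IP read from a configmap or API payload is dialed as-is and is never
translated.

```go
// Hypothetical local service code showing why raw cluster IPs bypass VNAT.
package main

import (
	"fmt"
	"net"
)

func main() {
	// Name-based: the lookup goes through Telepresence's local DNS server, so the
	// answer is a virtual IP that the VIF can route back to the cluster.
	if conn, err := net.Dial("tcp", "my-svc.default.svc.cluster.local:8080"); err == nil {
		fmt.Println("dialed via DNS:", conn.RemoteAddr())
		conn.Close()
	}

	// IP-based: a raw cluster IP (here pretend it was read from a configmap or an
	// API payload) bypasses DNS entirely, so VNAT never sees it and the connection
	// will typically fail when running locally.
	rawIP := "10.130.4.17" // hypothetical value obtained at runtime
	if _, err := net.Dial("tcp", rawIP+":8080"); err != nil {
		fmt.Println("raw cluster IP not translated:", err)
	}
}
```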

### Disabling default VNAT

The default behavior of using VNAT to resolve conflicts can be disabled by adding the following to the client config.

In `config.yml` on the workstation:
```yaml
routing:
  autoResolveConflicts: false
```

Or as a Helm chart value to be applied on all clients:
```yaml
client:
  routing:
    autoResolveConflicts: false
```

Explicitly allowing all conflicts will also effectively prevent the default VNAT behavior.

## Allowing the conflict

A conflict can be resolved by carefully considering what your network layout looks like, and then allowing Telepresence
to override the conflicting subnets. Telepresence refuses to map them, because mapping them could render certain hosts
that are inside the VPN completely unreachable. However, you (or your network admin) know better than anyone how hosts
are spread out inside your VPN.

![VPN Telepresence](../images/vpn-proxy-via.jpg)
Even if the private route routes ALL of `10.0.0.0/8`, it's possible that hosts are only being spun up in one of the
sub-blocks of the `/8` space. Let's say, for example, that you happen to know that all your hosts in the VPN are bunched
up in the first half of the space -- `10.0.0.0/9` (and that you know that any new hosts will only be assigned IP
addresses from the `/9` block). In this case you can configure Telepresence to override the other half of this CIDR
block, which is where the services and pods happen to be.
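
As a sanity check, you can verify this assumption programmatically. The short Go sketch below is illustrative only and
uses two example cluster CIDRs (hypothetical values); it confirms that they sit inside the half you plan to hand over to
Telepresence and do not overlap the half where the VPN hosts live:

```go
// Illustrative containment/overlap check for the split of 10.0.0.0/8.
package main

import (
	"fmt"
	"net/netip"
)

func main() {
	allowed := netip.MustParsePrefix("10.128.0.0/9") // half handed over to Telepresence
	vpnHosts := netip.MustParsePrefix("10.0.0.0/9")  // half where the VPN hosts live

	// Example cluster CIDRs you believe are in use (hypothetical values).
	for _, c := range []string{"10.130.0.0/16", "10.132.0.0/16"} {
		p := netip.MustParsePrefix(c)
		inAllowed := allowed.Contains(p.Addr()) && p.Bits() >= allowed.Bits()
		fmt.Printf("%s inside %s: %v, overlaps VPN half %s: %v\n",
			p, allowed, inAllowed, vpnHosts, vpnHosts.Overlaps(p))
	}
}
```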

To do this, configure the `client.routing.allowConflictingSubnets` setting, either locally in the client `config.yml`
or for all clients via the Telepresence Helm chart:

In `config.yml` on the workstation:
```yaml
routing:
  allowConflictingSubnets: 10.128.0.0/9
```

Or as a Helm chart configuration value to be applied on all clients:
```yaml
client:
  routing:
    allowConflictingSubnets: 10.128.0.0/9
```

Or pass the Helm chart configuration using the `--set` flag:
```console
$ telepresence helm upgrade --set client.routing.allowConflictingSubnets="{10.128.0.0/9}"
```

The end result of this (assuming an allowlist of `/9`) will be a configuration like this:

![VPN Telepresence](../images/vpn-with-tele.jpg)

### Using docker

Use `telepresence connect --docker` to make the Telepresence daemon containerized, which means that it has its own network configuration and therefore no conflict with a VPN. Read more about docker [here](docker-run.md).
Use `telepresence connect --docker` to make the Telepresence daemon containerized, which means that it has its own
network configuration and therefore no conflict with a VPN. Read more about docker [here](docker-run.md).