Nomad Task using wrong interface for egress connection #24279

Open
daniel-tihanyi opened this issue Oct 23, 2024 · 1 comment

@daniel-tihanyi

Hello,

I seem to have an issue with egress connections from Nomad tasks.

Nomad version

1.8.4

Operating system and Environment details

Ubuntu 24.04 with multiple VLAN interfaces and VRFs.

Issue

TL;DR: Nomad bridges the task to the wrong interface when the task initiates an egress connection.
When I deploy a task with Nomad and configure its network using the "network" stanza, the bridge for incoming connections is set up correctly and ingress works, but egress connections use the wrong interface. In our infrastructure, multiple VLANs are switched to the same server over a trunk interface. On the server, multiple VLAN interfaces and VRFs are set up so that we have several network stacks that are independent of each other and cannot be routed internally.
I've seen that the CNI plugins provide a lot of functionality, but since our jobs and tasks use Consul Connect proxies, they are unfortunately not a usable workaround.

Reproduction steps

The idea is to create Docker containers with Nomad that are bridged to one of those VLAN interfaces. Both ingress and egress traffic should use the specified VLAN interface.
The interfaces are set up in nomad.hcl:

client {
  enabled = true
  host_network "vlan1" {
    interface = "vlan1"
  }
  host_network "vlan2" {
    interface = "vlan2"
  }
}

Specifying the network in the Jobspec looks like this:

network {
  mode = "bridge"
  port "container_port" {
    host_network = "vlan2"
  }
}
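
For anyone trying to reproduce this, a minimal sketch of a complete job file around that network block might look like the following. The job, group, and task names, the datacenter, the image, the command, and the `to` port are all placeholders chosen for illustration and are not taken from the original setup.

```hcl
# Hypothetical reproduction job; every name and value below is a placeholder.
job "egress-test" {
  datacenters = ["dc1"]

  group "app" {
    network {
      mode = "bridge"

      # Allocate the port on the "vlan2" host_network defined in nomad.hcl.
      port "container_port" {
        to           = 8080
        host_network = "vlan2"
      }
    }

    task "app" {
      driver = "docker"

      config {
        # Any image with a long-running process and basic network tools will do.
        image   = "busybox:1.36"
        command = "httpd"
        args    = ["-f", "-p", "8080"]
      }
    }
  }
}
```

With a job like this running, opening a connection from inside the container and checking the source address on the remote side should show which host interface the egress traffic actually leaves through.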

Expected Result

After starting the job, when I initiate a network connection from inside the container, I expect the egress connection to be bridged to the vlan2 interface.

Actual Result

The egress connection uses the interface on which the default gateway is defined.

@Juanadelacuesta
Member

Juanadelacuesta commented Oct 24, 2024

Hello @daniel-tihanyi, thank you for bringing this to our attention.
It looks like there is a small misunderstanding here: the gateway used for your egress connections is configured separately, as part of the VLAN. If none is configured, the default one is used, which seems to be what you are seeing. See the Docker bridge docs for more information: https://docs.docker.com/engine/network/drivers/bridge/ If you already have a gateway configured and are still seeing the same behaviour, could you please share some of your configuration so we can try to reproduce it? Thank you!

@Juanadelacuesta Juanadelacuesta moved this from Needs Triage to Needs Roadmapping in Nomad - Community Issues Triage Oct 24, 2024
@Juanadelacuesta Juanadelacuesta moved this from Needs Roadmapping to In Progress in Nomad - Community Issues Triage Oct 24, 2024