Add BGP Confederation support #6567

Open
notsrch opened this issue Jul 29, 2024 · 4 comments
Assignees
Labels
area/transit/bgp Issues or PRs related to BGP support. kind/feature Categorizes issue or PR as related to a new feature. reported-by/end-user Issues reported by end users.

Comments

@notsrch

notsrch commented Jul 29, 2024

Describe the problem/challenge you have
In large iBGP-based networks, it is useful to split the AS into multiple sub-ASes using BGP confederations. Without confederation support, there is no way to segment the BGP configuration and the failure domains.

Take for example our environment:
- Multiple large subnets of k8s nodes (each subnet is typically /20 or larger)
- BGP route reflectors in each subnet, running as a sub-AS
- Multiple subnets are sub-ASes of a main confederated AS
- An upstream Cisco router as the gateway for all subnets (it cannot run multiple BGP processes with a different local-AS for each process)

Without BGP confederation, either:

  1. All subnets use the same local-AS and peer with the same centralized BGP route reflectors. This is a single large failure domain.
  2. Or all subnets use the same local-AS with route reflectors local to each subnet. This causes n*(n-1) neighbor configurations (n*(n-1)/2 sessions) between route reflectors due to the full-mesh requirement.
  3. Or peering must rely on eBGP between the route reflectors and the upstream router, which causes slower, staggered convergence due to the eBGP hold-down timers.
  4. Or a separate VRF must be created for each subnet on the upstream router, with route leaking between the VRFs. This is less than ideal, as it typically relies on MP-BGP functionality that might not otherwise need to be enabled or used.

Describe the solution you'd like
Add support for BGP Confederation

Anything else you would like to add?
Sample frr-k8s config with basic BGP confederation, where 64512 is the sub-AS and 65535 is the main (confederation) AS:

router bgp 64512
  bgp confederation identifier 65535
  neighbor 192.168.0.11 remote-as 64512
  neighbor 192.168.0.12 remote-as 64512
  neighbor 192.168.0.13 remote-as 64512
  !
  address-family ipv4 unicast
    neighbor 192.168.0.11 activate
    neighbor 192.168.0.12 activate
    neighbor 192.168.0.13 activate
  exit-address-family

BGP peering diagram

image

@notsrch notsrch added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 29, 2024
@antoninbas
Contributor

Thanks for the detailed issue, we will look into it.
Since you seem pretty familiar with BGP confederation, would you consider proposing some changes to the BGPPolicy API in this issue - changes that would be necessary to enable the feature in Antrea?
Even some YAML examples of what BGPPolicy resources would ideally look like for your use case would be very useful IMO.

@antoninbas antoninbas added reported-by/end-user Issues reported by end users. area/transit/bgp Issues or PRs related to BGP support. labels Jul 29, 2024
@notsrch
Author

notsrch commented Jul 29, 2024

Based on the gobgp sample given here:
https://github.com/osrg/gobgp/blob/master/docs/sources/bgp-confederation.md

Here is a sample BGPPolicy that is roughly equivalent to the FRR config above:

apiVersion: crd.antrea.io/v1alpha1
kind: BGPPolicy
metadata:
  name: advertise-all-ips
spec:
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: "true"
  localASN: 64512

  confederation:
    enabled: true
    identifier: 65535
    memberaslist:
      - 65534
      - 64513
      - 64514
      - 64515

  advertisements:
    service:
      ipTypes: [LoadBalancerIP, ExternalIP]
    egress: {}
  bgpPeers:
    - address: 192.168.0.11
      asn: 64512
    - address: 192.168.0.12
      asn: 64512
    - address: 192.168.0.13
      asn: 64512

I am unsure of the purpose of the gobgp member-as-list, since the network equipment I am familiar with does not have it. The example above assumes it is required and bases it on the same BGP diagram given in the original comment. In theory, member-as-list makes sense on the receiving side to determine which ASes are sub-ASes of the confederation and which are not. But since Antrea's BGP implementation is essentially send-only, I'm not sure it would have any effect even if configured.
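
As a rough sketch of how the `confederation` block above might map onto a Go API type, something like the following could work. Everything here is hypothetical: the `Confederation` type, its field names and JSON tags, and the validation rules are assumptions made for illustration, not part of Antrea's current BGPPolicy API.

```go
package main

import (
	"errors"
	"fmt"
)

// Confederation is a hypothetical API type mirroring the `confederation`
// block in the sample BGPPolicy. It does not exist in Antrea today.
type Confederation struct {
	Enabled      bool     `json:"enabled"`
	Identifier   uint32   `json:"identifier"`
	MemberASList []uint32 `json:"memberaslist,omitempty"`
}

// validateConfederation applies a few illustrative sanity checks.
// The rules are assumptions, not requirements taken from Antrea or gobgp.
func validateConfederation(c *Confederation) error {
	if c == nil || !c.Enabled {
		return nil // nothing to validate when the feature is off
	}
	if c.Identifier == 0 {
		return errors.New("confederation identifier must be non-zero")
	}
	seen := make(map[uint32]bool, len(c.MemberASList))
	for _, as := range c.MemberASList {
		if as == 0 {
			return errors.New("member AS must be non-zero")
		}
		if seen[as] {
			return fmt.Errorf("duplicate member AS %d", as)
		}
		seen[as] = true
	}
	return nil
}

func main() {
	// Values copied from the sample BGPPolicy.
	c := &Confederation{
		Enabled:      true,
		Identifier:   65535,
		MemberASList: []uint32{65534, 64513, 64514, 64515},
	}
	if err := validateConfederation(c); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("sample confederation is valid")
}
```

Leaving `MemberASList` optional keeps the type usable if the member list turns out to be unnecessary for a send-only speaker.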

@antoninbas
Contributor

That seems pretty straightforward. We can easily extend the BGPPolicy API and in turn transform it into the appropriate gobgp configuration.
My only concern would be testing. Obviously we can easily verify that we are making the correct API calls to gobgp based on the contents of the BGPPolicy object. For e2e testing, we use a single "external" FRR router and we peer Antrea Nodes to it. I wonder if we can make a meaningful "confederation" test with such a topology. But maybe it is not truly necessary, as long as we can ensure that we configure gobgp correctly and assume it is doing the right thing.

BTW, currently we restrict the localASN for Antrea Nodes to be in the private ASN range (although I don't know if we have a valid reason to do so). Should we have the same restriction for the "main" confederation AS? In your case, it seems that you also use a number in the private range there, so maybe it's not a concern. But maybe you have any insights?

cc @hongliangl, who contributed the BGP support in Antrea. He is currently on leave but will be back in a couple of weeks.

@notsrch
Author

notsrch commented Jul 30, 2024

We use FRR for the route reflectors in the diagram.
A stripped-down standalone FRR config like this might work:

Current configuration:
!
frr version 7.5.1
frr defaults traditional
!
router bgp 64512
 bgp router-id 192.168.0.11
 bgp confederation identifier 65535
 neighbor k8s peer-group
 neighbor k8s remote-as 64512
 bgp listen range 192.168.0.0/24 peer-group k8s
 !
 address-family ipv4 unicast
  !
 exit-address-family
!
line vty
!
end

For prefix 10.0.0.34/32 advertised from the k8s cluster (Antrea BGP or FRR-k8s), the confed-internal flag on the second-to-last line of the output means the learned prefix is internal to the confederated AS. This is probably about as far as you can test with a single standalone FRR peer.

router# sh ip bgp ipv4 10.0.0.34/32
BGP routing table entry for 10.0.0.34/32
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  192.168.0.2 192.168.0.3 192.168.0.12 192.168.0.13
  Local, (Received from a RR-client)
    192.168.0.240 from 192.168.0.240 (100.64.1.1)
      Origin IGP, metric 0, localpref 100, valid, confed-internal, best (First path received)
      Last update: Fri Jul  5 15:44:33 2024
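
If an e2e test wanted to assert on that flag, a check could be sketched roughly as follows. This is only an illustration in Go: the helper name is made up, and it assumes the flags appear comma-separated on one line, as in the output above. How the output is collected from the FRR router is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// hasConfedInternalPath reports whether any line of the given
// `show ip bgp` output lists the confed-internal flag among its
// comma-separated path attributes.
func hasConfedInternalPath(output string) bool {
	for _, line := range strings.Split(output, "\n") {
		for _, field := range strings.Split(line, ",") {
			if strings.TrimSpace(field) == "confed-internal" {
				return true
			}
		}
	}
	return false
}

func main() {
	// Excerpt of the FRR output shown above.
	sample := `BGP routing table entry for 10.0.0.34/32
  Local, (Received from a RR-client)
    192.168.0.240 from 192.168.0.240 (100.64.1.1)
      Origin IGP, metric 0, localpref 100, valid, confed-internal, best (First path received)`
	fmt.Println(hasConfedInternalPath(sample))
}
```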

> BTW, currently we restrict the localASN for Antrea Nodes to be in the private ASN range (although I don't know if we have a valid reason to do so). Should we have the same restriction for the "main" confederation AS? In your case, it seems that you also use a number in the private range there, so maybe it's not a concern. But maybe you have any insights?

Our use case for this is internal (no direct internet peering), so we use private ranges throughout. The restriction would help protect unknowing users against bad configuration and unwise decisions. However, a k8s cluster that intentionally peers or operates on the public (internet) side would likely want to use public ASN ranges for the localASN, the confederation ASN, or both.
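
For reference, a private-range check could be sketched as below, using the private-use ranges from RFC 6996 (64512-65534 for 16-bit ASNs, 4200000000-4294967294 for 32-bit ASNs). The helper name is hypothetical and this is not Antrea's actual validation logic. Note that under these ranges 65535 falls just outside the private block, since it is a reserved ASN.

```go
package main

import "fmt"

// isPrivateASN reports whether asn falls in one of the private-use
// ranges defined by RFC 6996.
func isPrivateASN(asn uint32) bool {
	return (asn >= 64512 && asn <= 65534) ||
		(asn >= 4200000000 && asn <= 4294967294)
}

func main() {
	for _, asn := range []uint32{64512, 65534, 65535, 4200000000} {
		fmt.Printf("%d private: %t\n", asn, isPrivateASN(asn))
	}
}
```

The same helper could be applied to both the localASN and the confederation identifier if the restriction is kept, or skipped for either one if public ASNs are allowed.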
