Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "bgpd: Do not send Deconfig/Shutdown message when restarting" #18262

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Pdoijode
Copy link
Contributor

This reverts commit 95098d9.

This commit was merged in order to address the issue mentioned in #12030. However, in FRR, for graceful restart procedure to kick in, BGP must to be terminated with SIGKILL (which is what GR topotests do as well)

if user executes systemctl stop frr then BGP must send NOTIFICATION to its peers and exit cleanly.

@Pdoijode Pdoijode force-pushed the pdoijode/revert-gr-change branch from 8dcb15d to 0210ce1 Compare February 25, 2025 23:23
Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree here until we have a way to control this via systemd or so to send a SIGKILL instead of SIGINT. Most people are using routing on the host (running FRR on every server at scale) and expecting to restart the nodes gracefully (via systemctl restart frr) instead of killing them.

@Pdoijode
Copy link
Contributor Author

Pdoijode commented Feb 26, 2025

In FRR, the default GR mode is helper! So even when there's no graceful-restart configured, FRR will be in helper mode and GR capability will be exchanged. This basically means that BGP_PEER_GRACEFUL_RESTART_CAPABLE will always return true in the following function peer_notify_shutdown().. So everytime a user executes systemctl stop/restart frr, BGP will NOT send NOTIFICATION message to its peers which goes against what's mentioned in https://datatracker.ietf.org/doc/html/rfc4486#section-4:

https://datatracker.ietf.org/doc/html/rfc4486#section-4:

If a BGP speaker decides to administratively shut down its peering
   with a neighbor, then the speaker SHOULD send a NOTIFICATION message
   with the Error Code Cease and the Error Subcode "Administrative
   Shutdown".

current FRR code :

#define BGP_PEER_GRACEFUL_RESTART_CAPABLE(peer)                                \
        (CHECK_FLAG(peer->cap, PEER_CAP_RESTART_ADV)                           \
         && CHECK_FLAG(peer->cap, PEER_CAP_RESTART_RCV))

static void peer_notify_shutdown(struct peer *peer)
{
	if (BGP_PEER_GRACEFUL_RESTART_CAPABLE(peer)) {
		if (bgp_debug_neighbor_events(peer))
			zlog_debug(
				"%pBP configured Graceful-Restart, skipping shutdown notification",
				peer);
		return;
	}

	if (BGP_IS_VALID_STATE_FOR_NOTIF(peer->connection->status))
		bgp_notify_send(peer->connection, BGP_NOTIFY_CEASE,
				BGP_NOTIFY_CEASE_ADMIN_SHUTDOWN);
}

example:

test:
bgp_addpath_best_selected/test_bgp_addpath_best_selected.py(140)test_bgp_addpath_best_selected()


r1# show running-config 
Building configuration...

Current configuration:
!
frr version 10.4-dev-MyOwnFRRVersion
frr defaults traditional
hostname r1
log file /tmp/topotests/bgp_addpath_best_selected.test_bgp_addpath_best_selected/r1/bgpd-stop.log
log syslog
log timestamp precision 6
log commands
no service integrated-vtysh-config
!
debug bgp neighbor-events
debug bgp graceful-restart
!
interface r1-eth0
 ip address 192.168.1.1/24
exit
!
router bgp 65001
 no bgp ebgp-requires-policy
 timers bgp 3 10
 neighbor 192.168.1.2 remote-as external
 neighbor 192.168.1.2 timers connect 5
exit
!
end
r1#

r1# show bgp neighbors 192.168.1.2
BGP neighbor is 192.168.1.2, remote AS 65002, local AS 65001, external link
  Local Role: undefined
  Remote Role: undefined
Hostname: r2
  BGP version 4, remote router ID 192.168.7.2, local router ID 192.168.1.1
  BGP state = Established, up for 00:56:41
  Last read 00:00:02, Last write 00:00:02
  Hold time is 10 seconds, keepalive interval is 3 seconds
  Configured hold time is 10 seconds, keepalive interval is 3 seconds
  Configured tcp-mss is 0, synced tcp-mss is 1448
  Configured conditional advertisements interval is 60 seconds
  Neighbor capabilities:
    4 Byte AS: advertised and received
    Extended Message: advertised and received
    AddPath:
      IPv4 Unicast: TX received
      IPv4 Unicast: RX advertised and received
    Paths-Limit:
      IPv4 Unicast: advertised (0) and received (0)
    Long-lived Graceful Restart: advertised and received
      Address families by peer:
    Route refresh: advertised and received
    Enhanced Route Refresh: advertised and received
    Address Family IPv4 Unicast: advertised and received
    Hostname Capability: advertised (name: r1,domain name: n/a) received (name: r2,domain name: n/a)
    Version Capability: not advertised not received
    Link-Local Next Hop Capability: not advertised not received
    Graceful Restart Capability: advertised and received ========> GR cap adv and rcvd even without any GR config!
      Remote Restart timer is 120 seconds
      Address families by peer:
        none
  Graceful restart information:
    End-of-RIB send: IPv4 Unicast
    End-of-RIB received: IPv4 Unicast
    Local GR Mode: Helper* ===========> Default mode is global helper!
    Remote GR Mode: Helper
    R bit: True
    N bit: True
    Timers:
      Configured Restart Time(sec): 120
      Received Restart Time(sec): 120
      Configured LLGR Stale Path Time(sec): 0
    IPv4 Unicast:
      F bit: False
      End-of-RIB sent: Yes
      End-of-RIB sent after update: No
      End-of-RIB received: Yes
      Timers:
        Configured Stale Path Time(sec): 360
        LLGR Stale Path Time(sec): 0
  <SNIP>

Logs on another node r3 (different topo) where systemctl stop frr was executed. This node has no GR related configs and it's in global heper mode (since that's the default). From the logs, we can see that the NOTIFICATION message to peer is being skipped!

on r3:

2025/02/26 19:26:25.394462 ZEBRA: [PR3CF-TPMEZ] frr_sigevent_process: calling daemon handlers
2025/02/26 19:26:25.394508 ZEBRA: [SJE5T-NEPQ3] Handling SIGINT
2025/02/26 19:26:25.394514 ZEBRA: [XVBTQ-5QTVQ] Terminating on signal
2025/02/26 19:26:25.395458 MGMTD: [PR3CF-TPMEZ] frr_sigevent_process: calling daemon handlers
2025/02/26 19:26:25.395483 MGMTD: [J2RAS-MZ95C] Terminating on signal
2025/02/26 19:26:25.396742 STATIC: [X3G8F-PM93W] BE-client: mgmt_msg_read: got EOF/disconnect
2025/02/26 19:26:25.397880 BGP: [PR3CF-TPMEZ] frr_sigevent_process: calling daemon handlers
2025/02/26 19:26:25.397912 BGP: [ZW1GY-R46JE] Terminating on signal
2025/02/26 19:26:25.398944 STATIC: [PR3CF-TPMEZ] frr_sigevent_process: calling daemon handlers
2025/02/26 19:26:25.398959 STATIC: [MRN6F-AYZC4] Terminating on signal
2025/02/26 19:26:25.399085 ZEBRA: [YDZ55-W3VM6] release_daemon_table_chunks: Released 0 table chunks
2025/02/26 19:26:25.399516 ZEBRA: [JPSA8-5KYEA] client 50 disconnected 15 bgp routes removed from the rib
2025/02/26 19:26:25.399566 ZEBRA: [S929C-NZR3N] client 50 disconnected 0 bgp nhgs removed from the rib
2025/02/26 19:26:25.400359 BGP: [WRF9C-7WABP] BGP Terminating
2025/02/26 19:26:25.400420 BGP: [NC2TR-VN5DQ] 10.1.3.1(r1) configured Graceful-Restart, skipping unconfig notification
2025/02/26 19:26:25.400432 BGP: [NC2TR-VN5DQ] 10.2.3.1(r2) configured Graceful-Restart, skipping unconfig notification
2025/02/26 19:26:25.400433 BGP: [NC2TR-VN5DQ] 10.3.5.2(r5) configured Graceful-Restart, skipping unconfig notification
2025/02/26 19:26:25.400434 BGP: [NC2TR-VN5DQ] 10.3.6.2(r6) configured Graceful-Restart, skipping unconfig notification ========> NOTIFICATION to peer is skipped!
2025/02/26 19:26:25.400484 BGP: [HSDTE-7K6WF] BGP exiting.

on r1:

root@r1:mgmt:/var/home/cumulus# tail -f /var/log/frr/bgpd.log

2025/02/26 19:26:25.419764 BGP: [NJ2F2-2W769] 10.1.3.2 [Event] BGP connection closed fd 54
2025/02/26 19:26:25.420506 BGP: [NTX3S-9Q8YV] 10.1.3.2 [Event] BGP error 5 on fd 54
2025/02/26 19:26:25.420581 BGP: [ZWCSR-M7FG9] 10.1.3.2 [FSM] TCP_connection_closed (Established->Clearing), fd 54
2025/02/26 19:26:25.420608 BGP: [PXVXG-TFNNT] %ADJCHANGE: neighbor 10.1.3.2(r3) in vrf default Down Peer closed the session
2025/02/26 19:26:25.421207 BGP: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Clearing established_peers 1
2025/02/26 19:26:25.421353 BGP: [HKWM3-ZC5QP] 10.1.3.2 fd -1 went from Established to Clearing
2025/02/26 19:26:25.431412 BGP: [ZWCSR-M7FG9] 10.1.3.2 [FSM] Clearing_Completed (Clearing->Idle), fd -1
2025/02/26 19:26:25.431597 BGP: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Idle established_peers 1
2025/02/26 19:26:25.431601 BGP: [HKWM3-ZC5QP] 10.1.3.2 fd -1 went from Clearing to Idle

config on r3:

root@r3:mgmt:/var/home/cumulus# vtysh 

Hello, this is FRRouting (version 10.0.3).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

r3# show running-config 
Building configuration...

Current configuration:
!
hostname r3
log file /var/log/frr/bgpd.log
log syslog
log timestamp precision 6
service integrated-vtysh-config
!
debug bgp neighbor-events
debug bgp graceful-restart
!
vrf mgmt
exit-vrf
!
interface swp1
 description swp1 -> r1's swp1
exit
!
interface swp2
 description swp2 -> r2's swp1
exit
!
interface swp3
 description swp3 -> r5's swp1
exit
!
interface swp4
 description swp4 -> r6's swp1
exit
!
router bgp 65003
 bgp router-id 10.0.0.3
 neighbor EBGP peer-group
 neighbor EBGP ttl-security hops 1
 neighbor EBGP advertisement-interval 0
 neighbor EBGP timers 3 9
 neighbor EBGP timers connect 1
 neighbor 10.1.3.1 remote-as 65001
 neighbor 10.1.3.1 peer-group EBGP
 no neighbor 10.1.3.1 capability dynamic
 neighbor 10.2.3.1 remote-as 65002
 neighbor 10.2.3.1 peer-group EBGP
 no neighbor 10.2.3.1 capability dynamic
 neighbor 10.3.5.2 remote-as 65005
 neighbor 10.3.5.2 peer-group EBGP
 no neighbor 10.3.5.2 capability dynamic
 neighbor 10.3.6.2 remote-as 65006
 neighbor 10.3.6.2 peer-group EBGP
 no neighbor 10.3.6.2 capability dynamic
 !
 address-family ipv4 unicast
  network 10.0.0.3/32
  network 10.1.3.0/24
  network 10.2.3.0/24
  network 10.3.5.0/24
  network 10.3.6.0/24
 exit-address-family
exit
!
line vty
 exec-timeout 0 0
exit
!
end
r3# 

@Pdoijode
Copy link
Contributor Author

Pdoijode commented Feb 26, 2025

IMO, we need to revert #12034 for now and let users SIGKILL BGP to initiate GR procedure until systemctl stop frr is enhanced to send SIGKILL

@ton31337
Copy link
Member

We should, but right now stop does not use SIGKILL.

@donaldsharp
Copy link
Member

I don't think we've actually documented the proper procedure for using GR? If not then I'm not sure arguing about using stop is correct at all.

@Pdoijode
Copy link
Contributor Author

Pdoijode commented Feb 26, 2025

This is what I found in https://docs.frrouting.org/en/latest/bgp.html#graceful-restart. It says Kill BGP Process at R1. I assume kill here means SIGKILL? Since that's what our topotests do as well.. i.e frr GR topotests use "kill -9" to initiate GR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants