<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
<!ENTITY nbsp "&#160;">
<!ENTITY zwsp "&#8203;">
<!ENTITY nbhy "&#8209;">
<!ENTITY wj "&#8288;">
]>
<!-- This template is for creating an Internet Draft using xml2rfc,
which is available here: http://xml.resource.org. -->
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs),
please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that
most I-Ds might want to use.
(Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space
(using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="std" docName="draft-ietf-lsvr-bgp-spf-51"
ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true"
tocDepth="4" symRefs="true" sortRefs="true" version="3" consensus="true">
<!-- xml2rfc v2v3 conversion 3.12.1 -->
<!-- category values: std, bcp, info, exp, and historic
ipr values: full3667, noModification3667, noDerivatives3667
you can add the attributes updates="NNNN" and obsoletes="NNNN"
they will automatically be output with "(if approved)" -->
<!-- ***** FRONT MATTER ***** -->
<front>
<title abbrev="BGP Link-State SPF Routing">
BGP Link-State Shortest Path First (SPF) Routing</title>
<!-- add 'role="editor"' below for the editors if appropriate -->
<!-- Another author who claims to be an editor -->
<author fullname="Keyur Patel" initials="K" surname="Patel">
<organization>Arrcus, Inc.</organization>
<address>
<email>[email protected]</email>
</address>
</author>
<author fullname="Acee Lindem" initials="A" surname="Lindem">
<organization>LabN Consulting, LLC</organization>
<address>
<postal>
<street>301 Midenhall Way</street>
<city>Cary</city>
<region>NC</region>
<code>27513</code>
<country>USA</country>
</postal>
<email>[email protected]</email>
</address>
</author>
<author fullname="Shawn Zandi" initials="S" surname="Zandi">
<organization>LinkedIn</organization>
<address>
<postal>
<street>222 2nd Street</street>
<city>San Francisco</city>
<region>CA</region>
<code>94105</code>
<country>USA</country>
</postal>
<email>[email protected]</email>
</address>
</author>
<author fullname="Wim Henderickx" initials="W" surname="Henderickx">
<organization>Nokia</organization>
<address>
<postal>
<street>copernicuslaan 50</street>
<city>Antwerp</city>
<code>2018</code>
<country>Belgium</country>
</postal>
<email>[email protected]</email>
</address>
</author>
<date/>
<!-- Meta-data Declarations -->
<area>Routing</area>
<workgroup>Link State Vector Routing (LSVR) Working Group</workgroup>
<keyword>IDR</keyword>
<!-- Keywords will be incorporated into HTML output
files in a meta tag but they have no effect on text or nroff
output. If you submit your draft to the RFC Editor, the
keywords will be used for the search engine. -->
<abstract>
<t>
Many Massively Scaled Data Centers (MSDCs) have converged on simplified
L3 (Layer 3) routing. Furthermore, requirements for operational simplicity
have led many of these MSDCs to converge on BGP as their single routing
protocol for both their fabric routing and their Data Center Interconnect
(DCI) routing. This document describes extensions to BGP to use BGP
Link-State distribution and the Shortest Path First (SPF) algorithm.
In doing this, it allows
BGP to be efficiently used as both the underlay protocol and the overlay protocol in
MSDCs.
</t>
</abstract>
</front>
<middle>
<section anchor="introduction" numbered="true" toc="default">
<name>Introduction</name>
<t>
Many Massively Scaled Data Centers (MSDCs) have converged on simplified
L3 (Layer 3) routing. Furthermore, requirements for operational simplicity
have led many of these MSDCs to converge on BGP <xref target="RFC4271" format="default"/>
as their single routing protocol for both their fabric routing and
their Data Center Interconnect (DCI) routing <xref target="RFC7938" format="default"/>.
This document describes an alternative solution that leverages
BGP-LS <xref target="RFC9552" format="default"/>
and the Shortest Path First algorithm used by
Interior Gateway Protocols (IGPs).
</t>
<t>
This document leverages both the BGP protocol <xref target="RFC4271" format="default"/> and
the BGP-LS <xref target="RFC9552" format="default"/> extensions. The relationship to each, as well as
the scope of changes, is described in <xref target="BGP-base" format="default"/>
and <xref target="BGP-LS" format="default"/>, respectively. The modifications to
<xref target="RFC4271" format="default"/>
for BGP SPF described herein only apply to IPv4 and IPv6 as underlay unicast
Subsequent Address Family Identifiers (SAFIs).
Operations for any other BGP SAFIs are outside the scope of this document.
</t>
<t>
This solution realizes the benefits of both BGP and SPF-based IGPs.
These include TCP-based flow-control, no periodic link-state refresh, and
completely incremental NLRI advertisement. These advantages can reduce the
overhead in MSDCs where there is a high degree of Equal Cost Multi-Path
(ECMP) load-balancing.
Additionally, using an SPF-based computation can support fast convergence and
the computation of Loop-Free Alternatives (LFAs). The SPF LFA extensions defined
in <xref target="RFC5286" format="default"/> can be similarly applied to BGP SPF calculations.
However, the details are an implementation matter and are out of scope for this
document.
Furthermore, a BGP-based solution lends itself to multiple peering models
including those incorporating route-reflectors <xref target="RFC4456" format="default"/>
or controllers.
</t>
<section anchor="terms" numbered="true" toc="default">
<name>Terminology</name>
<t>
This specification reuses terms defined in section 1.1 of
<xref target="RFC4271" format="default"/>.
</t>
<t>Additionally, this document introduces the following terms:
</t>
<dl newline="false" spacing="normal">
<dt>BGP SPF Routing Domain:</dt>
<dd> A set of BGP routers that are under a single
administrative domain and exchange link-state information using the BGP-LS-SPF SAFI
and compute routes using BGP SPF as described herein.</dd>
<dt>BGP-LS-SPF NLRI:</dt>
<dd> This refers to BGP-LS Network Layer Reachability
Information (NLRI) that is being advertised in the BGP-LS-SPF SAFI (<xref target="SAFI" format="default"/>)
and is being used for BGP SPF route computation.</dd>
<dt>Dijkstra Algorithm:</dt>
<dd>
An algorithm for computing the shortest path from a given node in a graph
to every other node in the graph.
</dd>
<dt>Prefix NLRI:</dt>
<dd>
In the context of BGP SPF, this term refers to the IPv4 Topology Prefix NLRI,
the IPv6 Topology Prefix NLRI, or both.
</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>BGP Shortest Path First (SPF) Motivation</name>
<t>
Given that <xref target="RFC7938" format="default"/> already describes how BGP could be used
as the sole routing protocol in an MSDC, one might question the motivation for
defining an alternative BGP deployment model when a mature solution exists.
For both alternatives, BGP offers the operational benefits of a single
routing protocol as opposed to the combination of an IGP for the underlay
and BGP as an overlay. However, BGP SPF offers some unique advantages above
and beyond standard BGP path-vector routing. With BGP SPF, the
simple single-hop peering model recommended in section 5.2.1 of <xref target="RFC7938"/>
is augmented with peering models requiring fewer BGP sessions.
</t>
<t>
A primary advantage is that all BGP speakers in the BGP SPF routing domain
have a complete view of the topology. This allows support for ECMP,
IP fast-reroute (e.g., Loop-Free Alternatives) <xref target="RFC5286" format="default"/>,
Shared Risk Link Groups (SRLGs) <xref target="RFC4202"/>,
and other routing enhancements without advertisement of additional
BGP paths <xref target="RFC7911" format="default"/> or other extensions.
</t>
<t>
With the BGP SPF decision process as defined in
<xref target="bgp-decision" format="default"/>, NLRI changes can be disseminated throughout the BGP
routing domain much more rapidly than with standard BGP path-vector routing. In addition,
because BGP uses TCP for reliable transport, it benefits from TCP's inherent flow control
and guaranteed in-order delivery.
</t>
<t>
Another primary advantage is a potential reduction in NLRI advertisement.
With standard BGP path-vector routing, a single link failure may impact
hundreds or thousands of prefixes and result in the withdrawal or re-advertisement of
the attendant NLRI. With BGP SPF, only the BGP speakers originating
the link NLRI need to withdraw the corresponding BGP-LS-SPF Link NLRI. Additionally,
the changed NLRI is advertised immediately as opposed to normal BGP where it
is only advertised after the best route selection. These advantages provide
NLRI dissemination throughout the BGP SPF routing domain with efficiencies similar
to link-state protocols.
</t>
<t>
With controller and route-reflector peering models, BGP SPF advertisement
and distributed computation require a minimal number of sessions and
copies of the NLRI since only the latest version of the NLRI from the
originator is required (see <xref target="peering-models"/>).
Given that verification of whether or not to advertise a link (with a
BGP-LS-SPF Link NLRI) is done outside of BGP, each BGP
speaker only needs as many sessions and copies of the NLRI as required for
redundancy. Additionally, a controller could inject topology (i.e., BGP-LS-SPF NLRI)
that is learned outside the BGP SPF routing domain.
</t>
<t>
Given that BGP-LS NLRI is already defined
<xref target="RFC9552" format="default"/>, this functionality
can be reused for BGP-LS-SPF NLRI.
</t>
<t>
Another advantage of BGP SPF is that both IPv6 and IPv4 can
be supported using the BGP-LS-SPF SAFI with the same BGP-LS-SPF Link NLRIs. In many
MSDC fabrics, the IPv4 and IPv6 topologies are congruent (refer to
<xref target="Link-NLRI" format="default"/>).
Although beyond the scope of this document, BGP-LS-SPF NLRI multi-topology extensions could
be defined to support separate IPv4, IPv6, unicast, and multicast topologies
while sharing the same NLRI.
</t>
<t>
Finally, the BGP SPF topology can be used as an underlay for other BGP
SAFIs (using the existing model) and realize all the above
advantages.
</t>
</section>
<section numbered="true" toc="default">
<name>Document Overview</name>
<t>
The document begins with sections defining the precise relationship that BGP SPF has
with both the base BGP protocol <xref target="RFC4271" format="default"/> (<xref target="BGP-base" format="default"/>) and the
BGP Link-State (BGP-LS) extensions <xref target="RFC9552" format="default"/>
(<xref target="BGP-LS" format="default"/>). The BGP peering models, as well as
their respective trade-offs, are then discussed in
<xref target="peering-models" format="default"/>. The remaining sections, which make up the bulk of the
document, define the protocol enhancements necessary to support BGP SPF including BGP-LS Extensions
(<xref target="protocol-extend" format="default"/>), replacement of the base BGP decision process
with the SPF computation (<xref target="bgp-decision" format="default"/>), and BGP SPF error
handling (<xref target="error-handling" format="default"/>).
</t>
</section>
<section numbered="true" toc="default">
<name>Requirements Language</name>
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119" format="default"/> <xref target="RFC8174" format="default"/>
when, and only when, they appear in all capitals, as shown here.</t>
</section>
</section>
<!-- for Introductions section -->
<section anchor="BGP-base" numbered="true" toc="default">
<name>Base BGP Protocol Relationship</name>
<t>
With the exception of the decision process, the BGP SPF extensions leverage the BGP
protocol <xref target="RFC4271" format="default"/> without change. This includes the BGP protocol
Finite State Machine, BGP messages and their encodings, processing of BGP messages,
BGP attributes and path attributes, BGP NLRI encodings, and any error handling
defined in <xref target="RFC4271" format="default"/>, <xref target="RFC4760" format="default"/>,
and <xref target="RFC7606" format="default"/>.
</t>
<t>
Due to the changes to the decision
process, there are mechanisms and encodings that are no longer applicable.
Unless explicitly specified in the context of BGP SPF, all optional path
attributes SHOULD NOT be advertised. If received, all path attributes MUST
be accepted, validated, and propagated consistent with the BGP protocol
<xref target="RFC4271"/>, even if not needed by BGP SPF.
</t>
<t>
Section 9.1 of <xref target="RFC4271" format="default"/> defines the decision process that
is used to select routes for subsequent advertisement
by applying the policies in the local Policy Information Base (PIB) to the
routes stored in its Adj-RIBs-In. The output of the Decision Process is the
set of routes that are announced by a BGP speaker to its peers. These
selected routes are stored by a BGP speaker in the speaker's Adj-RIBs-Out
according to policy.
</t>
<t>
The BGP SPF extension fundamentally changes the decision process, as described
herein. Specifically:
</t>
<ol spacing="normal" type="1">
<li>
BGP advertisements are readvertised to neighbors immediately without waiting
for, or depending on, the route computation as specified in phase 3 of the base BGP
decision process. Multiple peering models are supported as specified in
<xref target="peering-models" format="default"/>.
</li>
<li>
Determining the degree of preference for BGP routes for the SPF calculation as
described in phase 1 of the base BGP decision process is replaced with the mechanisms
in <xref target="Phase-1" format="default"/>.
</li>
<li>
Phase 2 of the base BGP protocol decision process is replaced with the
Shortest Path First (SPF) algorithm, also known as the Dijkstra algorithm.
</li>
</ol>
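<t>
As a non-normative illustration of the algorithm named in the last item, the
following is a minimal sketch of the textbook Dijkstra computation. The graph
representation, the function name, and the four-node fabric are assumptions
made for the example. It omits everything specific to BGP SPF, such as the
handling of the SPF Status TLV, and is not the procedure specified in this document.
</t>
<sourcecode type="python"><![CDATA[
```python
import heapq


def dijkstra(graph, root):
    """Return the lowest cost from root to every reachable node.

    graph maps a node name to a dict of {neighbor: link metric}.
    """
    cost = {root: 0}
    heap = [(0, root)]
    done = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for an already settled node
        done.add(node)
        for neighbor, metric in graph.get(node, {}).items():
            candidate = dist + metric
            if candidate < cost.get(neighbor, float("inf")):
                cost[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return cost


# Hypothetical fabric: two leaves, two spines, metric 10 on every link.
fabric = {
    "leaf1": {"spine1": 10, "spine2": 10},
    "leaf2": {"spine1": 10, "spine2": 10},
    "spine1": {"leaf1": 10, "leaf2": 10},
    "spine2": {"leaf1": 10, "leaf2": 10},
}
costs = dijkstra(fabric, "leaf1")
```
]]></sourcecode>
<t>
In this hypothetical fabric, leaf2 is reached at cost 20 through either spine,
which is the kind of equal-cost situation a complete topology view exposes.
</t>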
</section>
<!-- for BGP relationship section -->
<section anchor="BGP-LS" numbered="true" toc="default">
<name>BGP Link-State (BGP-LS) Relationship</name>
<t>
<xref target="RFC9552" format="default"/> describes a mechanism by
which link-state and Traffic Engineering (TE) information can be collected from networks and shared with external
entities using BGP.
This is achieved by defining NLRI advertised using the BGP-LS AFI. The BGP-LS extensions defined in
<xref target="RFC9552" format="default"/> make use of the decision process defined in
<xref target="RFC4271" format="default"/>. Rather than reusing the BGP-LS SAFI, the BGP-LS-SPF SAFI
(<xref target="SAFI" format="default"/>) is introduced to ensure backward compatibility
for the BGP-LS SAFI usage.
</t>
<t>
The "BGP-LS NLRI and Attribute TLVs" registry <xref target="RFC9552"/> is shared between
the BGP-LS SAFI and the BGP-LS-SPF SAFI.
However, the TLVs defined in this document may not be applicable to
the BGP-LS SAFI. As specified in Section 5.1 of <xref target="RFC9552"/>, the presence
of unknown or unexpected TLVs is required to not result in the NLRI or
the BGP-LS Attribute being considered
malformed (section 5.2 of <xref target="RFC9552"/>). The list of BGP-LS TLVs applicable
to the BGP-LS-SPF SAFI is described in
<xref target="NLRI-Use"/>. By default, other BGP-LS TLVs and
extensions are ignored for the BGP-LS-SPF SAFI. However, this doesn't preclude future
documents from specifying the usage of these TLVs for the BGP-LS-SPF SAFI.
</t>
</section>
<!-- for BGP-LS relationship section -->
<section anchor="peering-models" numbered="true" toc="default">
<name>BGP SPF Peering Models</name>
<t>
Depending on the topology, scaling, capabilities of the BGP speakers, and redundancy
requirements, various peering models are supported. The only requirement is that all BGP
speakers in the BGP SPF routing domain adhere to this specification.
</t>
<t>
The choice of the deployment model is up to the operator and their requirements and policies.
Deployment model choice is out of scope for this document and is discussed in
<xref target="I-D.ietf-lsvr-applicability" format="default"/>. The sub-sections below
describe several BGP SPF deployment models. However, this doesn't preclude other
deployment models.
</t>
<section anchor="single-hop-peering" numbered="true" toc="default">
<name>BGP Single-Hop Peering on Network Node Connections</name>
<t>
The simplest peering model is the one where
EBGP single-hop sessions are established over direct point-to-point links
interconnecting the nodes in the BGP SPF routing domain. Once the single-hop BGP session has been
established and the Multi-Protocol Extensions Capability with the BGP-LS-SPF AFI/SAFI has been exchanged
<xref target="RFC4760" format="default"/> for the corresponding session, then the link is considered up from
a BGP SPF perspective and the corresponding BGP-LS-SPF Link NLRI is advertised.
</t>
<t>
An End-of-RIB (EoR) Marker (<xref target="BGP-LS-SPF-EOR"/>) for the BGP-LS-SPF
SAFI MAY be required from a peer prior to advertising the BGP-LS-SPF Link NLRI
for the corresponding link to that peer. When required, the default is to
wait indefinitely for the EoR Marker prior to advertising the BGP-LS-SPF Link NLRI.
Refer to <xref target="Adjacency-EoR-Required"/>.
</t>
<t>
A failure to consistently configure the use of the EoR marker can
result in transient micro-loops and dropped traffic due to incomplete
forwarding state.
</t>
<t>
If the session goes down, the corresponding Link NLRI are withdrawn. Topologically,
this would be equivalent to the peering model in <xref target="RFC7938" format="default"/> where there
is a BGP session on every link in the data center switch fabric. The content of the Link NLRI
is described in <xref target="Link-NLRI" format="default"/>.
</t>
</section>
<section numbered="true" toc="default">
<name>BGP Peering Between Directly-Connected Nodes</name>
<t>
In this model, BGP speakers peer with all directly-connected
nodes but the sessions may be between loopback addresses (i.e.,
two-hop sessions) and the direct connection
discovery and liveness detection for the interconnecting links are
independent of the BGP protocol. The BFD protocol <xref target="RFC5880" format="default"/>
is RECOMMENDED for liveness detection. Usage of other liveness detection mechanisms
is outside the scope of this document.
Consequently, there is a single BGP session even if there are multiple
direct connections between BGP speakers. The BGP-LS-SPF Link NLRI is advertised
as long as a BGP session has been established, the BGP-LS-SPF AFI/SAFI
capability has been exchanged <xref target="RFC4760" format="default"/>,
the link is operational as determined using liveness detection mechanisms,
and, optionally, the EoR Marker has been received as described in
<xref target="BGP-LS-SPF-EOR"/>.
This is much like the previous peering model except that peering is between
loopback addresses and the interconnecting links can be unnumbered. However,
since there is a single BGP session between each pair of directly-connected nodes in the
BGP SPF routing domain, there is a reduction in BGP sessions when there
are parallel links between nodes. Hence, this peering model is RECOMMENDED
over the single-hop peering model (<xref target="single-hop-peering"/>).
</t>
</section>
<section numbered="true" toc="default">
<name>BGP Peering in Route-Reflector or Controller Topology</name>
<t>
In this model, BGP speakers peer solely with one or more Route Reflectors
<xref target="RFC4456" format="default"/> or controllers. As in the previous model, direct
connection discovery and liveness detection for those links in the BGP
SPF routing domain are done outside of the BGP protocol.
BGP-LS-SPF Link NLRI is advertised as long as the corresponding link is
considered up as per the chosen liveness detection mechanism (the BFD protocol
<xref target="RFC5880" format="default"/> is RECOMMENDED).
</t>
<t>
This peering model, known as sparse peering, allows for fewer BGP sessions
and, consequently, fewer instances of the same NLRI received from multiple peers.
Ideally, the route-reflector or controller BGP sessions would be on directly-connected
links to avoid dependence on another routing protocol for session connectivity. However,
multi-hop peering is not precluded. The number of BGP sessions is dependent
on the redundancy requirements and the stability of the BGP sessions.
</t>
<t>
The controller may use constraints to determine
when to advertise BGP-LS-SPF NLRI for BGP-LS peers. For example, a controller
may delay advertisement of a link between two peers until the EoR
marker <xref target="BGP-LS-SPF-EOR"/> has been
received from both BGP peers and the BGP-LS Link NLRI for the link(s) between the two nodes
have been received from both BGP peers.
</t>
</section>
</section>
<section anchor="protocol-extend" numbered="true" toc="default">
<name>BGP Shortest Path Routing (SPF) Protocol Extensions</name>
<section anchor="SAFI" numbered="true" toc="default">
<name>BGP-LS Shortest Path Routing (SPF) SAFI</name>
<t>
This document introduces the BGP-LS-SPF SAFI with a value of 80.
The SPF-based decision process (<xref target="bgp-decision" format="default"/>) applies only to the
BGP-LS-SPF SAFI and MUST NOT be used with other combinations of
the BGP-LS AFI (16388). In order for two BGP speakers to
exchange BGP-LS-SPF NLRI, they MUST exchange the Multiprotocol
Extensions Capability <xref target="RFC4760" format="default"/>
to ensure that they are both capable of properly processing such
NLRI. This is done with AFI 16388 / SAFI 80. The BGP-LS-SPF SAFI
is used to advertise IPv4 and IPv6 prefix information in a
format facilitating an SPF-based decision process.
</t>
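<t>
As a non-normative illustration, the following sketch encodes the Multiprotocol
Extensions Capability for AFI 16388 / SAFI 80 using the layout from
<xref target="RFC4760" format="default"/> (capability code 1, then a 4-octet value
holding the AFI, a reserved octet, and the SAFI). The function and constant names
are assumptions made for the example.
</t>
<sourcecode type="python"><![CDATA[
```python
import struct

BGP_LS_AFI = 16388         # BGP-LS AFI, from the text above
BGP_LS_SPF_SAFI = 80       # BGP-LS-SPF SAFI, from the text above
MP_EXT_CAPABILITY = 1      # Multiprotocol Extensions capability code


def encode_mp_capability(afi, safi):
    """Return the capability as code, length, AFI, reserved, SAFI."""
    value = struct.pack("!HBB", afi, 0, safi)
    return struct.pack("!BB", MP_EXT_CAPABILITY, len(value)) + value


def can_exchange_bgp_ls_spf(local_families, peer_families):
    """True only if both speakers announced (AFI 16388, SAFI 80)."""
    family = (BGP_LS_AFI, BGP_LS_SPF_SAFI)
    return family in local_families and family in peer_families


capability = encode_mp_capability(BGP_LS_AFI, BGP_LS_SPF_SAFI)
```
]]></sourcecode>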
<section anchor="BGP-LS-TLV" numbered="true" toc="default">
<name>BGP-LS-SPF NLRI TLVs</name>
<t>
All the TLVs defined for BGP-LS <xref target="RFC9552" format="default"/>
are applicable and can be used with the BGP-LS-SPF SAFI to describe links, nodes,
and prefixes comprising BGP-SPF LSDB information.
</t>
<t>
The NLRI and comprising TLVs MUST be encoded as specified in
Section 5.1 of <xref target="RFC9552" format="default"/>. TLVs specified as
mandatory in <xref target="RFC9552" format="default"/> are
considered mandatory for the BGP-LS-SPF SAFI as
well. If a mandatory TLV is not present, the NLRI MUST NOT be used in the
BGP SPF route calculation. All the other TLVs are considered as optional TLVs. Documents
specifying usage of optional TLVs for BGP SPF MUST address backward compatibility.
</t>
</section>
<section numbered="true" toc="default">
<name>BGP-LS Attribute</name>
<t>
The BGP-LS attribute of the BGP-LS-SPF SAFI uses exactly the same format as that of the BGP-LS AFI
<xref target="RFC9552" format="default"/>. In
other words, all the TLVs used in the BGP-LS attribute of the BGP-LS AFI are applicable
and used for the BGP-LS attribute of the BGP-LS-SPF SAFI. This attribute is an optional,
non-transitive BGP attribute that is used to carry link, node, and prefix
properties and attributes. The BGP-LS attribute is a set of TLVs.
</t>
<t>
All the TLVs defined for the BGP-LS Attribute <xref target="RFC9552" format="default"/>
are applicable and can be used with the BGP-LS-SPF SAFI to carry link, node, and prefix
properties and attributes.
</t>
<t>
The BGP-LS attribute may potentially be quite large depending on
the amount of link-state information associated with a single BGP-LS-SPF NLRI.
The BGP specification <xref target="RFC4271" format="default"/> mandates a maximum BGP
message size of 4096 octets. It is RECOMMENDED that an
implementation support <xref target="RFC8654" format="default"/> in order to accommodate a greater
amount of information within the BGP-LS Attribute. BGP speakers MUST
ensure that they limit the TLVs included in the BGP-LS Attribute to
ensure that a BGP update message for a single BGP-LS-SPF NLRI does
not cross the maximum limit for a BGP message. The determination of
the types of TLVs to be included by the BGP speaker
originating the attribute is outside the scope of this document.
If, due to the limits on the maximum size of an UPDATE message, a single
route doesn't fit into the message, the BGP speaker MUST NOT advertise the
route to its peer and MAY choose to log an error locally <xref target="RFC4271"/>.
</t>
</section>
</section>
<section anchor="NLRI-Use" numbered="true" toc="default">
<name>Extensions to BGP-LS</name>
<t>
<xref target="RFC9552" format="default"/> describes a mechanism
by which link-state and TE
information can be collected from IGPs and shared with external components
using the BGP protocol. It describes both the definition of the BGP-LS NLRI
that advertise links, nodes, and prefixes comprising IGP link-state
information and the definition of a BGP path attribute (BGP-LS
attribute) that carries link, node, and prefix properties and
attributes, such as the link and prefix metric or auxiliary
Router-IDs of nodes, etc. This document extends the usage of BGP-LS NLRI for
the purpose of BGP SPF calculation via advertisement in the BGP-LS-SPF SAFI.
</t>
<t>
The protocol identifier specified in the Protocol-ID field
<xref target="RFC9552" format="default"/>
represents the origin of the advertised NLRI. For Node NLRI and Link NLRI,
the specified Protocol-ID MUST be the direct protocol (4). Node or Link NLRI with a
Protocol-ID other than the direct protocol is considered malformed.
For Prefix NLRI, the specified Protocol-ID
MUST be the origin of the prefix. The local and remote node descriptors for all NLRI MUST
include the BGP Router-ID (TLV 516) <xref target="RFC9086"/>
and the AS Number (TLV 512)
<xref target="RFC9552" format="default"/>.
The BGP Confederation Member (TLV 517)
<xref target="RFC9086" format="default"/> is not applicable.
</t>
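<t>
As a non-normative illustration, the checks in the preceding paragraph can be
sketched as follows. The function name, the string labels for the NLRI kinds, and
the representation of node descriptors as sets of TLV code points are assumptions
made for the example. The sketch only reports problems and does not decide how a
receiver acts on them.
</t>
<sourcecode type="python"><![CDATA[
```python
PROTOCOL_ID_DIRECT = 4      # "direct protocol (4)" from the text above
TLV_AS_NUMBER = 512
TLV_BGP_ROUTER_ID = 516
REQUIRED_NODE_TLVS = frozenset({TLV_AS_NUMBER, TLV_BGP_ROUTER_ID})


def nlri_problems(kind, protocol_id, node_descriptors):
    """List the rule violations found in one NLRI.

    kind is "node", "link", or "prefix" (labels chosen for this sketch).
    node_descriptors is a list with one set of TLV codes per node
    descriptor carried in the NLRI (local, and remote where present).
    """
    problems = []
    if kind in ("node", "link") and protocol_id != PROTOCOL_ID_DIRECT:
        problems.append("Protocol-ID is not direct (4)")
    for tlvs in node_descriptors:
        missing = REQUIRED_NODE_TLVS - set(tlvs)
        if missing:
            problems.append(
                "node descriptor missing TLVs %s" % sorted(missing))
    return problems


good_link = nlri_problems("link", 4, [{512, 516}, {512, 516}])
bad_node = nlri_problems("node", 5, [{512}])
```
]]></sourcecode>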
<section numbered="true" toc="default">
<name>Node NLRI Usage</name>
<t>
The Node NLRI MUST be advertised unconditionally by all routers in
the BGP SPF routing domain.
</t>
<section anchor="node-status-tlv" numbered="true" toc="default">
<name>BGP-LS-SPF Node NLRI Attribute SPF Status TLV</name>
<t>
A BGP-LS Attribute SPF Status TLV of the BGP-LS-SPF Node NLRI is defined to indicate the status of
the node with respect to the BGP SPF calculation. This is used to rapidly take a
node out of service (refer to <xref target="node-failure" format="default"/>)
or to indicate the node is not to be
used for transit (i.e., non-local) traffic (refer to <xref target="BGP-SPF" format="default"/>).
If the SPF Status TLV is not included with the Node NLRI, the node is considered to be up
and is available for transit traffic. A single TLV type is shared by the Node, Link, and
Prefix NLRI. The TLV type is 1184.
</t>
<artwork align="left" name="" type="" alt=""><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Type (1184)          |       Length (1 Octet)        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  SPF Status   |
+-+-+-+-+-+-+-+-+

SPF Status Values: 0 - Reserved
                   1 - Node unreachable with respect to BGP SPF
                   2 - Node does not support transit with respect
                       to BGP SPF
                   3-254 - Undefined
                   255 - Reserved
]]></artwork>
<t>
If a BGP speaker receives the Node NLRI without
the SPF Status TLV, then any previously received SPF status information is
considered as implicitly withdrawn and the NLRI is propagated to other BGP speakers.
If a BGP speaker receives a BGP Update containing
an SPF Status TLV in the BGP-LS attribute <xref target="RFC9552" format="default"/>
with an unknown value, the SPF Status TLV SHOULD be advertised to other
BGP speakers and MUST be ignored in
the SPF computation.
An implementation MAY log this condition for further analysis.
If the SPF Status TLV contains a reserved value (0 or 255), the TLV is considered malformed and
is handled as described in <xref target="new-TLVs"/>.
</t>
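<t>
The handling of the Node NLRI SPF Status values described above can be sketched as follows.
This is a minimal illustration, not a normative encoding: the function names and the returned
labels are hypothetical, and treating a TLV with an unexpected length as malformed is an
assumption of the sketch.
</t>
<sourcecode type="python"><![CDATA[
```python
import struct
from typing import Optional

SPF_STATUS_TLV_TYPE = 1184  # shared by Node, Link, and Prefix NLRI


def encode_spf_status(status: int) -> bytes:
    """Pack Type (2 octets), Length (2 octets, always 1), SPF Status (1 octet)."""
    return struct.pack("!HHB", SPF_STATUS_TLV_TYPE, 1, status)


def classify_node_status(tlv: Optional[bytes]) -> str:
    """Map a received Node SPF Status TLV (or its absence) to a label."""
    if tlv is None:
        return "up"  # no TLV: node is up and available for transit
    if len(tlv) != 5:
        return "malformed"  # assumption: wrong size is treated as malformed
    tlv_type, length, status = struct.unpack("!HHB", tlv)
    if tlv_type != SPF_STATUS_TLV_TYPE or length != 1:
        return "malformed"
    if status in (0, 255):
        return "malformed"  # reserved values
    if status == 1:
        return "unreachable"
    if status == 2:
        return "no-transit"
    return "unknown-ignored"  # 3-254: propagated, but ignored in the SPF
```
]]></sourcecode>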
</section>
</section>
<section anchor="Link-NLRI" numbered="true" toc="default">
<name>Link NLRI Usage</name>
<t>
The criteria for advertisement of Link NLRI are discussed in
<xref target="peering-models" format="default"/>.
</t>
<t>
Link NLRI is advertised with unique local and remote node descriptors
dependent on the IP addressing. For IPv4 links, the
link's local IPv4 (TLV 259) and remote IPv4 (TLV 260) addresses are used.
For IPv6 links, the local IPv6 (TLV 261) and remote IPv6 (TLV 262) addresses
are used (Section 5.2.2 of <xref target="RFC9552"/>). IPv6 links without
global IPv6 addresses are considered unnumbered links and are handled as
described below.
For links having both IPv4 and IPv6 addresses, both sets
of descriptors MAY be included in the same Link NLRI.
</t>
<t>
For unnumbered links, the Link Local/Remote Identifiers (TLV 258)
are used. The Link Remote Identifier isn't normally exchanged in BGP
and discovering the Link Remote Identifier is beyond the scope of this
document. If the Link Remote Identifier is unknown, a Link Remote Identifier of
0 MUST be advertised. When 0 is advertised and there are parallel unnumbered links
between a pair of BGP speakers, there may be transient intervals where the
BGP speakers don't agree on which of the parallel unnumbered links are operational.
For this reason, it is RECOMMENDED that the Link Remote Identifiers be
known (e.g., discovered using alternate mechanisms or configured) in the presence
of parallel unnumbered links.
</t>
<t>
The link descriptors are
described in Table 4 of <xref target="RFC9552" format="default"/>.
Additionally, the Address Family Link Descriptor TLV is defined to determine whether an
unnumbered link can be used in the IPv4 SPF, the IPv6 SPF, or both (refer to
<xref target="af-link-descriptor-tlv"/>).
</t>
<t>
For a link to be used in SPF computation for a given address family,
i.e., IPv4 or IPv6, both routers connecting the link MUST have matching addresses (i.e.,
router interface addresses must be on the same subnet for numbered interfaces and the
local/remote link identifiers (<xref target="BGP-SPF"/>) must match for unnumbered
interfaces).
</t>
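<t>
The matching condition above might be checked along the following lines. This is only a sketch
under stated assumptions: the function names are hypothetical, interface addresses are given in
address/prefix-length form, and the unnumbered check simply cross-compares the identifiers
advertised by each end. The case of an unknown Link Remote Identifier (0) is not covered.
</t>
<sourcecode type="python"><![CDATA[
```python
import ipaddress


def numbered_link_matches(local_if: str, remote_if: str) -> bool:
    """True if both interface addresses are in the same family and subnet."""
    a = ipaddress.ip_interface(local_if)
    b = ipaddress.ip_interface(remote_if)
    return a.version == b.version and a.network == b.network


def unnumbered_link_matches(a_local: int, a_remote: int,
                            b_local: int, b_remote: int) -> bool:
    """True if each end's local identifier is the other end's remote one."""
    return a_local == b_remote and a_remote == b_local
```
]]></sourcecode>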
<t>
The IGP metric attribute TLV (TLV 1095) MUST be advertised. If a BGP speaker
receives a Link NLRI without an IGP metric attribute TLV, then it MUST consider
the received NLRI malformed (refer to <xref target="error-handling"/>).
The BGP SPF metric length is 4 octets. A metric is associated with the output side of each
router interface. This metric is configurable by the system administrator. The
lower the metric, the more likely the interface is to be used to forward data traffic.
One possible default for the metric would be to give each interface a metric of 1,
making it effectively a hop count.
</t>
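<t>
As an illustration of the metric requirement above, the following sketch encodes a 4-octet
IGP metric TLV with the hop-count default of 1 and rejects a link whose attributes lack it.
The function names and the dictionary of parsed attribute TLVs (type to value octets) are
hypothetical, and treating a metric value that is not 4 octets long as malformed is an
assumption of the sketch.
</t>
<sourcecode type="python"><![CDATA[
```python
import struct
from typing import Dict

IGP_METRIC_TLV_TYPE = 1095


def encode_igp_metric(metric: int = 1) -> bytes:
    """Pack Type (2 octets), Length (2 octets, 4), and a 4-octet metric."""
    return struct.pack("!HHI", IGP_METRIC_TLV_TYPE, 4, metric)


def link_metric(attr_tlvs: Dict[int, bytes]) -> int:
    """Return the link metric, or raise if the Link NLRI must be
    considered malformed."""
    value = attr_tlvs.get(IGP_METRIC_TLV_TYPE)
    if value is None or len(value) != 4:
        raise ValueError("malformed Link NLRI: missing or invalid IGP metric")
    return struct.unpack("!I", value)[0]
```
]]></sourcecode>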
<t>
The usage of other link attribute TLVs is beyond the scope of this document.
</t>
<section anchor="af-link-descriptor-tlv" numbered="true" toc="default">
<name>BGP-LS Link NLRI Address Family Link Descriptor TLV</name>
<t>
For unnumbered links, the address family cannot be ascertained from the
endpoint link descriptors. Hence, the Address Family (AF) Link Descriptor SHOULD
be included with the Link Local/Remote Identifiers TLV for unnumbered links,
so that the link can be used in the respective address family SPF. If the
Address Family Link
Descriptor is not present for an unnumbered link, the link will not be
used in the SPF computation for either address family. If the Address Family Link
Descriptor is present for a numbered link, the link descriptor
will be ignored. If the Address Family Link Descriptor TLV contains an
undefined value (3-254), the link descriptor will be ignored.
If the Address Family Link Descriptor TLV contains a
reserved value (0 or 255) the TLV is considered malformed and
is handled as described in <xref target="new-TLVs"/>.
</t>
<t>
Note that an unnumbered link can be used for both the IPv4 and IPv6
SPF computation by advertising separate Address Family Link Descriptor
TLVs for IPv4 and IPv6.
</t>
<artwork align="left" name="" type="" alt=""><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type (1185) | Length (1 Octet) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Address Family|
+-+-+-+-+-+-+-+-+
Address Family Values: 0 - Reserved
1 - IPv4 Address Family
2 - IPv6 Address Family
3-254 - Undefined
255 - Reserved
]]></artwork>
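<t>
The address family rules for unnumbered links can be summarized with a short sketch. The
function name and the string labels are hypothetical. The input is assumed to be the list of
Address Family values carried in the Address Family Link Descriptor TLVs of one unnumbered link.
</t>
<sourcecode type="python"><![CDATA[
```python
from typing import Iterable, Set


def unnumbered_link_afs(af_values: Iterable[int]) -> Set[str]:
    """Return the address families whose SPF may use an unnumbered link."""
    afs: Set[str] = set()
    for value in af_values:
        if value in (0, 255):
            raise ValueError("malformed: reserved Address Family value")
        if value == 1:
            afs.add("ipv4")
        elif value == 2:
            afs.add("ipv6")
        # 3-254 are undefined: the descriptor is ignored
    return afs  # empty set: link is not used for either address family
```
]]></sourcecode>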
</section>
<section anchor="link-status-tlv" numbered="true" toc="default">
<name>BGP-LS-SPF Link NLRI Attribute SPF Status TLV</name>
<t>
This BGP-LS-SPF Attribute TLV of the BGP-LS-SPF Link NLRI is defined to
indicate the status of the link with respect to the BGP SPF calculation. This is used to expedite
convergence for link failures as discussed in <xref target="failure-converge" format="default"/>. If the
SPF Status TLV is not included with the Link NLRI, the link is considered
up and available. The SPF status is acted upon with the execution of the
next SPF calculation <xref target="BGP-SPF" format="default"/>.
A single TLV type is shared by the Node, Link, and Prefix NLRI.
The TLV type is 1184.
</t>
<artwork align="left" name="" type="" alt=""><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type (1184) | Length (1 Octet) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SPF Status |
+-+-+-+-+-+-+-+-+
SPF Status Values: 0 - Reserved
1 - Link Unreachable with respect to BGP SPF
2-254 - Undefined
255 - Reserved
]]></artwork>
<t>
If a BGP speaker receives the Link NLRI without
the SPF Status TLV, then any previously received SPF status information is
considered as implicitly withdrawn and the NLRI is propagated to other BGP speakers.
If a BGP speaker receives a BGP Update containing
an SPF Status TLV in the BGP-LS attribute <xref target="RFC9552" format="default"/>
with an unknown value, the SPF Status TLV SHOULD be advertised to other
BGP speakers and MUST be ignored in
the SPF computation.
An implementation MAY log this information for further analysis.
If the SPF Status TLV contains a reserved value (0 or 255), the TLV is considered malformed and
is handled as described in <xref target="new-TLVs"/>.
</t>
</section>
</section>
<section anchor="Prefix-NLRI" numbered="true" toc="default">
<name>IPv4/IPv6 Prefix NLRI Usage</name>
<t>
IPv4/IPv6 Prefix NLRI is advertised with a Local Node Descriptor and
the prefix and length. The Prefix Descriptors field includes the IP Reachability
Information TLV (TLV 265) as described in <xref target="RFC9552" format="default"/>.
The Prefix Metric TLV (TLV 1155) MUST be advertised to be considered for route calculation.
The IGP Route Tag TLV (TLV 1153) MAY be advertised. The usage of other BGP-LS
attribute TLVs is beyond the scope of this document.
</t>
<section anchor="prefix-status-tlv" numbered="true" toc="default">
<name>BGP-LS-SPF Prefix NLRI Attribute SPF Status TLV</name>
<t>
A BGP-LS Attribute SPF Status TLV of the BGP-LS-SPF Prefix NLRI is defined to indicate the status of
the prefix with respect to the BGP SPF calculation. This is used to expedite
convergence for prefix unreachability as discussed in <xref target="failure-converge" format="default"/>.
If the SPF Status TLV is not included with the Prefix NLRI, the prefix is considered
reachable.
A single TLV type is shared by the Node, Link, and Prefix NLRI.
The TLV type is 1184.
</t>
<artwork align="left" name="" type="" alt=""><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type (1184) | Length (1 Octet) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SPF Status |
+-+-+-+-+-+-+-+-+
SPF Status Values: 0 - Reserved
1 - Prefix Unreachable with respect to BGP SPF
2-254 - Undefined
255 - Reserved
]]></artwork>
<t>
If a BGP speaker receives the Prefix NLRI without
the SPF Status TLV, then any previously received SPF status information is
considered as implicitly withdrawn and the NLRI is propagated to other BGP speakers.
If a BGP speaker receives a BGP Update containing
an SPF Status TLV in the BGP-LS attribute <xref target="RFC9552" format="default"/>
with an unknown value, the SPF Status TLV SHOULD be advertised to other
BGP speakers and MUST be ignored in
the SPF computation.
An implementation MAY log this information for further analysis.
If the SPF Status TLV contains a reserved value (0 or 255), the TLV is considered malformed and
is handled as described in <xref target="new-TLVs"/>.
</t>
</section>
</section>
<section anchor="sequence-number-tlv" numbered="true" toc="default">
<name>BGP-LS Attribute Sequence Number TLV</name>
<t>
A BGP-LS Attribute Sequence Number TLV of the BGP-LS-SPF NLRI types is defined to ensure that the most
recent version of a given NLRI is used in the SPF computation. The Sequence Number TLV is
mandatory for BGP-LS-SPF NLRI.
The TLV type 1181 has been assigned by IANA. The BGP-LS
Attribute Sequence Number TLV contains an 8-octet sequence number. The usage of the
Sequence Number TLV is described in <xref target="Phase-1" format="default"/>.
</t>
<artwork align="left" name="" type="" alt=""><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type (1181) | Length (8 Octets) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (High-Order 32 Bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Low-Order 32 Bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
<t>
Sequence Number:
The 64-bit strictly-increasing sequence number MUST be incremented for every
self-originated version of a BGP-LS-SPF NLRI. BGP speakers implementing this specification
MUST use available mechanisms to preserve the sequence number's strictly increasing property
for the deployed life of the BGP speaker (including cold restarts).
One mechanism for accomplishing this would be to use the high-order 32 bits of the
sequence number as a wrap/boot count that is incremented any time the BGP router
loses its sequence number state or the low-order 32 bits wrap.
</t>
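<t>
The wrap/boot count mechanism mentioned above could look roughly like the following sketch.
The class and its method names are hypothetical. Persisting the boot count across restarts and
handling overflow of the 32-bit boot count itself are left out and would be needed in practice.
</t>
<sourcecode type="python"><![CDATA[
```python
import struct

SEQUENCE_NUMBER_TLV_TYPE = 1181
LOW_MAX = 0xFFFFFFFF


class SequenceNumber:
    """64-bit sequence number: high 32 bits boot/wrap count, low 32 bits counter."""

    def __init__(self, boot_count: int, low: int = 0) -> None:
        # boot_count is assumed to be loaded from non-volatile storage and
        # incremented by the caller whenever sequence number state was lost.
        self.boot_count = boot_count
        self.low = low

    @property
    def value(self) -> int:
        return (self.boot_count << 32) | self.low

    def advance(self) -> int:
        """Advance to the next value, keeping it strictly increasing."""
        if self.low == LOW_MAX:
            self.boot_count += 1  # would also be saved to non-volatile storage
            self.low = 0
        else:
            self.low += 1
        return self.value

    def encode(self) -> bytes:
        """Pack Type (2 octets), Length (2 octets, 8), and the 8-octet value."""
        return struct.pack("!HHQ", SEQUENCE_NUMBER_TLV_TYPE, 8, self.value)
```
]]></sourcecode>
<t>
Because the low-order counter resets to 0 only when the high-order count is incremented, every
value produced is larger than the previous one, including across a wrap of the low-order 32 bits.
</t>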
<t>
When incrementing the sequence number for each self-originated NLRI,
the sequence number should be treated as an unsigned 64-bit
value. If the lower-order 32-bit value wraps, the higher-order 32-bit value should
be incremented and saved in non-volatile storage. If a BGP speaker completely
loses its sequence number state (e.g., the BGP speaker hardware
is replaced or experiences a cold-start), the BGP NLRI selection rules
(see <xref target="Phase-1" format="default"/>) ensure convergence, albeit not immediately.
</t>
<t>
If the Sequence Number TLV
is not received, then the corresponding NLRI is considered malformed and
MUST be handled as 'Treat-as-withdraw'. An implementation SHOULD log an error for
further analysis.
</t>
</section>
</section>
<section anchor="BGP-LS-SPF-EOR" numbered="true" toc="default">
<name>BGP-LS-SPF End of RIB (EoR) Marker</name>
<t>
The usage of the End-of-RIB (EoR) Marker <xref target="RFC4724"/> with the BGP-LS-SPF
SAFI differs somewhat from its usage with other BGP SAFIs. Reception of the EoR
marker MAY be expected prior to advertising a Link NLRI for a given peer.
</t>
</section>
<section anchor="NEXT-HOP" numbered="true" toc="default">
<name>BGP Next-Hop Information</name>
<t>
The rules for setting the BGP Next-Hop in the MP_REACH_NLRI attribute <xref target="RFC4760"/>
for the BGP-LS-SPF SAFI follow the rules in Section 5.5 of <xref target="RFC9552" format="default"/>.
All BGP peers that support SPF extensions will locally compute the Local-RIB Next-Hop
as a result of the SPF process. Hence, the use of the MP_REACH_NLRI Next-Hop as a tiebreaker in the
standard BGP path decision processing is not applicable.
</t>
</section>
</section>
<section anchor="bgp-decision" numbered="true" toc="default">
<name>Decision Process with SPF Algorithm</name>
<t>
The Decision Process described in <xref target="RFC4271" format="default"/> takes place in
three distinct phases. The Phase 1 decision function of the Decision Process is
responsible for calculating the degree
of preference for each route received from a BGP speaker's peer. The Phase 2 decision
function is invoked on completion of the Phase 1 decision function and is responsible
for choosing the best route out of all those available for each
distinct destination, and for installing each chosen route into the Local-RIB.
The combination of the Phase 1 and 2 decision functions is characterized as
a Path Vector algorithm.
</t>
<t>
The SPF-based Decision Process replaces the BGP Decision Process described in
<xref target="RFC4271" format="default"/>.
Since BGP-LS-SPF NLRI always contains the local node descriptor as described in
<xref target="NLRI-Use" format="default"/>, each NLRI is uniquely originated by a single
BGP speaker in the BGP SPF routing domain (the BGP node matching the NLRI's Node
Descriptors). Instances of the same NLRI originated by multiple BGP speakers would be
indicative of a configuration error or a masquerading attack
(refer to <xref target="Security" format="default"/>).
These selected Node NLRI and their Link/Prefix NLRI are used to build a directed
graph during the SPF computation as described below. The best routes for BGP prefixes
are installed in the RIB as a result of the SPF process.
</t>
<t>
When BGP-LS-SPF NLRI is received, all that is required is to determine
whether it is the most recent by examining the Node-ID and sequence number as described
in <xref target="Phase-1" format="default"/>. If the received NLRI has changed, it is advertised
to other BGP-LS-SPF peers. If the attributes have changed (other than the sequence number),
a BGP SPF calculation is triggered. However, a changed NLRI MAY be advertised immediately
to other peers and prior to any SPF calculation. Note that the BGP
MinASOriginationIntervalTimer <xref target="RFC4271" format="default"/> is not applicable
to the BGP-LS-SPF SAFI. The MinRouteAdvertisementIntervalTimer is applicable with a suggested default
of 5 seconds, consistent with Internal BGP (IBGP) (refer to Section 10 of <xref target="RFC4271"/>).
The scheduling of the SPF calculation, as described in
<xref target="BGP-SPF" format="default"/>, is an implementation and/or configuration matter.
Scheduling MAY be dampened consistent with the SPF back-off algorithm
specified in <xref target="RFC8405" format="default"/>.
</t>
<t>
The Phase 3 decision function
of the Decision Process <xref target="RFC4271" format="default"/> is also simplified since under
normal SPF operation, a BGP speaker MUST advertise the changed NLRIs
to all BGP peers with the BGP-LS-SPF AFI/SAFI and install the changed routes in
the GLOBAL-RIB. The only exceptions are unchanged
NLRIs and stale NLRIs, i.e., NLRIs received with a less recent (numerically smaller)
sequence number.
</t>
<section anchor="Phase-1" numbered="true" toc="default">
<name>BGP SPF NLRI Selection</name>
<t>
For all BGP-LS-SPF NLRIs, the selection rules for Phase 1 of the BGP
Decision Process (Section 9.1.1 of <xref target="RFC4271" format="default"/>) no longer apply.
Instead, the following rules are used:
</t>
<ol spacing="normal" type="1"><li>
NLRI self-originated by a directly connected BGP SPF peer is preferred.
This condition can be determined by comparing the BGP Identifiers in
the received Local Node Descriptor and the BGP OPEN message for an active
BGP session. This rule assures that stale NLRI is updated even if a BGP SPF router
loses its sequence number state due to a cold-start. Note that once the BGP session
goes down, the NLRI received is no longer considered as being from a directly
connected BGP SPF peer.
</li>
<li>
Consistent with base BGP <xref target="RFC4271"/>, NLRI received from a peer will always
replace the same NLRI received from that peer. Coupled with rule #1, this will ensure that
any stale NLRI in the BGP SPF routing domain will be updated.
</li>
<li>
The NLRI with the most recent Sequence Number TLV, i.e., the highest sequence number, is selected.
</li>
<li>
The NLRI received from the BGP speaker with the numerically larger BGP
Identifier is preferred.
</li>
</ol>
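<t>
The four selection rules above can be read as a pairwise preference between two received
instances of the same NLRI. The sketch below is one possible reading and not a definitive
implementation: the record type, its field names, and the function names are hypothetical, and
BGP Identifiers are represented as plain integers.
</t>
<sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceivedNlri:
    originator_id: int  # BGP Identifier in the NLRI's Local Node Descriptor
    peer_id: int        # BGP Identifier from the sending peer's OPEN message
    sequence: int       # value of the Sequence Number TLV
    session_up: bool = True


def from_direct_originator(nlri: ReceivedNlri) -> bool:
    """Rule 1 condition: received over an active session from its originator."""
    return nlri.session_up and nlri.originator_id == nlri.peer_id


def prefer(current: ReceivedNlri, candidate: ReceivedNlri) -> ReceivedNlri:
    """Return the preferred instance of the same NLRI."""
    cur_direct = from_direct_originator(current)
    cand_direct = from_direct_originator(candidate)
    if cur_direct != cand_direct:               # rule 1
        return candidate if cand_direct else current
    if candidate.peer_id == current.peer_id:    # rule 2
        return candidate
    if candidate.sequence != current.sequence:  # rule 3
        return candidate if candidate.sequence > current.sequence else current
    # rule 4: numerically larger BGP Identifier of the advertising speaker
    return candidate if candidate.peer_id > current.peer_id else current
```
]]></sourcecode>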
<t>
When a BGP speaker completely loses its sequence number state, e.g., due to a cold start, or
in the unlikely event that the 64-bit sequence number wraps, the BGP routing domain will
still converge. This is because BGP speakers adjacent to the router
always accept self-originated NLRI from the associated speaker as more recent (rule #1). When a
BGP speaker reestablishes a connection with its peers, any existing sessions are taken
down and stale NLRI is replaced. The adjacent BGP speakers update their NLRI
advertisements and advertise to their neighbors until the BGP routing domain has converged.
</t>
<t>
The modified SPF Decision Process performs an SPF calculation rooted at the local BGP
speaker using the metrics from the Link Attribute IGP Metric TLV (1095) and
the Prefix Attribute Prefix Metric TLV (1155) <xref target="RFC9552" format="default"/>.
These metrics are considered consistently across the BGP SPF domain.
As a result, any other BGP attributes that
would influence the BGP decision process defined in <xref target="RFC4271" format="default"/> including
ORIGIN, MULTI_EXIT_DISC, and
LOCAL_PREF attributes are ignored by the SPF algorithm. The Next Hop in the MP_REACH_NLRI attribute
<xref target="RFC4760"/> is discussed in <xref target="NEXT-HOP" format="default"/>.
The AS_PATH and AS4_PATH <xref target="RFC6793" format="default"/> attributes
are preserved and used for loop detection <xref target="RFC4271" format="default"/>. They are ignored
during the SPF computation for BGP-LS-SPF NLRIs.
</t>
<section anchor="Self-Origin" numbered="true" toc="default">
<name>BGP Self-Originated NLRI</name>
<t>
Node, Link, or Prefix NLRI with Node Descriptors matching the local BGP speaker are
considered self-originated. When self-originated NLRI is received and it doesn't match the
local node's NLRI content (including sequence number), special processing is required.
</t>
<ul spacing="normal">
<li>
If self-originated NLRI is received and the sequence number is more recent (i.e., greater than
the local node's sequence number for the NLRI), the NLRI sequence number is advanced to
one greater than the received sequence number and the NLRI is readvertised to all peers.
</li>
<li>
If self-originated NLRI is received and the sequence number is the same as the local node's
sequence number but the attributes differ, the NLRI sequence number is advanced to
one greater than the received sequence number and the NLRI is readvertised to all peers.
</li>
<!--
<li>
If self-originated Link or Prefix NLRI is received and the Link or Prefix NLRI is no longer
being advertised by the local node, the NLRI is considered stale and is withdrawn using the
standard BGP Update message Withdrawn Routes encodings <xref target="RFC4760"/>.
</li>
-->
</ul>
<t>
The above actions are performed immediately when the first instance of a newer self-originated NLRI is
received. In this case, the newer instance is considered to be a stale instance that was advertised by
the local node prior to a restart where the NLRI state was lost.
However, if subsequent newer self-originated
NLRI is received for the same Node, Link, or Prefix NLRI, the readvertisement
or withdrawal is delayed by BGP_LS_SPF_SELF_READVERTISEMENT_DELAY (default 5) seconds
since it is likely being advertised by a misconfigured or rogue BGP speaker
(refer to <xref target="Security" format="default"/>).
</t>
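<t>
The self-originated NLRI processing described in this section can be sketched as follows. The
class and method names are hypothetical. The sketch assumes the sequence number is advanced as
soon as the conflicting instance is received, with only the readvertisement being delayed, and
it models the delay as a returned number of seconds rather than an actual timer.
</t>
<sourcecode type="python"><![CDATA[
```python
from typing import Any, Optional

BGP_LS_SPF_SELF_READVERTISEMENT_DELAY = 5.0  # seconds (default)


class SelfOriginatedNlri:
    """Local state for one self-originated Node, Link, or Prefix NLRI."""

    def __init__(self, sequence: int, attrs: Any) -> None:
        self.sequence = sequence
        self.attrs = attrs
        self.seen_newer = False  # has a newer instance been received before?

    def on_receive(self, rx_sequence: int, rx_attrs: Any) -> Optional[float]:
        """Return the delay before readvertising, or None if no action."""
        newer = rx_sequence > self.sequence
        conflicting = rx_sequence == self.sequence and rx_attrs != self.attrs
        if not (newer or conflicting):
            return None
        # Advance to one greater than the received sequence number.
        self.sequence = rx_sequence + 1
        # First occurrence: act immediately. Later ones: delay, since they
        # likely come from a misconfigured or rogue BGP speaker.
        delay = BGP_LS_SPF_SELF_READVERTISEMENT_DELAY if self.seen_newer else 0.0
        self.seen_newer = True
        return delay
```
]]></sourcecode>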
</section>
</section>