MPLS in a Full Mesh Core with Dual Homed CE Routers
We’re making a fairly substantial change to our core network in the New Year. It’s seeing us move from a flat, L3 routed network to an MPLS fabric. This option was chosen over VRF lite in order to allow for L2 VPNs for a later convergence project. Ideally, we’d be stretching that fabric down to the distribution layer (something we definitely will be doing in the future) but current equipment limitations prevent this. The joy of still having Brocade SuperX units in live service.
Anyhow, a simplified version of our network looks a bit like this:
R1 through R4 are our core routers in this example. They’ll be making up the MPLS fabric. All of the other routers (R9 through R12) are our example distribution routers. They are connected with 2 L3 point-to-point links to two different cores. In the real world, this is many more routers connected to every possible pairing into the core mesh.
The usual way of doing this would be to have the distribution routers act as PE routers, terminating the L3 VPNs there. For now, we’ve got to do it at the core layer.
The original idea was to use OSPF to pass the distribution layer routes from the VRF instances back to the BGP core. While it works as a concept, there is a catch with it. Due to how administrative distances work, OSPF will always win against BGP. With our mesh of distribution routers split across the cores, this results in “better” routes along the distribution routers. Traffic would never properly traverse the core.
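To make the administrative distance problem concrete, here is a toy Python model of how a router picks between two sources for the same prefix. It is only a sketch: the distances are the Cisco defaults (the same 110 and 200 that show up in the routing tables later in this post), while the `Route` class, the `best_route` helper and the example next hops are made up for illustration.

```python
from dataclasses import dataclass

# Default Cisco administrative distances. Lower is preferred.
ADMIN_DISTANCE = {"connected": 0, "static": 1, "ospf": 110, "ibgp": 200}


@dataclass(frozen=True)
class Route:
    prefix: str
    protocol: str
    next_hop: str


def best_route(candidates):
    """Return the candidate with the lowest administrative distance."""
    return min(candidates, key=lambda r: ADMIN_DISTANCE[r.protocol])


# Hypothetical candidates for one remote user subnet, as seen in a core VRF.
candidates = [
    # Learned over OSPF from a dual-homed distribution router (the backdoor).
    Route("10.2.112.0/24", "ospf", "10.2.104.10"),
    # Learned over MP-BGP from another core (placeholder next hop).
    Route("10.2.112.0/24", "ibgp", "remote-core-loopback"),
]

winner = best_route(candidates)
print(winner.protocol, winner.next_hop)
```

With both candidates present, the OSPF path via the distribution router is installed and the iBGP path across the core is ignored, which is exactly the backdoor behaviour described above.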
Simply adjusting the administrative distances so the MP-BGP routes win out wouldn’t help. The path to the distribution router would never be used. The traffic would simply loop round the core or be dropped.
The solution to this problem: route maps. In our case it’ll end up with some slightly odd-looking results, but it solves our backdoor route problem.
We start by issuing every distribution router a tag number. In this small example, we’ve gone with the numbers of the routers. To apply this to the locally created routes, we add the following configuration:
router ospf 1
 redistribute connected subnets tag 9
The bit of cleverness in this configuration is that we don’t make the user facing interfaces OSPF interfaces. Instead we re-distribute them in and configure the router to add the tag. That does mean our point-to-point links won’t be picked up but they’ll be collected at the core end by redistributing connected subnets there.
On the core, our configuration is a bit more complicated:
router bgp 65101
 bgp log-neighbor-changes
 redistribute static
 neighbor 126.96.36.199 remote-as 65101
 neighbor 188.8.131.52 update-source Loopback1
 neighbor 184.108.40.206 remote-as 65101
 neighbor 220.127.116.11 update-source Loopback1
 neighbor 18.104.22.168 remote-as 65101
 neighbor 22.214.171.124 update-source Loopback1
 !
 address-family vpnv4
  neighbor 126.96.36.199 activate
  neighbor 188.8.131.52 send-community extended
  neighbor 184.108.40.206 activate
  neighbor 220.127.116.11 send-community extended
  neighbor 18.104.22.168 activate
  neighbor 22.214.171.124 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf Demo
  redistribute connected
  redistribute ospf 2 route-map INBOUND
 exit-address-family
router ospf 2 vrf Demo
 redistribute bgp 65101 subnets
 distribute-list route-map INBOUND in
route-map INBOUND permit 10
 match tag 9 12
route-map INBOUND deny 20
Ok, that’s a lot to take in. So, let’s break it down a bit. Looking at the BGP configuration, we re-distribute the OSPF routes but only if they match our INBOUND route map. That route map looks for the tags specific to the distribution routers directly connected to said core.
The OSPF section is the most important part of this configuration. The critical bit is that while we receive the full OSPF database, we only allow routes from the “trusted” downstream routers to enter the routing table. This means that for anything not local, BGP wins out. We still maintain the ability to dynamically light up user subnets without creating the alternate path through the distribution routers.
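The effect of the inbound filter can be sketched with another toy Python model. Again, this is an illustration rather than how IOS actually implements it: the `install` helper and the tuple layout are invented, the tags 10 and 11 are assumed to be the ones a core attached to R10 and R11 would permit, and `other-core` is a placeholder next hop.

```python
# Default Cisco administrative distances. Lower is preferred.
AD = {"ospf": 110, "ibgp": 200}


def install(candidates, permitted_tags):
    """Pick the route to install for one prefix.

    candidates: list of (protocol, tag, next_hop) tuples; tag is None for BGP.
    permitted_tags: tags the inbound route map permits on this core.
    """
    usable = [
        c for c in candidates
        # OSPF routes with any other tag hit the final deny clause and
        # never reach the routing table; BGP routes are unaffected.
        if c[0] != "ospf" or c[1] in permitted_tags
    ]
    return min(usable, key=lambda c: AD[c[0]])


# Assumed: this core's directly attached distribution routers use tags 10 and 11.
permitted = {10, 11}

# A subnet behind a directly attached distribution router (tag 10).
local = install(
    [("ospf", 10, "10.2.104.10"), ("ibgp", None, "other-core")], permitted
)

# A subnet behind a router on another core (tag 12), also heard via the backdoor.
remote = install(
    [("ospf", 12, "10.2.104.10"), ("ibgp", None, "other-core")], permitted
)

print(local)
print(remote)
```

The locally attached subnet still resolves over OSPF, while the remote one falls through to the iBGP route because its OSPF candidate is filtered before the administrative distance comparison ever happens.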
Our routing table does look a bit odd though. Let’s look at an example distribution switch:
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is 10.2.104.4 to network 0.0.0.0

S*    0.0.0.0/0 [1/0] via 10.2.104.4
                [1/0] via 10.2.103.3
      126.96.36.199/32 is subnetted, 1 subnets
O E2     188.8.131.52 [110/20] via 10.2.104.4, 00:16:53, Ethernet1/1
      10.0.0.0/8 is variably subnetted, 12 subnets, 2 masks
O IA     10.2.16.0/24 [110/11] via 10.2.104.4, 00:16:53, Ethernet1/1
                      [110/11] via 10.2.103.3, 00:16:54, Ethernet1/0
O        10.2.19.0/24 [110/21] via 10.2.103.3, 00:18:06, Ethernet1/0
O        10.2.39.0/24 [110/20] via 10.2.103.3, 00:18:06, Ethernet1/0
C        10.2.103.0/24 is directly connected, Ethernet1/0
L        10.2.103.10/32 is directly connected, Ethernet1/0
C        10.2.104.0/24 is directly connected, Ethernet1/1
L        10.2.104.10/32 is directly connected, Ethernet1/1
O        10.2.112.0/24 [110/31] via 10.2.103.3, 00:17:43, Ethernet1/0
O        10.2.114.0/24 [110/20] via 10.2.104.4, 00:17:43, Ethernet1/1
O        10.2.211.0/24 [110/30] via 10.2.104.4, 00:17:33, Ethernet1/1
O        10.2.212.0/24 [110/40] via 10.2.104.4, 00:17:33, Ethernet1/1
C        10.10.10.10/32 is directly connected, Loopback0
      184.108.40.206/32 is subnetted, 1 subnets
O E2     220.127.116.11 [110/20] via 10.2.103.3, 00:16:54, Ethernet1/0
      18.104.22.168/32 is subnetted, 1 subnets
O E2     22.214.171.124 [110/20] via 10.2.104.4, 00:16:53, Ethernet1/1
                        [110/20] via 10.2.103.3, 00:16:54, Ethernet1/0
Hang on, it’s still got all those round-the-houses paths in it! However, the distribution router is only connected to the core routers, so any traffic it sends will hit the following routing table on the core:
Routing Table: Demo
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       + - replicated route, % - next hop override

Gateway of last resort is not set

      126.96.36.199/32 is subnetted, 1 subnets
B        188.8.131.52 [200/20] via 184.108.40.206, 00:18:14
      10.0.0.0/8 is variably subnetted, 12 subnets, 2 masks
B        10.2.16.0/24 [200/0] via 220.127.116.11, 00:18:14
B        10.2.19.0/24 [200/0] via 18.104.22.168, 00:18:14
B        10.2.39.0/24 [200/0] via 22.214.171.124, 00:18:14
B        10.2.103.0/24 [200/0] via 126.96.36.199, 00:18:14
C        10.2.104.0/24 is directly connected, Ethernet1/3
L        10.2.104.4/32 is directly connected, Ethernet1/3
B        10.2.112.0/24 [200/0] via 188.8.131.52, 00:18:14
C        10.2.114.0/24 is directly connected, Ethernet1/4
L        10.2.114.4/32 is directly connected, Ethernet1/4
B        10.2.211.0/24 [200/0] via 184.108.40.206, 00:18:14
B        10.2.212.0/24 [200/0] via 220.127.116.11, 00:18:14
O E2     10.10.10.10/32 [110/20] via 10.2.104.10, 00:19:27, Ethernet1/3
      18.104.22.168/32 is subnetted, 1 subnets
O E2     22.214.171.124 [110/20] via 10.2.114.11, 00:19:04, Ethernet1/4
      126.96.36.199/32 is subnetted, 1 subnets
B        188.8.131.52 [200/20] via 184.108.40.206, 00:18:14
Look at all those lovely BGP routes through our MPLS core that take over. Just to make sure, let’s do a traceroute:
Type escape sequence to abort.
Tracing the route to 220.127.116.11
VRF info: (vrf in name/id, vrf out name/id)
  1 10.2.103.3 196 msec
    10.2.104.4 208 msec
    10.2.103.3 132 msec
  2 10.2.112.1 [MPLS: Label 23 Exp 0] 276 msec 404 msec 308 msec
  3 10.2.112.12 488 msec 352 msec 532 msec
Our traffic from R10 to R12 passes through R3/4, then R2 and on to R12. It gets labelled along the inter-core link, which is exactly what we’re aiming for.
While this configuration is a little complicated, it’s the best we can do for now. It’s also been an eye-opener with regard to the existing infrastructure. We’ve learned it’s possible for traffic to route round campus while completely missing the inter-core links. Not a likely situation in any case, but one to look out for.
My next challenge is to take what I’ve got in GNS3 and translate it to Brocade. That should be the easy bit…