CSC, or Carrier Supporting Carrier, takes inter-AS L3 VPN to the next level. Let’s say that you are an ISP and you are offering L3VPN MPLS services to your customers in England. You take over another ISP located in say, Australia, and two of your UK customers are also located in Australia. They would like their offices in both locations connected over the MPLS cloud.
It would be very expensive to run a new line between those locations. You do, however, still want to provide L3VPN services to your customers. This is where CSC comes in. CSC allows another ISP to connect both sides of your ISP network together. It also ensures that the core ISP doesn’t learn about any of your customer prefixes, as it doesn’t need to.
Let’s take the following diagram into consideration.

Routers 1, 7, 14, 6, 9, and 8 are all part of ISP100. There are three routers located in each geographical location. Routers 2, 3, 13, and 5 are part of the core carrier (ISP500). This network stretches to both locations. The rest are customer routers injecting their loopback addresses into OSPF to test.
The core carrier is running IS-IS and LDP. ISP100 is running OSPF and LDP. I won’t go into the regular IGP+LDP config as it is pretty straightforward.
With CSC, there are a few new terms to deal with. R14 and R8 are going to be regular PE routers for our ISP. R1 and R6 are going to be called CSC-CE (Carrier Supporting Carrier – Customer Edge) routers. R2 and R5 are going to be CSC-PE (Carrier Supporting Carrier – Provider Edge) routers. All other ISP routers are simply core routers.
The terminology above assumes that you are speaking from the point of view of the core carrier (ISP500 in this case). That is, ISP500’s edge routers are ‘PE’ and R1 and R6 are the customer’s (the ISP’s) edge routers, called CE in this case.
It’s a little confusing, depending on which view you take, but it’s really not that difficult.
Initial R14/R8 config – regular PE
The first thing we can start off with is to ensure R14’s PE config is correct. I am running OSPF with the CE routers and learning routes from them:
vrf definition CUS1
 rd 14.14.14.14:1
 route-target export 100:1
 route-target import 100:1
 !
 address-family ipv4
 exit-address-family
!
vrf definition CUS2
 rd 14.14.14.14:2
 route-target export 100:2
 route-target import 100:2
 !
 address-family ipv4
 exit-address-family
!
interface FastEthernet1/1
 vrf forwarding CUS1
 ip address 10.14.16.14 255.255.255.0
 ip ospf network point-to-point
 ip ospf 3 area 0
!
interface FastEthernet2/0
 vrf forwarding CUS2
 ip address 10.14.15.14 255.255.255.0
 ip ospf network point-to-point
 ip ospf 2 area 0
R8 has a similar config on the other side so I won’t put it here.
CSC-PE config
R2 and R5 are going to act as PE routers for the core carrier. They will have their AS100-facing interfaces in a VRF. R2 and R5 will be running regular VPNv4 BGP with each other.

R2:
vrf definition CSC_AS100
 rd 2.2.2.2:500
 route-target export 500:100
 route-target import 500:100
 !
 address-family ipv4
 exit-address-family
!
interface FastEthernet1/0
 vrf forwarding CSC_AS100
 ip address 10.1.2.2 255.255.255.0
!
router bgp 500
 no bgp default ipv4-unicast
 neighbor 5.5.5.5 remote-as 500
 neighbor 5.5.5.5 update-source Loopback0
 !
 address-family vpnv4
  neighbor 5.5.5.5 activate
  neighbor 5.5.5.5 send-community extended
 exit-address-family
Once again, R5 has a similar config on the other side so I’m not putting it here.
CSC-CE
Eventually we need R14 and R8 to peer with each other via VPNv4.
This is much like option C in the inter-AS config, in which two PE routers peer with each other even though they are not directly connected in the same AS. In order to do so we need each of them to have a valid route to the other. As R1 and R6 each have routes to their local PE device, they need to learn routes from the other side through the core carrier. To do this I’ll be running BGP to the core carrier. I also need to ensure that I’m sending and receiving labeled BGP routes, as the LSP has to be end to end. No part of the path can be unlabelled. I need to make sure the PE loopbacks (R14 and R8) are advertised over to the core carrier:
interface FastEthernet1/0
 ip address 10.1.2.1 255.255.255.0
 mpls bgp forwarding
!
router bgp 100
 bgp log-neighbor-changes
 neighbor 10.1.2.2 remote-as 500
 !
 address-family ipv4
  network 14.14.14.14 mask 255.255.255.255
  neighbor 10.1.2.2 activate
  neighbor 10.1.2.2 allowas-in 1
  neighbor 10.1.2.2 send-label
  no auto-summary
 exit-address-family
I’ll need to allowas-in 1 as both sides are running the same AS number. Without it, the CSC-CE routers would reject the BGP update.
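To make the allowas-in behaviour a bit more concrete, here is a rough Python sketch of the AS-path loop check. This is a simplified model of my own (the function name and arguments are made up for illustration), not how IOS implements it internally:

```python
def accepts_update(as_path, local_as, allowas_in=0):
    """Return True if a received update passes the AS-path loop check.

    as_path    -- list of AS numbers in the received AS_PATH
    local_as   -- the receiving router's own AS number
    allowas_in -- how many occurrences of local_as to tolerate (0 = default)
    """
    return as_path.count(local_as) <= allowas_in


# R1 (AS100) receives the remote PE loopback via the core carrier (AS500).
# The route was originated by the far side of AS100, so AS100 is in the path.
path_from_core = [500, 100]

print(accepts_update(path_from_core, local_as=100))                # default loop check
print(accepts_update(path_from_core, local_as=100, allowas_in=1))  # with allowas-in 1
```

With the default check the update is dropped because the router sees its own AS in the path. Allowing one occurrence lets it through.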
I then need to ensure the CSC-CE routers are redistributing those learned prefixes into the IGP:
router ospf 1
 redistribute bgp 100 subnets
R6 again has a similar config.
The end result of it so far is that R8 and R14 should now be able to ping each other from their respective loopbacks:
R8#ping 14.14.14.14 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 14.14.14.14, timeout is 2 seconds:
Packet sent with a source address of 8.8.8.8
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/45/56 ms
R8#traceroute 14.14.14.14 so lo0
Type escape sequence to abort.
Tracing the route to 14.14.14.14
1 10.8.9.9 [MPLS: Label 24 Exp 0] 56 msec 44 msec 32 msec
2 10.6.9.6 [MPLS: Label 24 Exp 0] 60 msec 24 msec 60 msec
3 10.5.6.5 [MPLS: Label 16 Exp 0] 44 msec 44 msec 48 msec
4 10.5.13.13 [MPLS: Labels 20/34 Exp 0] 48 msec 44 msec 44 msec
5 10.3.13.3 [MPLS: Labels 19/34 Exp 0] 44 msec 44 msec 40 msec
6 10.1.2.2 [MPLS: Label 34 Exp 0] 40 msec 44 msec 44 msec
7 10.1.2.1 [MPLS: Label 19 Exp 0] 44 msec 44 msec 28 msec
8 10.1.7.7 [MPLS: Label 18 Exp 0] 56 msec 32 msec 12 msec
9 10.7.14.14 52 msec * 72 msec
Which they can.
PE config – continued
Now that the PE routers have connectivity to each other, we can set up the VPNv4 BGP session:
router bgp 100
 no bgp default ipv4-unicast
 neighbor 8.8.8.8 remote-as 100
 neighbor 8.8.8.8 update-source Loopback0
 !
 address-family vpnv4
  neighbor 8.8.8.8 activate
  neighbor 8.8.8.8 send-community extended
 exit-address-family
Is the session up?
R14#show bgp vpnv4 unicast all summary
BGP router identifier 14.14.14.14, local AS number 100
BGP table version is 1, main routing table version 1
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
8.8.8.8 4 100 4 4 1 0 0 00:00:50 0
Yes, but no prefixes learnt. We still need to redistribute between our vrf aware OSPF processes and BGP:
router ospf 3 vrf CUS1
 redistribute bgp 100 subnets
!
router ospf 2 vrf CUS2
 redistribute bgp 100 subnets
!
router bgp 100
 !
 address-family ipv4 vrf CUS1
  redistribute ospf 3 vrf CUS1
 exit-address-family
 !
 address-family ipv4 vrf CUS2
  redistribute ospf 2 vrf CUS2
 exit-address-family
Verification
As always with these types of configs, we need to ensure both the control and data planes are working correctly. First let’s see the control plane update of R16’s loopback over to R11. The PE router of R14 should be learning this as a vrf prefix:
R14#show ip route vrf CUS1 16.16.16.16
Routing Table: CUS1
Routing entry for 16.16.16.16/32
Known via "ospf 3", distance 110, metric 2, type intra area
Redistributing via bgp 100
Advertised by bgp 100
Last update from 10.14.16.16 on FastEthernet1/1, 00:18:35 ago
Routing Descriptor Blocks:
* 10.14.16.16, from 16.16.16.16, 00:18:35 ago, via FastEthernet1/1
Route metric is 2, traffic share count is 1
This prefix is converted into a VPNv4 prefix and advertised over to R8:
R8#show bgp vpnv4 un rd 14.14.14.14:1 16.16.16.16
BGP routing table entry for 14.14.14.14:1:16.16.16.16/32, version 3
Paths: (1 available, best #1, no table)
Not advertised to any peer
Local
14.14.14.14 (metric 1) from 14.14.14.14 (14.14.14.14)
Origin incomplete, metric 2, localpref 100, valid, internal, best
Extended Community: RT:100:1 OSPF DOMAIN ID:0x0005:0x000000030200
OSPF RT:0.0.0.0:2:0 OSPF ROUTER ID:10.14.16.14:0
mpls labels in/out nolabel/27
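The VPNv4 entry above is simply the route distinguisher prepended to the IPv4 prefix. As a rough sketch (the function names are my own, purely for illustration), this is how the IP:number style RD used on R14 packs into its 8 bytes as a type 1 RD, and how the textual VPNv4 form is built:

```python
import socket
import struct


def encode_type1_rd(rd):
    """Encode an 'IP:number' route distinguisher as 8 bytes (type 1)."""
    ip, number = rd.rsplit(":", 1)
    # 2-byte type field (1), 4-byte IPv4 administrator, 2-byte assigned number
    return struct.pack("!H4sH", 1, socket.inet_aton(ip), int(number))


def vpnv4_prefix(rd, ipv4_prefix):
    """Return the textual VPNv4 form: the RD prepended to the IPv4 prefix."""
    return f"{rd}:{ipv4_prefix}"


rd = "14.14.14.14:1"
print(vpnv4_prefix(rd, "16.16.16.16/32"))  # same form as the BGP table entry
print(encode_type1_rd(rd).hex())
```

The RD only makes the prefix unique inside BGP. It is the route-target (RT:100:1 in the output) that decides which VRF imports it.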
This should end up in the correct vrf table:
R8#show ip route vrf CUS1 16.16.16.16
Routing Table: CUS1
Routing entry for 16.16.16.16/32
Known via "bgp 100", distance 200, metric 2, type internal
Redistributing via ospf 2
Advertised by ospf 2 subnets
Last update from 14.14.14.14 00:07:03 ago
Routing Descriptor Blocks:
* 14.14.14.14 (default), from 14.14.14.14, 00:07:03 ago
Route metric is 2, traffic share count is 1
AS Hops 0
MPLS label: 27
MPLS Flags: MPLS Required
Finally R11 should receive that as an OSPF route:
R11#show ip route 16.16.16.16
Routing entry for 16.16.16.16/32
Known via "ospf 1", distance 110, metric 2
Tag Complete, Path Length == 1, AS 100, , type extern 2, forward metric 1
Last update from 10.8.11.8 on FastEthernet1/0, 00:06:38 ago
Routing Descriptor Blocks:
* 10.8.11.8, from 10.8.11.8, 00:06:38 ago, via FastEthernet1/0
Route metric is 2, traffic share count is 1
Route tag 3489661028
So our control plane is all good so far. Let’s check our data plane forwarding:
R11#ping 16.16.16.16 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 16.16.16.16, timeout is 2 seconds:
Packet sent with a source address of 11.11.11.11
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 44/57/76 ms
R11#traceroute 16.16.16.16 so lo0
Type escape sequence to abort.
Tracing the route to 16.16.16.16
1 10.8.11.8 8 msec 12 msec 12 msec
2 10.8.9.9 [MPLS: Labels 24/27 Exp 0] 52 msec 56 msec 48 msec
3 10.6.9.6 [MPLS: Labels 24/27 Exp 0] 56 msec 48 msec 48 msec
4 10.5.6.5 [MPLS: Labels 16/27 Exp 0] 56 msec 40 msec 68 msec
5 10.5.13.13 [MPLS: Labels 20/34/27 Exp 0] 52 msec 44 msec 52 msec
6 10.3.13.3 [MPLS: Labels 19/34/27 Exp 0] 56 msec 48 msec 40 msec
7 10.2.3.2 [MPLS: Labels 34/27 Exp 0] 52 msec 52 msec 40 msec
8 10.1.2.1 [MPLS: Labels 19/27 Exp 0] 44 msec 44 msec 48 msec
9 10.1.7.7 [MPLS: Labels 18/27 Exp 0] 52 msec 40 msec 60 msec
10 10.14.16.14 [MPLS: Label 27 Exp 0] 40 msec 44 msec 36 msec
11 10.14.16.16 44 msec * 52 msec
No problems there :)
Carrier Supporting Carrier Supporting Carrier
So let’s be silly and go a step further. What if our final customer is actually another ISP offering L3VPN to its customers? Let’s change our topology slightly.

I’m not going to show all the config here as it’s simply too much to fit into a blog post. However the config itself is pretty much like so:

R10 and R15 are our final CE routers. Once all is configured does it all actually work?
R10#traceroute 15.15.15.15 so lo0
Type escape sequence to abort.
Tracing the route to 15.15.15.15
1 10.10.11.11 12 msec 12 msec 8 msec
2 10.8.11.8 [MPLS: Labels 16/17 Exp 0] 72 msec 48 msec 68 msec
3 10.8.9.9 [MPLS: Labels 22/24/17 Exp 0] 36 msec 80 msec 44 msec
4 10.6.9.6 [MPLS: Labels 20/24/17 Exp 0] 52 msec 56 msec 52 msec
5 10.5.6.5 [MPLS: Labels 23/24/17 Exp 0] 64 msec 52 msec 52 msec
6 10.5.13.13 [MPLS: Labels 20/23/24/17 Exp 0] 48 msec 60 msec 52 msec
7 10.3.13.3 [MPLS: Labels 19/23/24/17 Exp 0] 56 msec 56 msec 48 msec
8 10.2.3.2 [MPLS: Labels 23/24/17 Exp 0] 56 msec 44 msec 56 msec
9 10.1.2.1 [MPLS: Labels 19/24/17 Exp 0] 48 msec 48 msec 64 msec
10 10.1.7.7 [MPLS: Labels 16/24/17 Exp 0] 56 msec 52 msec 60 msec
11 10.7.14.14 [MPLS: Labels 24/17 Exp 0] 44 msec 56 msec 56 msec
12 10.15.16.16 [MPLS: Label 17 Exp 0] 56 msec 56 msec 44 msec
13 10.15.16.15 56 msec * 76 msec
It does indeed. At this point we are up to a four label stack in AS500. If we were running RSVP-TE and FRR we would have even more labels sitting on top.
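If you want to check the label stack depth without counting slashes by eye, a small Python sketch like the following would do it. It is only a quick helper of my own (the regex and function name are assumptions based on the IOS traceroute format shown in this post), fed with a few of the hop lines from the traceroute above:

```python
import re

# Matches the label part of an IOS traceroute hop,
# e.g. "[MPLS: Labels 20/23/24/17 Exp 0]" or "[MPLS: Label 17 Exp 0]"
LABELS_RE = re.compile(r"\[MPLS: Labels? ([\d/]+) Exp \d+\]")


def label_depth(hop_line):
    """Return how many labels a traceroute hop line reports (0 if unlabelled)."""
    match = LABELS_RE.search(hop_line)
    return len(match.group(1).split("/")) if match else 0


hops = [
    "1 10.10.11.11 12 msec 12 msec 8 msec",
    "2 10.8.11.8 [MPLS: Labels 16/17 Exp 0] 72 msec 48 msec 68 msec",
    "5 10.5.6.5 [MPLS: Labels 23/24/17 Exp 0] 64 msec 52 msec 52 msec",
    "6 10.5.13.13 [MPLS: Labels 20/23/24/17 Exp 0] 48 msec 60 msec 52 msec",
    "12 10.15.16.16 [MPLS: Label 17 Exp 0] 56 msec 56 msec 44 msec",
]

for line in hops:
    # second field of each hop line is the responding address
    print(line.split()[1], label_depth(line))

print("deepest stack:", max(label_depth(line) for line in hops))
```

The deepest stack is reported by the hops inside the core carrier, which is where the extra transport label sits on top of the others.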
There is a much easier way to do this of course. The original Customer Carrier could just buy some sort of virtual leased line or VPLS from AS500 and they would be directly connected over the same subnet. They could then run MPLS over that link and as far as anyone cares R1 and R6 would be directly connected to each other.
But of course this is a topic on the CCIE SP after all…
I wanted to test inter-vendor MPLS L3VPN compatibility between Brocade, Cisco, and Juniper. The ‘core’ itself will be Junos. In a future post I’ll probably have a random Brocade/Cisco device in the core as well to show how that works. This post will be the basis for a number of future posts on various MPLS applications. I wanted to have the core itself all done so that’s what I’ll crack on with on this post.
I’ll be running RSVP TE tunnels between my PE routers. The core devices will also be running RSVP of course. For this lab I’m just using OSPF as my core IGP.
Let’s use the following topology:
R4 is a Cisco 7200 running advanced IP services version 12.2(33)SRD4
R8 is a Brocade Netiron XMR running 5.4b
All the other routers are M10s running 10.4R12.4
R6, R7, and R5 are my CPE routers – Note that they will not be used for this post, only future posts. R3, R4, and R8 are the PE routers. R1 and R2 are the P routers.
Core
As always with MPLS, the P routers’ config is very minimal. All the core interfaces are configured like so:
interfaces {
fe-0/0/3 {
unit 12 {
vlan-id 12;
family inet {
address 10.0.4.6/30;
}
family mpls;
}
}
}
Family MPLS has to be configured on all core interfaces. My protocols config on R2 is like so:
USER2:R2> show configuration protocols
rsvp {
interface fe-1/0/2.0;
interface fe-0/0/3.12;
interface fe-0/0/3.24;
}
mpls {
interface fe-1/0/2.0;
interface fe-0/0/3.24;
interface fe-0/0/3.12;
}
ospf {
traffic-engineering;
area 0.0.0.0 {
interface all;
}
}
R1 has got a very similar config so I’m not pasting it here.
Junos PE
Family MPLS needs to be configured on the core facing interfaces. The rest of the relevant config for my set up is as follows:
USER3:R3> show configuration protocols
rsvp {
interface fe-1/0/3.13;
}
mpls {
no-cspf;
label-switched-path TO-R4 {
to 4.4.4.4;
primary TO-R4;
}
label-switched-path TO-R8 {
to 8.8.8.8;
primary TO-R8;
}
path TO-R4 {
4.4.4.4 loose;
}
path TO-R8 {
8.8.8.8 loose;
}
interface fe-1/0/3.13;
}
ospf {
traffic-engineering;
area 0.0.0.0 {
interface all;
interface fe-0/0/3.36 {
disable;
}
}
}
I’ve enabled loose paths to the loopbacks of the other two PE routers. OSPF TE is turned on and RSVP and MPLS are enabled on the relevant interfaces.
Brocade Netiron PE
There is no need to configure anything specific on the actual MPLS interfaces for Brocade. You simply need to add the core facing interfaces to the MPLS configuration stanza.
router ospf
 area 0
!
router mpls
 policy
  traffic-eng ospf area 0
 path TO-R3
  loose 3.3.3.3
 path TO-R4
  loose 4.4.4.4
 mpls-interface ve2
 lsp TO-R3
  to 3.3.3.3
  primary TO-R3
  enable
 lsp TO-R4
  to 4.4.4.4
  primary TO-R4
  enable
Cisco IOS PE
With IOS, you need to ensure CEF is enabled. You also need to turn mpls traffic-engineering tunnels on globally, as well as on each core facing interface:
ip cef
!
mpls traffic-eng tunnels
!
interface Tunnel0
 ip unnumbered Loopback0
 tunnel destination 3.3.3.3
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 10 dynamic
!
interface Tunnel1
 ip unnumbered Loopback0
 tunnel destination 8.8.8.8
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 10 dynamic
!
router ospf 1
 router-id 4.4.4.4
 log-adjacency-changes
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0
Verification
The output of the various show commands is of course different on each platform. I’ll be showing the actual LSPs as well as doing some MPLS traceroutes to show what information each platform gives up. The ‘detail’ switch on the LSPs throws out tons of information so I’ll add those in a later post. The main thing we are concerned about now is just to ensure the LSPs are in fact ‘up’.
Brocade:
SSH@XMR_R8#sh mpls lsp
Note: LSPs marked with * are taking a Secondary Path
Admin Oper Tunnel Up/Dn Retry Active
Name To State State Intf Times No. Path
TO-R3 3.3.3.3 UP UP tnl0 1 0 TO-R3
TO-R4 4.4.4.4 UP UP tnl1 2 0 TO-R4
Junos:
USER3:R3> show mpls lsp
Ingress LSP: 2 sessions
To From State Rt P ActivePath LSPname
4.4.4.4 3.3.3.3 Up 0 * TO-R4 TO-R4
8.8.8.8 3.3.3.3 Up 0 * TO-R8 TO-R8
Total 2 displayed, Up 2, Down 0
Egress LSP: 2 sessions
To From State Rt Style Labelin Labelout LSPname
3.3.3.3 4.4.4.4 Up 0 1 SE 3 - C7200_12.2SRD_t0
3.3.3.3 8.8.8.8 Up 0 1 FF 3 - TO-R3
Total 2 displayed, Up 2, Down 0
Transit LSP: 0 sessions
Total 0 displayed, Up 0, Down 0
Cisco:
C7200_12.2SRD#sh mpls traffic-eng tunnels brief
Signalling Summary:
LSP Tunnels Process: running
Passive LSP Listener: running
RSVP Process: running
Forwarding: enabled
Periodic reoptimization: every 3600 seconds, next in 3315 seconds
Periodic FRR Promotion: Not Running
Periodic auto-bw collection: every 300 seconds, next in 15 seconds
TUNNEL NAME DESTINATION UP IF DOWN IF STATE/PROT
C7200_12.2SRD_t0 3.3.3.3 - Fa0/0.24 up/up
C7200_12.2SRD_t1 8.8.8.8 - Fa0/0.24 up/up
TO-R4 4.4.4.4 Fa0/0.24 - up/up
TO-R4 4.4.4.4 Fa0/0.24 - up/up
Both Cisco and Juniper show both outbound and inbound tunnels. The Brocade only shows outgoing tunnels in the brief output. The P routers will show transit tunnels like so:
USER2:R2> show mpls lsp
Ingress LSP: 0 sessions
Total 0 displayed, Up 0, Down 0
Egress LSP: 0 sessions
Total 0 displayed, Up 0, Down 0
Transit LSP: 6 sessions
To From State Rt Style Labelin Labelout LSPname
3.3.3.3 4.4.4.4 Up 0 1 SE 300016 299872 C7200_12.2SRD_t0
3.3.3.3 8.8.8.8 Up 0 1 FF 299824 299792 TO-R3
4.4.4.4 8.8.8.8 Up 0 1 FF 300048 0 TO-R4
4.4.4.4 3.3.3.3 Up 0 1 FF 300032 0 TO-R4
8.8.8.8 4.4.4.4 Up 0 1 SE 300000 3 C7200_12.2SRD_t1
8.8.8.8 3.3.3.3 Up 0 1 FF 299856 3 TO-R8
Total 6 displayed, Up 6, Down 0
It’s pretty clear from the output above that LSPs are unidirectional and hence each PE-PE link is actually 2 LSPs, 1 in either direction.
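Because each LSP is unidirectional, a full mesh between n PE routers needs n × (n − 1) LSPs. A quick Python sketch of that arithmetic for this lab (the helper name is just something I made up for illustration):

```python
from itertools import permutations

# Loopbacks of the three PE routers in this lab (R3, R4, R8)
pe_loopbacks = ["3.3.3.3", "4.4.4.4", "8.8.8.8"]


def full_mesh_lsps(endpoints):
    """Return every (ingress, egress) pair; each pair is one unidirectional LSP."""
    return list(permutations(endpoints, 2))


lsps = full_mesh_lsps(pe_loopbacks)
for ingress, egress in lsps:
    print(f"{ingress} -> {egress}")
print(len(lsps), "LSPs in total")
```

For three PEs that works out to six LSPs, which lines up with the six transit sessions R2 reports in the output above.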
You can also use MPLS pings and traceroutes. This is the output of a couple of MPLS RSVP traceroutes:
Brocade:
SSH@XMR_R8#traceroute mpls rsvp lsp TO-R3
Trace RSVP LSP TO-R3, timeout 5000 msec, TTL 1 to 30
Type Control-c to abort
1 1ms 2.2.2.2 return code 8(Transit)
2 1ms 1.1.1.1 return code 8(Transit)
3 1ms 3.3.3.3 return code 3(Egress)
Cisco:
C7200_12.2SRD#traceroute mpls traffic-eng tunnel 0
Tracing MPLS TE Label Switched Path on Tunnel0, timeout is 2 seconds
Codes: '!' - success, 'Q' - request not sent, '.' - timeout,
'L' - labeled output interface, 'B' - unlabeled output interface,
'D' - DS Map mismatch, 'F' - no FEC mapping, 'f' - FEC mismatch,
'M' - malformed request, 'm' - unsupported tlvs, 'N' - no label entry,
'P' - no rx intf label prot, 'p' - premature termination of LSP,
'R' - transit router, 'I' - unknown upstream index,
'X' - unknown return code, 'x' - return code 0
Type escape sequence to abort.
0 10.0.4.9 MRU 1500 [Labels: 300016 Exp: 0]
L 1 2.2.2.2 MRU 1518 [Labels: 299872 Exp: 7] 36 ms
L 2 1.1.1.1 MRU 1518 [Labels: implicit-null Exp: 0] 24 ms
! 3 3.3.3.3 1 ms
Juniper:
USER3:R3> traceroute mpls rsvp TO-R8
Probe options: retries 3, exp 7
ttl Label Protocol Address Previous Hop Probe Status
1 299824 RSVP-TE 10.0.4.14 (null) Success
2 299856 RSVP-TE 10.0.4.6 10.0.4.14 Success
3 3 RSVP-TE 192.168.1.2 10.0.4.6 Egress
Path 1 via fe-1/0/3.13 destination 127.0.0.64
All slightly different info, but the same end result. All of the vendors have detail switches which show a lot more information.
So there you have it. I have all the RSVP tunnels up between all my PE routers and all is well. As noted before, this post will form the basis of a series of posts of various MPLS applications.
I’m going to have to split this topic into separate posts because otherwise it’ll just be too long and I’ll lose you halfway through.
Part 1 – Cisco IOS
Part 2 – Juniper JunOS
Part 3 – Brocade Netiron XMR/MLX
Part 4 – Cisco IOS-XR
Most people I speak to who have MPLS experience are usually experienced with LDP, most probably because it’s easy and they have no need for traffic engineering.
However in the ISP space, the vast majority of MPLS cores run RSVP-TE. Not only does it give you traffic-engineering capabilities, it also gives you features like fast-reroute and hot standby LSPs. You can also use your IGP to carry TE extensions, but only link-state protocols will do this for you, i.e. you can forget about EIGRP doing anything good for you in an ISP core.
Some people tend to think that RSVP-TE is difficult, but really it’s not that difficult at all. Once you get over the initial hurdles you’ll see how powerful it can be. I have extensive Brocade Netiron RSVP-TE experience, a fair amount of JunOS RSVP-TE experience and hardly any IOS RSVP-TE experience. This is because my current core is all Brocade and Juniper. Unfortunately I can only test RSVP-TE on IOS and not IOS-XR as I don’t have any IOS-XR boxes available for me to test on. It’s far more likely that an ISP core would be running IOS-XR rather than IOS.
Let’s take the following topology into consideration, which I’ll be using for all vendors. AR1 and AR3 are my ‘edge’ routers running iBGP with each other. They are each advertising a second loopback address to each other over BGP. CR1, CR2, and CR3 are my core routers not running any BGP at all.
IOS basic config
Let’s start with the core network first. I’m pasting the relevant pieces of config here of CR1. CR2 and CR3 are going to be very similar:
mpls traffic-eng tunnels
!
interface Loopback0
 ip address 3.3.3.3 255.255.255.255
 ip ospf 1 area 0
!
interface Serial1/0
 ip address 10.2.0.2 255.255.255.0
 ip ospf 1 area 0
 mpls traffic-eng tunnels
!
interface Serial1/2
 ip address 10.3.0.1 255.255.255.0
 ip ospf 1 area 0
 mpls traffic-eng tunnels
!
router ospf 1
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0
AR1:
mpls traffic-eng tunnels
!
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
 ip ospf 1 area 0
!
interface Loopback20
 ip address 20.20.20.20 255.255.255.255
!
interface Tunnel0
 ip unnumbered Loopback0
 tunnel destination 4.4.4.4
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 5 dynamic
 no routing dynamic
!
interface Serial1/0
 ip address 10.2.0.1 255.255.255.0
 ip ospf 1 area 0
 mpls traffic-eng tunnels
!
router ospf 1
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0
 router-id 2.2.2.2
!
router bgp 13
 network 20.20.20.20 mask 255.255.255.255
 neighbor 4.4.4.4 remote-as 13
 neighbor 4.4.4.4 update-source Loopback0
 no auto-summary
AR3 has a similar config to AR1, so I’m not going to list it here. Essentially what we’ve done is enabled mpls traffic-engineering globally, enabled it on the transit interfaces, and finally enabled OSPF-TE in OSPF. The AR routers have an iBGP connection to each other. There is no need to enable MPLS IP anywhere as that actually enables LDP.
Now that my tunnels are up, let’s try and ping a BGP learned route and see what happens:
AR3#ping 20.20.20.20
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 20.20.20.20, timeout is 2 seconds:
U.U.U
Success rate is 0 percent (0/5)
This won’t work because IOS won’t actually use this tunnel for any routing unless I specifically allow it. I could do static routing or PBR, but why not just let the routing protocol do the work?
interface Tunnel0
 tunnel mpls traffic-eng autoroute announce
This command allows the IGP to use the tunnel in its tree calculation. Let’s take a look at whether it works now or not:
AR3#ping 20.20.20.20
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 20.20.20.20, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/37/44 ms
Let’s take a look at the route and CEF table:
AR3#sh ip route 2.2.2.2
Routing entry for 2.2.2.2/32
Known via "ospf 1", distance 110, metric 129, type intra area
Routing Descriptor Blocks:
* directly connected, via Tunnel0
Route metric is 129, traffic share count is 1
AR3#sh ip cef 20.20.20.20
20.20.20.20/32, version 12, epoch 0
0 packets, 0 bytes
tag information from 2.2.2.2/32, shared
local tag: tunnel-head
fast tag rewrite with Tu0, point2point, tags imposed: {16}
via 2.2.2.2, 0 dependencies, recursive
next hop 2.2.2.2, Tunnel0 via 2.2.2.2/32
valid adjacency
tag rewrite with Tu0, point2point, tags imposed: {16}
In order to get to 2.2.2.2 which is the next-hop, it will send the traffic through the LSP tunnel. If we check the CEF table we can see that traffic will be directed towards the tunnel and have the label value of 16 imposed onto it. We can ensure this is correct with a traceroute:
AR3#traceroute 20.20.20.20
Type escape sequence to abort.
Tracing the route to 20.20.20.20
1 10.3.0.1 [MPLS: Label 16 Exp 0] 36 msec 28 msec 44 msec
2 10.2.0.1 44 msec 44 msec *
It’s exactly what we see. Also note that the tunnel is actually following the shortest IGP path at the moment. This is because in the above config we told the ARs to signal the path dynamically. This means it’ll follow the IGP best path, which leads us on to our next section.
IOS explicit paths
We can tell IOS that we actually want to use the CR2-CR3 path instead of just learning this information dynamically. We now want to use CR2 and CR3 in the path and not CR1. We can do this in two ways depending on the topology. Either I tell my ingress router that it should follow a very specific path, or I just tell the ingress router to specifically miss a particular node. As LSPs are unidirectional, let’s try both.
AR1:
ip explicit-path name through-CR2-CR3 enable
 next-address 10.5.0.2
 next-address 10.6.0.2
 next-address 10.7.0.2
!
interface Tunnel0
 tunnel mpls traffic-eng path-option 4 explicit name through-CR2-CR3
 tunnel mpls traffic-eng path-option 5 dynamic
AR3:
ip explicit-path name not-through-CR1 enable
 exclude-address 10.3.0.1
!
interface Tunnel0
 tunnel mpls traffic-eng path-option 4 explicit name not-through-CR1
 tunnel mpls traffic-eng path-option 5 dynamic
AR1#traceroute 40.40.40.40
Type escape sequence to abort.
Tracing the route to 40.40.40.40
1 10.5.0.2 [MPLS: Label 17 Exp 0] 60 msec 64 msec 72 msec
2 10.6.0.2 [MPLS: Label 17 Exp 0] 64 msec 60 msec 48 msec
3 10.7.0.2 68 msec 56 msec *
AR3#traceroute 20.20.20.20
Type escape sequence to abort.
Tracing the route to 20.20.20.20
1 10.7.0.1 [MPLS: Label 16 Exp 0] 48 msec 64 msec 64 msec
2 10.6.0.1 [MPLS: Label 16 Exp 0] 76 msec 48 msec 72 msec
3 10.5.0.1 64 msec * 60 msec
This is a pretty small topology, so by telling AR3 to skip CR1, there is only 1 other path available. So we create the explicit paths on each ingress router, and then under the tunnel interface we specify that this explicit path is more preferred than the dynamic path. Either way works and you can see from the traceroutes above that both work. The dynamic path is still left under the tunnel interface as we would still like to use it if the CR2-CR3 path becomes unavailable.
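The preference between path options comes down to the path-option number: the lower number is tried first and the next one is only used if that path can’t be set up. A very simplified Python sketch of that selection logic (my own model with made-up names, not what IOS actually runs):

```python
def select_path_option(path_options, is_usable):
    """Return the first usable path option, trying lower numbers first.

    path_options -- dict mapping path-option number to a description
    is_usable    -- callable deciding whether a given option can be signalled
    """
    for number in sorted(path_options):
        if is_usable(path_options[number]):
            return number, path_options[number]
    return None  # no option could be signalled; the tunnel stays down


# The two path options configured under Tunnel0 on AR1
tunnel0 = {
    4: "explicit through-CR2-CR3",
    5: "dynamic",
}

# Normal case: the explicit path is available and wins.
print(select_path_option(tunnel0, lambda option: True))

# CR2-CR3 path unavailable: fall back to the dynamic option.
print(select_path_option(tunnel0, lambda option: option == "dynamic"))
```

This is why the dynamic option is numbered 5 and the explicit path 4 in the config above.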
IOS Type-10 OSPF LSA
MPLS-TE extensions are carried within OSPF type-10 opaque LSAs. These LSAs have area flooding scope and hence they are not flooded across area boundaries in multi-area OSPF. Another reason why ISP cores don’t run multi-area OSPF. You can see the LSAs in the database:
AR1#sh ip ospf database | begin Type-10
Type-10 Opaque Link Area Link States (Area 0)
Link ID ADV Router Age Seq# Checksum Opaque ID
1.0.0.0 2.2.2.2 887 0x80000002 0x005AC6 0
1.0.0.0 3.3.3.3 173 0x80000003 0x005CBB 0
1.0.0.0 4.4.4.4 557 0x80000002 0x0062AE 0
1.0.0.0 22.22.22.22 385 0x80000002 0x00AAD5 0
1.0.0.0 33.33.33.33 319 0x80000002 0x00D651 0
1.0.0.2 2.2.2.2 172 0x80000004 0x004EFC 2
1.0.0.2 3.3.3.3 173 0x80000003 0x00704A 2
1.0.0.2 4.4.4.4 174 0x80000004 0x004AF6 2
1.0.0.2 22.22.22.22 128 0x80000002 0x008CDD 2
1.0.0.2 33.33.33.33 76 0x80000002 0x00EFFB 2
1.0.0.3 2.2.2.2 111 0x80000002 0x0025D4 3
1.0.0.3 3.3.3.3 173 0x80000003 0x00535C 3
1.0.0.3 4.4.4.4 306 0x80000002 0x001918 3
1.0.0.3 22.22.22.22 128 0x80000002 0x00C228 3
1.0.0.3 33.33.33.33 319 0x80000002 0x0064CC 3
If we dig deeper into the LSA originated by CR1 we can see the following:
AR1#sh ip ospf database opaque-area adv-router 3.3.3.3
OSPF Router with ID (2.2.2.2) (Process ID 1)
Type-10 Opaque Link Area Link States (Area 0)
LS age: 310
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 1.0.0.0
Opaque Type: 1
Opaque ID: 0
Advertising Router: 3.3.3.3
LS Seq Number: 80000003
Checksum: 0x5CBB
Length: 28
Fragment number : 0
MPLS TE router ID : 3.3.3.3
Number of Links : 0
LS age: 310
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 1.0.0.2
Opaque Type: 1
Opaque ID: 2
Advertising Router: 3.3.3.3
LS Seq Number: 80000003
Checksum: 0x704A
Length: 132
Fragment number : 2
Link connected to Point-to-Point network
Link ID : 2.2.2.2
Interface Address : 10.2.0.2
Neighbor Address : 10.2.0.1
Admin Metric : 64
Maximum bandwidth : 193000
Maximum reservable bandwidth : 0
Number of Priority : 8
Priority 0 : 0 Priority 1 : 0
Priority 2 : 0 Priority 3 : 0
Priority 4 : 0 Priority 5 : 0
Priority 6 : 0 Priority 7 : 0
Affinity Bit : 0x1
IGP Metric : 64
Number of Links : 1
LS age: 310
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 1.0.0.3
Opaque Type: 1
Opaque ID: 3
Advertising Router: 3.3.3.3
LS Seq Number: 80000003
Checksum: 0x535C
Length: 132
Fragment number : 3
Link connected to Point-to-Point network
Link ID : 4.4.4.4
Interface Address : 10.3.0.1
Neighbor Address : 10.3.0.2
Admin Metric : 64
Maximum bandwidth : 193000
Maximum reservable bandwidth : 0
Number of Priority : 8
Priority 0 : 0 Priority 1 : 0
Priority 2 : 0 Priority 3 : 0
Priority 4 : 0 Priority 5 : 0
Priority 6 : 0 Priority 7 : 0
Affinity Bit : 0x1
IGP Metric : 64
Number of Links : 1
You can see that it will show all your links, any affinities set, max reserved bandwidths, and any currently used bandwidths for different priorities.
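One small thing worth pointing out in the output above is how the Link State ID relates to the Opaque Type and Opaque ID fields. For opaque LSAs the first octet of the Link State ID is the opaque type (1 here, the TE LSA) and the remaining 24 bits are the opaque ID. A short Python sketch, with a helper name of my own choosing, shows the split:

```python
import ipaddress


def decode_opaque_lsid(link_state_id):
    """Split an opaque LSA Link State ID into (opaque type, opaque ID).

    The first octet is the opaque type; the remaining 24 bits are the opaque ID.
    """
    value = int(ipaddress.IPv4Address(link_state_id))
    return value >> 24, value & 0xFFFFFF


# The three Link State IDs that CR1 (3.3.3.3) originates in the output above
for lsid in ("1.0.0.0", "1.0.0.2", "1.0.0.3"):
    opaque_type, opaque_id = decode_opaque_lsid(lsid)
    print(lsid, "-> type", opaque_type, "id", opaque_id)
```

That matches the Opaque Type and Opaque ID values IOS prints for each of the three LSAs.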
I could go on for many hours showing various MPLS features but then I’ll never finish this article.
In the next part I’ll be doing JunOS showing the same features and config as I showed above. In the final part I’ll be doing the same for Brocade Netiron.
It’s quite handy to have one of these labs to test your radius configs, especially in the ISP world. This is mainly for testing radius attributes as it’s very easy to get a Cisco box to actually be a regular PPPoE server.
I have an old 7200 NPE-300 connected to a virtual machine running in VMware.
I’m running Ubuntu server 12.04 so installing freeradius is pretty painless:
darreno@radius:~$ sudo apt-get install freeradius
Now we need to configure the box. Just a few files need to be edited for our environment. I won’t go over every single part of radiusd.conf, only the things I made changes to:
darreno@radius:/etc/freeradius$ sudo vi radiusd.conf
listen {
type = auth
ipaddr = 10.80.1.1
port = 1645
}
listen {
ipaddr = 10.80.1.1
port = 1646
type = acct
}
log {
destination = files
file = ${logdir}/radius.log
syslog_facility = daemon
stripped_names = no
auth = yes
auth_badpass = yes
auth_goodpass = yes
}
It’s always good to have a fair amount of logging, especially in a lab.
We also need to tell the FreeRadius server that a radius client will be coming in and making authentication requests. We also choose a password here:
darreno@radius:/etc/freeradius$ sudo vi clients.conf
client 10.80.1.2 {
secret = radiuspassword
shortname = 10.80.1.2
nastype = cisco
}
Short and sweet.
Finally the actual username, passwords, IPs, attributes, etc are all stored in the users file. For now let’s just create a short single entry:
darreno@radius:/etc/freeradius$ sudo vi users
testuser Password = "password"
        Framed-IP-Address = 192.168.1.100
Now onto the 7200. The 7200 and FreeRadius server are directly connected in this lab, but in the real world all they need is IP connectivity to each other.
aaa group server radius RADIUS_SERVER
 server 10.80.1.1 auth-port 1645 acct-port 1646
!
aaa authentication ppp CPE_USER group RADIUS_SERVER
aaa authorization network default group RADIUS_SERVER
!
vpdn enable
!
bba-group pppoe LAB
 virtual-template 1
 sessions per-mac limit 20
 sessions per-vlan limit 250
!
interface Loopback0
 ip address 200.200.200.200 255.255.255.255
!
interface FastEthernet0/0
 description Link to FreeRadius server
 ip address 10.80.1.2 255.255.255.0
 duplex full
!
interface FastEthernet1/0
 description PPPOE interface
 no ip address
 duplex full
 pppoe enable group LAB
!
interface Virtual-Template1
 ip unnumbered Loopback0
 no peer default ip address
 ppp authentication chap CPE_USER
!
radius-server host 10.80.1.1 auth-port 1645 acct-port 1646 key radiuspassword
I’ve used a radius group which allows you to add more radius servers and test fail-over scenarios.
For a test device I’ve just configured a 2801 like so:
interface FastEthernet0/0
 no ip address
 duplex auto
 speed auto
 pppoe enable group global
 pppoe-client dial-pool-number 1
!
interface Dialer1
 mtu 1492
 ip address negotiated
 encapsulation ppp
 dialer pool 1
 ppp chap hostname testuser
 ppp chap password 0 password
Let’s give it a quick test. I’ve enabled logging on the radius server to see what’s going on. Let me enable the 2801’s PPPoE interface and see if the radius server sees the authentication request coming in:
darreno@radius:/etc/freeradius$ tail -f /var/log/freeradius/radius.log
Mon Oct 1 21:24:23 2012 : Auth: Login OK: [testuser/] (from client 10.80.1.2 port 0)
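Once you start testing more than one account, it can be handy to pull the fields out of these log lines rather than eyeballing them. Here is a minimal Python sketch that does so. The pattern is modelled only on the single log line shown above; the "Login incorrect" alternative and the helper name parse_auth_line are my own assumptions, and other FreeRadius versions or log settings may format the line differently.

```python
import re

# Pattern based on the one log line in this post (assumption: failed logins
# use the text "Login incorrect" and otherwise share the same layout).
LOGIN_RE = re.compile(
    r"Auth: (?P<result>Login OK|Login incorrect)[^\[]*"
    r"\[(?P<user>[^/\]]*)(?:/(?P<password>[^\]]*))?\] "
    r"\(from client (?P<client>\S+) port (?P<port>\d+)"
)


def parse_auth_line(line):
    """Return the interesting fields of an Auth log line, or None."""
    match = LOGIN_RE.search(line)
    return match.groupdict() if match else None


sample = ("Mon Oct 1 21:24:23 2012 : Auth: Login OK: [testuser/] "
          "(from client 10.80.1.2 port 0)")
print(parse_auth_line(sample))
```

Lines that don't look like an authentication result simply return None, so the function can be run over the whole log file.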
So that’s all fine. Did my router pick up the correct IP address?
c2801#sh int dialer 1
Dialer1 is up, line protocol is up (spoofing)
Hardware is Unknown
Internet address is 192.168.1.100/32
MTU 1492 bytes, BW 56 Kbit/sec, DLY 20000 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation PPP, LCP Closed, loopback not set
Keepalive set (10 sec)
DTR is pulsed for 1 seconds on reset
Interface is bound to Vi2
Last input never, output never, output hang never
Last clearing of "show interface" counters 05:13:33
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: weighted fair
Output queue: 0/1000/64/0 (size/max total/threshold/drops)
Conversations 0/0/16 (active/max active/max total)
Reserved Conversations 0/0 (allocated/max allocated)
Available Bandwidth 42 kilobits/sec
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
1017 packets input, 103010 bytes
4703 packets output, 173178 bytes
Bound to:
Virtual-Access2 is up, line protocol is up
Hardware is Virtual Access interface
MTU 1492 bytes, BW 56 Kbit/sec, DLY 20000 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation PPP, LCP Open
Stopped: CDPCP
Open: IPCP
PPPoE vaccess, cloned from Dialer1
Vaccess status 0x44, loopback not set
Keepalive set (10 sec)
Interface is bound to Di1 (Encapsulation PPP)
Last input 00:00:01, output never, output hang never
Last clearing of "show interface" counters 00:01:55
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
27 packets input, 387 bytes, 0 no buffer
Received 0 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
26 packets output, 378 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
0 carrier transitions
c2801#show ip route connected | beg Ga
Gateway of last resort is not set
192.168.1.0/32 is subnetted, 1 subnets
C 192.168.1.100 is directly connected, Dialer1
200.200.200.0/32 is subnetted, 1 subnets
C 200.200.200.200 is directly connected, Dialer1
c2801#ping 200.200.200.200
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 200.200.200.200, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
These are PPP links and hence the 7200 and 2801 have swapped host routes. This is why they can get to each other. We can also check from the 7200 side:
c7200#sh ip route 192.168.1.100
Routing entry for 192.168.1.100/32
Known via "connected", distance 0, metric 0 (connected, via interface)
Routing Descriptor Blocks:
* directly connected, via Virtual-Access1.1
Route metric is 0, traffic share count is 1
c7200#ping 192.168.1.100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.1.100, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/4 ms
So everything is working just as expected.
The whole point of radius attributes is to be able to do all kinds of fancy things. Let’s say that this 2801 has another network behind it that the rest of our network needs to reach through the BRAS box. An easy way is to have the 7200 install a static route to that network whenever the router dials in. Let’s use a loopback on the 2801 for this purpose:
interface Loopback1
 ip address 40.40.40.40 255.255.255.255
Going back to the users file on the radius server, we add a Cisco-AVPair to the entry:
testuser Password = "password"
	Framed-IP-Address = 192.168.1.100,
	Cisco-Avpair += "ip:route=40.40.40.40 255.255.255.255"
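A typo in the route attribute is easy to make and annoying to debug on the BRAS, so it can be worth sanity-checking the value before it goes into the users file. The sketch below is one hedged way to do that in Python. The function name parse_ip_route_avpair is hypothetical, and it only handles the prefix-plus-mask form used in this post.

```python
import ipaddress


def parse_ip_route_avpair(avpair):
    """Turn an 'ip:route=<prefix> <mask>' value into an IPv4Network."""
    key, _, value = avpair.partition("=")
    if key.strip() != "ip:route":
        raise ValueError("not an ip:route AV pair: %r" % avpair)
    fields = value.split()
    if len(fields) < 2:
        raise ValueError("expected '<prefix> <mask>' in %r" % avpair)
    prefix, mask = fields[0], fields[1]
    # strict=True rejects a prefix that has host bits set for the given mask.
    return ipaddress.ip_network(prefix + "/" + mask, strict=True)


route = parse_ip_route_avpair("ip:route=40.40.40.40 255.255.255.255")
print(route)
```

A prefix such as 40.40.40.40 with a 255.255.255.0 mask raises ValueError, which is exactly the sort of mistake this is meant to catch.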
Let’s clear the pppoe session and take a look at the 7200:
c7200#sh ip route 40.40.40.40
Routing entry for 40.40.40.40/32
Known via "static", distance 1, metric 0
Routing Descriptor Blocks:
* 192.168.1.100
Route metric is 0, traffic share count is 1
c7200#ping 40.40.40.40
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 40.40.40.40, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/4 ms
As this is a static route via a connected next hop, the 7200 can redistribute it into the IGP so the rest of your network can get to it. Notice that when I reload the 2801 and the session is torn down, the static route is removed:
c7200#sh ip route 40.40.40.40
% Network not in table
There are a TON of radius attributes. If I have the time I may go over a few handy ones with which you can create some powerful routing policies.
So does OSPF always use the shortest path in order to ensure that packets always get from A to B with the lowest end-to-end cost? Not always. In fact when you have more than a single area it’s very easy to NOT take the shortest path at all. You could even turn your ‘non-transit’ 10Mb links into transit links.
Let’s take the following network as an example:

R3 represents our core. R1 and R2 are both aggregation boxes that all our customers connect to. These boxes are connected into the core with Gig links. R4 is our first customer. This customer wants a primary Gig link with a 100Mb backup link. We have decided to put each customer into their own OSPF area. We will also be changing the auto-cost reference bandwidth to 100Gb to ensure our core sees the difference between 100Mb and Gig links:
R3
interface Loopback0
 ip address 3.3.3.3 255.255.255.255
 ip ospf 1 area 0
!
interface GigabitEthernet1/0
 ip address 10.0.13.3 255.255.255.0
 ip ospf 1 area 0
!
interface GigabitEthernet2/0
 ip address 10.0.23.3 255.255.255.0
 ip ospf 1 area 0
!
router ospf 1
 router-id 3.3.3.3
 auto-cost reference-bandwidth 100000
R1
interface GigabitEthernet2/0
 ip address 10.0.13.1 255.255.255.0
 ip ospf 1 area 0
!
interface GigabitEthernet1/0
 ip address 10.0.14.1 255.255.255.0
 ip ospf 1 area 4
!
router ospf 1
 router-id 1.1.1.1
 auto-cost reference-bandwidth 100000
R2
interface GigabitEthernet2/0
 ip address 10.0.23.2 255.255.255.0
 ip ospf 1 area 0
!
interface FastEthernet1/0
 ip address 10.0.24.2 255.255.255.0
 ip ospf 1 area 4
!
router ospf 1
 router-id 2.2.2.2
 auto-cost reference-bandwidth 100000
R4
interface Loopback0
 ip address 4.4.4.4 255.255.255.255
 ip ospf 1 area 4
!
interface GigabitEthernet1/0
 ip address 10.0.14.4 255.255.255.0
 ip ospf 1 area 4
!
interface FastEthernet2/0
 ip address 10.0.24.4 255.255.255.0
 ip ospf 1 area 4
!
router ospf 1
 router-id 4.4.4.4
 auto-cost reference-bandwidth 100000
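To see why the reference bandwidth matters here, it helps to work out the interface costs by hand. OSPF cost on IOS is the reference bandwidth divided by the interface bandwidth, with a floor of 1. The following Python sketch just does that arithmetic; the function name ospf_cost is mine, and both values are in Mbps to match the auto-cost command.

```python
def ospf_cost(interface_mbps, reference_mbps=100):
    """Interface cost = reference bandwidth / interface bandwidth, minimum 1."""
    return max(1, reference_mbps // interface_mbps)


for name, mbps in [("FastEthernet", 100), ("GigabitEthernet", 1000)]:
    default_cost = ospf_cost(mbps)            # default 100Mb reference
    tuned_cost = ospf_cost(mbps, 100000)      # reference-bandwidth 100000
    print(name, "default:", default_cost, "tuned:", tuned_cost)
```

With the default reference both link types come out at a cost of 1, so OSPF can't tell them apart. With the reference set to 100000 the 100Mb link costs 1000 and the Gig link costs 100, which are the numbers that show up in the route metrics throughout this post.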
Our core should now see that the best way to get to R4’s loopback is to go through R1:
R3#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 1", distance 110, metric 201, type inter area
Last update from 10.0.13.1 on GigabitEthernet1/0, 00:09:44 ago
Routing Descriptor Blocks:
* 10.0.13.1, from 1.1.1.1, 00:09:44 ago, via GigabitEthernet1/0
Route metric is 201, traffic share count is 1
R3#traceroute 4.4.4.4
Type escape sequence to abort.
Tracing the route to 4.4.4.4
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.13.1 8 msec 20 msec 16 msec
2 10.0.14.4 16 msec * 20 msec
Similarly R4 should see that the best way to get to R3 is back through R1:
R4#sh ip route 3.3.3.3
Routing entry for 3.3.3.3/32
Known via "ospf 1", distance 110, metric 201, type inter area
Last update from 10.0.14.1 on GigabitEthernet1/0, 00:03:47 ago
Routing Descriptor Blocks:
* 10.0.14.1, from 1.1.1.1, 00:03:47 ago, via GigabitEthernet1/0
Route metric is 201, traffic share count is 1
R4#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.14.1 16 msec 16 msec 20 msec
2 10.0.13.3 20 msec * 20 msec
So everything is fine. Or so we think. There is already a flaw here, but it won’t show itself until we bring in more customers. Let’s add 2 customers. The first is connected to R1 and the second is connected to R2. Both of these customers have purchased single 100Mb links.
So, traffic sent from R4’s loopback to either of the 2 new customers’ loopbacks should get into the core via R4’s 1Gb primary link. Is that what we see?
R4
R4#traceroute 5.5.5.5 source 4.4.4.4
Type escape sequence to abort.
Tracing the route to 5.5.5.5
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.14.1 16 msec 20 msec 20 msec
2 10.0.15.5 20 msec * 24 msec
R4#
R4#traceroute 6.6.6.6 source 4.4.4.4
Type escape sequence to abort.
Tracing the route to 6.6.6.6
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.14.1 12 msec 20 msec 16 msec
2 10.0.13.3 20 msec 60 msec 20 msec
3 10.0.23.2 40 msec 44 msec 40 msec
4 10.0.26.6 72 msec * 44 msec
That’s exactly what we see, but do we have the full picture here? Let’s trace from these new customers to R4’s loopback. Again both should go over R4’s 1Gb primary link:
R5
R5#traceroute 4.4.4.4 source 5.5.5.5
Type escape sequence to abort.
Tracing the route to 4.4.4.4
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.15.1 8 msec 20 msec 16 msec
2 10.0.14.4 20 msec * 24 msec
R5 is correct. What about R6?
R6
R6#traceroute 4.4.4.4 source 6.6.6.6
Type escape sequence to abort.
Tracing the route to 4.4.4.4
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.26.2 20 msec 16 msec 20 msec
2 10.0.24.4 64 msec * 68 msec
Well this is most certainly NOT correct. Why is this traceroute going through R4’s 100Mb backup link? Let’s go back to the beginning and see what we missed. Let’s have a look at the 3 core routers to see how they all want to get to 4.4.4.4:
R3
R3#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 1", distance 110, metric 201, type inter area
Last update from 10.0.13.1 on GigabitEthernet1/0, 00:29:17 ago
Routing Descriptor Blocks:
* 10.0.13.1, from 1.1.1.1, 00:29:17 ago, via GigabitEthernet1/0
Route metric is 201, traffic share count is 1
R1
R1#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 1", distance 110, metric 101, type intra area
Last update from 10.0.14.4 on GigabitEthernet1/0, 00:37:44 ago
Routing Descriptor Blocks:
* 10.0.14.4, from 4.4.4.4, 00:37:44 ago, via GigabitEthernet1/0
Route metric is 101, traffic share count is 1
R2
R2#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 1", distance 110, metric 1001, type intra area
Last update from 10.0.24.4 on FastEthernet1/0, 00:38:47 ago
Routing Descriptor Blocks:
* 10.0.24.4, from 4.4.4.4, 00:38:47 ago, via FastEthernet1/0
Route metric is 1001, traffic share count is 1
Here is the problem. R2 prefers to get to 4.4.4.4 over its directly connected link, even though the metric through R3 would be 301 (100 to reach R3 plus R3’s own 201), a whole lot less than 1001.
The issue is that OSPF has its own selection process. Regardless of metric, OSPF will ALWAYS prefer intra-area routes over inter-area routes over external routes. R2 has an interface in area 4, the same area in which it’s learning about R4’s loopback. Hence when traffic addressed to 4.4.4.4 passes through R2, it will always be sent out of the area 4 interface, no matter how slow that link is. It doesn’t make any difference whether the second customer is in area 0 or their own area.
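The selection order described above can be sketched in a few lines of Python. This is a simplified model, not how any router actually implements it: route type is compared first and the metric only breaks ties within the same type. The names Candidate and best_ospf_route are mine, and the split of external routes into type 1 and type 2 is deliberately left out.

```python
from dataclasses import dataclass

# Lower rank wins; the metric is only compared between routes of equal rank.
TYPE_RANK = {"intra-area": 0, "inter-area": 1, "external": 2}


@dataclass(frozen=True)
class Candidate:
    route_type: str
    metric: int
    next_hop: str


def best_ospf_route(candidates):
    """Pick the route OSPF would install: type first, then lowest metric."""
    return min(candidates, key=lambda c: (TYPE_RANK[c.route_type], c.metric))


# R2's two ways of reaching 4.4.4.4/32 in this lab.
r2_candidates = [
    Candidate("intra-area", 1001, "10.0.24.4"),  # direct 100Mb link in area 4
    Candidate("inter-area", 301, "10.0.23.3"),   # via R3: 100 + R3's 201
]
print(best_ospf_route(r2_candidates))
```

The intra-area candidate wins even though its metric is more than three times higher, which is the behaviour R2 shows in the output above.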
In fact, if you dive a bit deeper, you can see that as far as R6 is concerned, the traffic will be going over R4’s primary link. Compare the cost of R6’s interface with its end-to-end cost to 4.4.4.4:
R6
R6#sh ip os int brief | include Fa1/0
Fa1/0 1 0 10.0.26.6/24 1000 BDR 1/1
R6#
R6#
R6#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 1", distance 110, metric 1301, type inter area
Last update from 10.0.26.2 on FastEthernet1/0, 00:19:15 ago
Routing Descriptor Blocks:
* 10.0.26.2, from 1.1.1.1, 00:19:15 ago, via FastEthernet1/0
Route metric is 1301, traffic share count is 1
What about R2’s active route cost?
R2
R2#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 1", distance 110, metric 1001, type intra area
Last update from 10.0.24.4 on FastEthernet2/0, 00:43:29 ago
Routing Descriptor Blocks:
* 10.0.24.4, from 4.4.4.4, 00:43:29 ago, via FastEthernet2/0
Route metric is 1001, traffic share count is 1
So R6 thinks that traffic will actually go over its 1000 cost link, then over the 3 x 100 cost Gig links. But R2 effectively ‘hijacks’ this traffic and sends it over its direct area 4 link.
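The mismatch between what R6 believes and what actually happens comes from each router making its own forwarding decision. The Python sketch below walks the per-router next hops taken from the outputs above and compares the cost of the path R6 expects with the path the packets really take. The dictionaries and function names are mine and are only a model of this lab.

```python
# Each router's chosen next hop towards 4.4.4.4, taken from the outputs above.
NEXT_HOP = {"R6": "R2", "R2": "R4", "R3": "R1", "R1": "R4"}

# Link costs with the 100Gb reference bandwidth (Gig = 100, 100Mb = 1000).
LINK_COST = {
    ("R6", "R2"): 1000,
    ("R2", "R3"): 100,
    ("R3", "R1"): 100,
    ("R1", "R4"): 100,
    ("R2", "R4"): 1000,
}
LOOPBACK_COST = 1  # cost of R4's loopback interface


def forwarding_path(source, destination="R4"):
    """Follow each router's own next hop until the destination is reached."""
    path = [source]
    while path[-1] != destination:
        nxt = NEXT_HOP[path[-1]]
        if nxt in path:
            raise RuntimeError("forwarding loop via " + nxt)
        path.append(nxt)
    return path


def path_cost(path):
    """Sum the link costs along a path and add the loopback cost."""
    return sum(LINK_COST[hop] for hop in zip(path, path[1:])) + LOOPBACK_COST


expected = ["R6", "R2", "R3", "R1", "R4"]  # the path behind R6's metric
actual = forwarding_path("R6")             # what hop-by-hop forwarding does
print("expected", expected, path_cost(expected))
print("actual  ", actual, path_cost(actual))
```

The expected path adds up to the 1301 that R6 has in its routing table, while the path actually taken costs 2001 and crosses the 100Mb backup link.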
So, how can this be fixed?
The first way is to just put everything in area 0. This way all addresses will be reachable via intra-area routes in area 0. Even if you injected all prefixes via redistribution or route-policy they’d all be external, but still reachable through area 0 links.
The second way is to create some sort of tunnel between R1 and R2 and put that tunnel interface into area 4. This way R2 would learn about R4’s loopback over 2 area 4 interfaces. You would need to ensure the path over this tunnel has a lower cost than the 100Mb direct connection to R4 for it to actually be preferred. But who really wants to be creating tunnels over the core of their network? Virtual-links can only be used to connect to area 0, not area 4. Sham links? Can only be used with MPLS.
The third way is to think outside the box a little. You could run PPPoE over the secondary link and not run OSPF on it at all. On R4 you would have a floating static route pointing towards the dialer interface. The radius account used for the session would install a static route to R4’s loopback with a next hop of the p2p PPPoE link. Ensure that static route is created with an AD higher than OSPF’s so that the OSPF path is used whenever it is available.
The fourth way is to just use another protocol connecting the core to the CPE device. BGP perhaps?
The fifth and final way, which ties with option 1 for simplicity, is RFC 5185 (OSPF Multi-Area Adjacency). The RFC lets you put a router’s interface into more than a single OSPF area. This means that I could keep the R1 and R2 links to R3 in area 0, but also put those same links into area 4. The same would be done on R3. R2 would then learn the best route from R1 as an intra-area route, without the need for dodgy tunnels. The main problem is that most vendors simply don’t support it. Cisco only has it in IOS XE. JUNOS has had it since JUNOS 9.4 though. Brocade? No mention of it anywhere yet.
Considering I have some post 9.4 JunOS boxes here, let’s test this out:
R2
> show configuration protocols ospf
reference-bandwidth 10g;
area 0.0.0.0 {
    interface fe-0/0/1.66;
    interface fe-0/0/0.51;
}
area 0.0.0.4 {
    interface fe-1/3/0.16 {
        metric 1000;
    }
    interface fe-0/0/1.66 {
        secondary;
    }
}
R3
> show configuration protocols ospf
reference-bandwidth 10g;
area 0.0.0.0 {
    interface fe-0/0/0.66;
    interface fe-0/0/0.63;
    interface lo0.9;
}
area 0.0.0.4 {
    interface fe-0/0/0.66 {
        secondary;
    }
    interface fe-0/0/0.63 {
        secondary;
    }
}
R1
> show configuration protocols ospf
reference-bandwidth 10g;
area 0.0.0.0 {
    interface fe-0/0/1.63;
    interface fe-1/3/3.79;
}
area 0.0.0.4 {
    interface fe-1/3/0.14;
    interface fe-0/0/1.63 {
        secondary;
    }
}
As you can see, the configuration is pretty simple. You simply add an interface to another area and set it as secondary. Let’s have a look at R2’s neighbours:
> show ospf neighbor
Address          Interface              State     ID               Pri  Dead
10.0.26.6        fe-0/0/0.51            Full      6.6.6.6          128    38
  Area 0.0.0.0
10.0.23.3        fe-0/0/1.66            Full      3.3.3.3          128    38
  Area 0.0.0.0
10.0.23.3        fe-0/0/1.66            Full      3.3.3.3          128    38
  Area 0.0.0.4
10.0.24.4        fe-1/3/0.16            Full      4.4.4.4          128    33
  Area 0.0.0.4
R2 has an adjacency over fe-0/0/1.66 twice: one in Area 0 and one in Area 4. This means it should be learning R4’s loopback as 2 intra-area routes and 1 inter-area route. It should then choose the path through R3 as it has the better metric:
> show route 4.4.4.4
inet.0: 17 destinations, 17 routes (17 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
4.4.4.4/32 *[OSPF/10] 00:18:07, metric 300
> to 10.0.23.3 via fe-0/0/1.66
Which is exactly what we see.
Let’s do another traceroute from R6 to confirm:
> traceroute 4.4.4.4
traceroute to 4.4.4.4 (4.4.4.4), 30 hops max, 40 byte packets
1 10.0.26.2 (10.0.26.2) 1.098 ms 0.965 ms 0.800 ms
2 10.0.23.3 (10.0.23.3) 0.846 ms 0.943 ms 0.836 ms
3 10.0.13.1 (10.0.13.1) 0.884 ms 1.036 ms 0.882 ms
4 4.4.4.4 (4.4.4.4) 1.166 ms 1.328 ms 1.155 ms



