OSPF Fast Re-Route and BFD on Junos

One of the few advantages EIGRP had over OSPF and IS-IS was feasible successors: the router has already pre-calculated a route to a destination over a backup, non-looping path.

OSPF and IS-IS have had this for some time now on both IOS and Junos. It’s also supported on IOS-XR.

This post will mainly go over OSPF. The process is nearly identical for IS-IS.

To start I’ll be using the following topology:

R3 has two links to R4, both going through a switch, which will allow us to bring a link down without pulling the interface down. I’m configuring a cost of 100 on the first link and 1000 on the second, as I don’t want to bring ECMP into play for this post.

How does a router know its neighbour is down? If the interface goes down, detection will be quick. If the interface stays up but something along the path is dropping packets, the router will take quite a long time to detect this.

If we leave OSPF at its defaults, it could be 40 seconds before R3 realises it cannot reach R4 over their primary link (40 seconds being the default dead timer on broadcast networks). Until that happens, R3 will be sending packets into the void.
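
If all you had was OSPF, one blunt option would be to shrink the hello and dead timers on the link, something along these lines (a sketch only, using the interface names from this lab):

set protocols ospf area 0.0.0.0 interface fe-0/1/4.24 hello-interval 1
set protocols ospf area 0.0.0.0 interface fe-0/1/4.24 dead-interval 3

That still leaves you with multi-second detection and extra hello churn, which is why BFD (below) is the better tool for the job.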

I’ll set up standard OSPF on all interfaces. From R2 I’ll be sending pings to R5’s loopback. The two R3–R4 links are tagged interfaces in different VLANs, so on the switch I can simply remove VLAN 24, which will cause packets to be dropped over that link.

OSPF – No tweaking

Standard OSPF with no tweaks. I’ll be showing R3’s config:

darreno@M7i> show configuration protocols ospf
area 0.0.0.0 {
    interface lo0.3;
    interface fe-0/1/4.24 {
        metric 100;
    }
    interface fe-0/1/5.35 {
        metric 1000;
    }
}

I’ll now initiate a ping flood from R2 to R5. Once that starts, I’ll remove VLAN 24 from the switch.

Let’s see how the ping flood goes:

!!!.....................................................................!!!

Not very good at all!

OSPF – BFD

Let’s add BFD to the OSPF session on both R3 and R4:

darreno@M7i> show configuration protocols ospf
area 0.0.0.0 {
    interface all;
    interface lo0.3;
    interface fe-0/1/4.24 {
        metric 100;
        bfd-liveness-detection {
            minimum-interval 50;
            minimum-receive-interval 30;
            multiplier 3;
        }
    }
    interface fe-0/1/5.35 {
        metric 1000;
        bfd-liveness-detection {
            minimum-interval 50;
            minimum-receive-interval 30;
            multiplier 3;
        }
    }
}
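
With these values, BFD should declare the neighbour down after roughly multiplier × minimum-interval = 3 × 50 ms = 150 ms of missed packets, rather than waiting up to 40 seconds for the OSPF dead timer.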

Do the same test as above.

!!!!.!!!

Much, much better. Note that this is a very small topology, so LSAs flood very quickly. If you had a larger topology, especially one spanning geographic regions, it could take much longer for the new route to be calculated.

OSPF – BFD & FRR

Now I’ll add FRR to OSPF on R3. I’ll protect the fe-0/1/4.24 link from R3’s point of view. R3 will run SPF for all of its destinations through that interface and will work out whether it can reach each destination through any other interface without being looped back. In this simple topology, any traffic sent over the higher-metric interface to R4 will still get to R5, as R4 will not send it back.
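
Under the hood this is the standard loop-free alternate check: a neighbour N can act as a backup for destination D, from source S’s point of view, as long as

Dist(N, D) < Dist(N, S) + Dist(S, D)

i.e. the neighbour’s own shortest path to the destination does not come back through us. You’ll see exactly these three metrics in the backup spf output further down.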

First we enable link-protection:

darreno@M7i> show configuration protocols ospf area 0 interface fe-0/1/4.24
link-protection;
metric 100;
bfd-liveness-detection {
    minimum-interval 50;
    minimum-receive-interval 30;
    multiplier 3;
}

Junos will pre-calculate the backup routes, but it will NOT add them to the FIB by default. You have to allow more than one next-hop per route in the FIB:

darreno@M7i> show configuration policy-options policy-statement BALANCE
then {
    load-balance per-packet;
}

darreno@M7i> show configuration routing-options forwarding-table
export BALANCE;
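
In set form this is simply (the policy name BALANCE is arbitrary):

set policy-options policy-statement BALANCE then load-balance per-packet
set routing-options forwarding-table export BALANCE

Despite the name, load-balance per-packet actually results in per-flow hashing on most Junos platforms; here its only job is to allow more than one next-hop per route to be pushed down to the FIB.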

Let’s run the same test as above again:

!!!!!!!!!!!!!!!!!!!!!!

I’m simply not losing anything at all. The difference between BFD alone and BFD plus link-protection is most pronounced on much larger topologies. Remember that FRR is a router making a quick local repair to get packets from A to B while an alternative route is calculated in the normal way.

You can see that enabling FRR is a piece of cake. To verify it, you need to dig a little deeper. First let’s see the FRR coverage on R3:

darreno@M7i> show ospf backup coverage
Topology default coverage:

Node Coverage:

Area             Covered  Total  Percent
                   Nodes  Nodes  Covered
0.0.0.0                2      3   66.67%

Route Coverage:

Path Type  Covered   Total  Percent
            Routes  Routes  Covered
Intra            5      11   45.45%
Inter            0       0  100.00%
Ext1             0       0  100.00%
Ext2             0       0  100.00%
All              5      11   45.45%

Not every single prefix can be covered, as coverage is quite topology-dependent. If we look at the detail for 5.5.5.5 specifically:

darreno@M7i> show ospf backup spf detail | find 5.5.5.5
5.5.5.5
  Self to Destination Metric: 101
  Parent Node: 10.0.8.10
  Primary next-hop: fe-0/1/4.24 via 10.0.24.4
  Backup next-hop: fe-0/1/5.35 via 10.0.35.4
  Backup Neighbor: 4.4.4.4
    Neighbor to Destination Metric: 1, Neighbor to Self Metric: 1
    Self to Neighbor Metric: 100, Backup preference: 0x0
    Eligible, Reason: Contributes backup next-hop

Here we see that fe-0/1/4.24 is the primary and fe-0/1/5.35 is the backup. The backup is also eligible. If we take a look at the route itself:

darreno@M7i> show route 5.5.5.5

inet.0: 24 destinations, 25 routes (24 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[OSPF/10] 00:03:15, metric 101
                    > to 10.0.24.4 via fe-0/1/4.24
                      to 10.0.35.4 via fe-0/1/5.35

Both next-hops are there, but only the first will be used until it fails.

Finally we can take a look at the FIB entry:

darreno@M7i> show route forwarding-table destination 5.5.5.5
Routing table: default.inet
Internet:
Destination        Type RtRef Next hop           Type Index NhRef Netif
5.5.5.5/32         user     1                    ulst 262142     5
                              10.0.24.4          ucst  1303     2 fe-0/1/4.24
                              10.0.35.4          ucst  1304     2 fe-0/1/5.35

The backup next-hop is already programmed, ready to take over as soon as the primary fails.
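
If you want to see how the FIB distinguishes the two, adding extensive to the command above should also show the next-hop weights that mark 10.0.35.4 as the backup.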

Check-out pass for Logical system multiplexer process (/usr/sbin/lrmuxd) dumped core (0x86)

I’ve been getting this strange error on my M7i logical-system lab router recently.

I’m currently running 12.3R3.4:

darreno@M7i> show version
Hostname: M7i
Model: m7i
JUNOS Base OS boot [12.3R3.4]

Every now and then when trying to commit, I see the following error:

darreno@M7i# commit and-quit
error: Check-out pass for Logical system multiplexer process (/usr/sbin/lrmuxd) dumped core (0x86)
error: configuration check-out failed

Nothing is wrong with the config itself. A quick way I’ve found to get rid of the error is to add a new interface to a logical system and then remove it straight away. The config then commits:

[edit]
darreno@M7i# set logical-systems R4 interfaces fe-0/2/8.0

[edit]
darreno@M7i# delete logical-systems R4 interfaces fe-0/2/8.0

[edit]
darreno@M7i# commit and-quit
commit complete
Exiting configuration mode

VPLS on Junos signalled via LDP or BGP

Continuing on from the L2VPN on Junos post, let’s switch focus to VPLS. CCC is a point-to-point technology and so is out of the question. That leaves LDP and BGP to do our VC label signalling. As always, you can use either LDP or RSVP for your transport label signalling.

Slightly different topology this time, as I’ll be using it to test different ways for the CE to attach to the VPLS. For now we’ll simply focus on T1, C2, and T2:

All three CE WAN interfaces are in the same subnet running OSPF. The goal is for them to be able to reach each other’s loopbacks. As far as the CE devices are concerned, they are simply plugged into one ‘big switch’.

LDP

I’ll concentrate on the PE R3 for this example. We first need to let the router know that the interface pointing towards T1 will in fact be a VPLS interface:

darreno@M7i> show configuration interfaces fe-0/0/1
encapsulation ethernet-vpls;
unit 0;

Our regular RSVP MPLS config, nothing special. Note that LDP is enabled only on the loopback interface: LDP-signalled VPLS uses targeted LDP sessions between the PE loopbacks to exchange the VC labels, while RSVP provides the transport labels:

darreno@M7i> show configuration protocols
rsvp {
    interface all;
}
mpls {
    label-switched-path TO-R6 {
        to 6.6.6.6;
        no-cspf;
    }
    label-switched-path TO-R7 {
        to 7.7.7.7;
        no-cspf;
    }
    interface all;
}
ospf {
    traffic-engineering;
    area 0.0.0.0 {
        interface all;
    }
}
ldp {
    interface lo0.3;
}

Finally, the LDP VPLS config itself. As there is no auto-discovery, you need to let Junos know which other PE routers are participating in this VPLS:

darreno@M7i> show configuration routing-instances
VPLS1 {
    instance-type vpls;
    interface fe-0/0/1.0;
    protocols {
        vpls {
            vpls-id 1;
            neighbor 6.6.6.6;
            neighbor 7.7.7.7;
        }
    }
}
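
R6 and R7 need matching instances, each listing the other two PEs as neighbours. R6’s would look roughly like this (its CE-facing interface name is my assumption):

VPLS1 {
    instance-type vpls;
    /* CE-facing interface, name assumed */
    interface fe-0/0/2.0;
    protocols {
        vpls {
            vpls-id 1;
            neighbor 3.3.3.3;
            neighbor 7.7.7.7;
        }
    }
}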

With matching configs in place on R6 and R7, let’s take a look at the network from T1’s perspective:

USERT1@M7i:T1> show ospf neighbor
Address          Interface              State     ID               Pri  Dead
192.168.0.2      fe-0/1/0.0             Full      12.12.12.12      128    37
192.168.0.3      fe-0/1/0.0             Full      14.14.14.14      128    36

USERT1@M7i:T1> show route protocol ospf

inet.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

12.12.12.12/32     *[OSPF/10] 00:05:43, metric 1
                    > to 192.168.0.2 via fe-0/1/0.0
14.14.14.14/32     *[OSPF/10] 00:27:15, metric 1
                    > to 192.168.0.3 via fe-0/1/0.0
224.0.0.5/32       *[OSPF/10] 2d 06:44:41, metric 1
                      MultiRecv

USERT1@M7i:T1> ping 12.12.12.12 rapid count 30
PING 12.12.12.12 (12.12.12.12): 56 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--- 12.12.12.12 ping statistics ---
30 packets transmitted, 30 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.085/1.338/6.357/0.935 ms

USERT1@M7i:T1> ping 14.14.14.14 rapid count 30
PING 14.14.14.14 (14.14.14.14): 56 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--- 14.14.14.14 ping statistics ---
30 packets transmitted, 30 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.081/1.496/11.077/1.784 ms

T1 considers T2 and C2 to be directly connected at layer 2. There are OSPF adjacencies between all three, and routes are learned. The data plane is also functioning correctly.

BGP

Let’s turn our attention now to BGP. There are a number of advantages to using it, especially if you already run BGP in the SP network. A separate address family not only advertises VC labels between PE routers, it also allows PE routers to auto-discover any other PE configured in the same VPLS.

I’ll keep the interface config the same as above. You may notice that there is more configuration for the BGP version, but in the long run there is less, as the same BGP session serves every VPLS instance on the PE.

Let’s start with our BGP config:

darreno@M7i> show configuration routing-options autonomous-system
100;

darreno@M7i> show configuration protocols bgp
group iBGP {
    local-address 3.3.3.3;
    family l2vpn {
        signaling;
    }
    peer-as 100;
    neighbor 6.6.6.6;
    neighbor 7.7.7.7;
}

The BGP VPLS config is slightly different. We now have site-identifiers, but no manual neighbour config. As with an L3VPN setup, we need both an RD and RTs configured.

darreno@M7i> show configuration routing-instances
VPLS1 {
    instance-type vpls;
    interface fe-0/0/1.0;
    route-distinguisher 100:200;
    vrf-target target:100:200;
    protocols {
        vpls {
            site T1 {
                site-identifier 1;
            }
        }
    }
}
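
R6 and R7 get the same instance with the same route-distinguisher and vrf-target; the only per-PE difference is the site block, as each site needs a unique identifier, e.g. (site name and number assumed):

            site T2 {
                site-identifier 2;
            }

BGP then advertises each site to the other PEs, which is what removes the need for a manual neighbour list.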

We test from our CE once again:

USERT1@M7i:T1> show ospf neighbor
Address          Interface              State     ID               Pri  Dead
192.168.0.2      fe-0/1/0.0             Full      12.12.12.12      128    34
192.168.0.3      fe-0/1/0.0             Full      14.14.14.14      128    36

USERT1@M7i:T1> show route protocol ospf

inet.0: 9 destinations, 9 routes (9 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

12.12.12.12/32     *[OSPF/10] 00:03:34, metric 1
                    > to 192.168.0.2 via fe-0/1/0.0
14.14.14.14/32     *[OSPF/10] 00:04:26, metric 1
                    > to 192.168.0.3 via fe-0/1/0.0
224.0.0.5/32       *[OSPF/10] 2d 07:00:30, metric 1
                      MultiRecv

USERT1@M7i:T1> ping 12.12.12.12 rapid count 30
PING 12.12.12.12 (12.12.12.12): 56 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--- 12.12.12.12 ping statistics ---
30 packets transmitted, 30 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.061/1.480/10.779/1.728 ms

USERT1@M7i:T1> ping 14.14.14.14 rapid count 30
PING 14.14.14.14 (14.14.14.14): 56 data bytes
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--- 14.14.14.14 ping statistics ---
30 packets transmitted, 30 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.079/1.183/1.394/0.088 ms

LDP & BGP

There is another way to get this to work: use BGP for auto-discovery while letting LDP advertise the VC labels. This is the way Brocade NetIron boxes do it, and interop is the only reason I would choose this approach. If you have BGP running already, why not just let it do both discovery and VC label assignment?

The configuration on PE R3 has been changed as follows:

darreno@M7i> show configuration protocols bgp
group iBGP {
    local-address 3.3.3.3;
    family l2vpn {
        auto-discovery-only;
    }
    peer-as 100;
    neighbor 6.6.6.6;
    neighbor 7.7.7.7;
}

darreno@M7i> show configuration routing-instances
VPLS1 {
    instance-type vpls;
    interface fe-0/0/1.0;
    route-distinguisher 100:200;
    l2vpn-id l2vpn-id:100:200;
    vrf-target target:100:200;
    protocols {
        vpls;
    }
}

CE-CE connectivity has been tested as above with no issues at all.
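
In all three variants, show vpls connections on the PEs is also a handy way to confirm the pseudowires themselves are up, independent of the CE routing.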