OSPF Fast Re-Route and BFD on Junos

One of the few advantages that EIGRP had over OSPF and IS-IS was feasible successors. That is, the router had already pre-calculated a backup, loop-free path to a destination.

OSPF and IS-IS have had an equivalent for some time now on both IOS and Junos. It’s also supported on IOS-XR.

This post will mainly go over OSPF. The process is nearly identical for IS-IS.

To start I’ll be using the following topology:

R3 has two links to R4. Both go through a switch, which lets us break a link without taking the router interface down. I’m configuring a cost of 100 on the first link and 1000 on the second as I don’t want to bring ECMP into play for this post.

How does a router know its neighbour is down? If the interface goes down, detection is quick. If the interface stays up but something along the path is dropping packets, the router will take quite a long time to notice.

If we leave OSPF to its defaults, it could be 40 seconds before R3 realises it cannot get to R4 over their primary interface (Standard dead timer on broadcast links). Until that happens R3 will be sending packets into the void.
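To put numbers on that, here is a rough Python sketch of the default timers. It assumes the standard broadcast-link defaults (hello every 10 seconds, dead interval of four hellos); the function names are mine and are only for illustration.

```python
# Rough sketch of default OSPF failure detection on a broadcast link.
# Assumes the usual defaults: hello = 10 s, dead interval = 4 x hello.
DEFAULT_HELLO_SECONDS = 10
DEAD_MULTIPLIER = 4


def dead_interval(hello_seconds=DEFAULT_HELLO_SECONDS):
    """Seconds without a hello before the neighbour is declared dead."""
    return hello_seconds * DEAD_MULTIPLIER


def blackhole_window(hello_seconds=DEFAULT_HELLO_SECONDS):
    """(best, worst) seconds of traffic loss after a silent failure.

    The dead timer restarts on every hello received, so the loss depends
    on how long before the failure the last hello arrived.
    """
    dead = dead_interval(hello_seconds)
    return dead - hello_seconds, dead


if __name__ == "__main__":
    print(dead_interval())
    print(blackhole_window())
```

So with defaults, a silent failure means somewhere between roughly 30 and 40 seconds of dropped traffic, depending on when the last hello got through.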

I’ll set up standard OSPF on all interfaces. From R2 I’ll be sending pings to R5’s loopback. Both links between R3 and R4 are tagged interfaces in different VLANs. On the switch I can simply remove VLAN 24, which causes packets on that VLAN to be dropped.

OSPF – No tweaking

Standard OSPF here with no tweaks. I’ll be showing R3’s config here:

[email protected]> show configuration protocols ospf
area 0.0.0.0 {
    interface lo0.3;
    interface fe-0/1/4.24 {
        metric 100;
    }
    interface fe-0/1/5.35 {
        metric 1000;
    }
}

I’ll now initiate a ping flood from R2 to R5. Once that starts I’ll remove vlan 24 from the switch.

Let’s see how the ping flood goes:

!!!.....................................................................!!!

Not very good at all!
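If you want to compare runs by numbers rather than by eye, a tiny Python helper can tally the ping output. This is only a sketch: it assumes the usual convention where each "!" is a reply and each "." is a timeout, and the function name is hypothetical.

```python
def summarise_ping(trace):
    """Count replies and timeouts in a flood/rapid ping trace.

    Assumes '!' marks a received reply and '.' a lost packet.
    Returns (received, lost).
    """
    received = trace.count("!")
    lost = trace.count(".")
    return received, lost


if __name__ == "__main__":
    # Paste a trace from the router here to get the counts.
    received, lost = summarise_ping("!!!!.!!!")
    print(f"received={received} lost={lost}")
```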

OSPF – BFD

Let’s add BFD to the OSPF session on both R3 and R4:

[email protected]> show configuration protocols ospf
area 0.0.0.0 {
    interface all;
    interface lo0.3;
    interface fe-0/1/4.24 {
        metric 100;
        bfd-liveness-detection {
            minimum-interval 50;
            minimum-receive-interval 30;
            multiplier 3;
        }
    }
    interface fe-0/1/5.35 {
        metric 1000;
        bfd-liveness-detection {
            minimum-interval 50;
            minimum-receive-interval 30;
            multiplier 3;
        }
    }
}
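As a rough guide to what these timers buy us, here is a Python sketch of the asynchronous-mode detection time from RFC 5880: the remote multiplier times the larger of the local receive interval and the remote transmit interval. I am assuming that minimum-interval 50 gives a 50 ms transmit interval and that minimum-receive-interval 30 overrides the receive side; treat that mapping onto the Junos knobs as my assumption rather than something verified here.

```python
def bfd_detection_time_ms(local_min_rx_ms, remote_min_tx_ms, remote_multiplier):
    """Detection time on the local router, in milliseconds.

    The remote end sends no faster than the local receive interval allows,
    so the effective interval is the larger of the two values.
    """
    negotiated_interval = max(local_min_rx_ms, remote_min_tx_ms)
    return remote_multiplier * negotiated_interval


if __name__ == "__main__":
    # Both routers carry the same config: rx 30 ms, tx 50 ms, multiplier 3.
    print(bfd_detection_time_ms(local_min_rx_ms=30,
                                remote_min_tx_ms=50,
                                remote_multiplier=3))
```

Under those assumptions a failure should be noticed in about 150 ms, compared with up to 40 seconds for the default dead timer.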

Do the same test as above.

!!!!.!!!

Much, much better. Note that this is a very small topology though, so LSAs are very quick to flood. In a larger topology, especially one that spans geographic regions, it could take much longer for the new route to be calculated.

OSPF – BFD & FRR

Now I’ll add FRR to OSPF on R3. I’ll protect the fe-0/1/4.24 link from R3’s point of view. R3 will run SPF for all its destinations through that interface and work out whether it can reach each destination through another interface without the traffic being looped back. In this simple topology any traffic sent over the higher metric interface to R4 will still get to R5, as R4 will not send it back.
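The "will not send it back" condition is the basic loop-free inequality from RFC 5286: a neighbour N is a safe backup for destination D if N's own distance to D is less than N's distance back to us plus our distance to D. Here is a minimal Python sketch of that check, using the metrics R3 reports for 5.5.5.5 further down (neighbour to destination 1, neighbour to self 1, self to destination 101). The function name is mine, and this is not how Junos implements it internally.

```python
def is_loop_free(neigh_to_dest, neigh_to_self, self_to_dest):
    """Basic loop-free alternate condition (RFC 5286, inequality 1).

    True when the neighbour's shortest path to the destination cannot
    run back through the router doing the calculation.
    """
    return neigh_to_dest < neigh_to_self + self_to_dest


if __name__ == "__main__":
    # R4 as a backup neighbour for 5.5.5.5, seen from R3.
    print(is_loop_free(neigh_to_dest=1, neigh_to_self=1, self_to_dest=101))
    # Hypothetical neighbour whose best path to the destination is via us.
    print(is_loop_free(neigh_to_dest=102, neigh_to_self=1, self_to_dest=101))
```

The first check passes because 1 is less than 1 + 101. The second, made-up neighbour fails it, so it would not be offered as a backup.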

First we enable link-protection:

[email protected]> show configuration protocols ospf area 0 interface fe-0/1/4.24
link-protection;
metric 100;
bfd-liveness-detection {
    minimum-interval 50;
    minimum-receive-interval 30;
    multiplier 3;
}

Junos will pre-calculate the backup routes, but it will NOT add them to the FIB by default. You have to allow more than one next-hop in the FIB:

[email protected]> show configuration policy-options policy-statement BALANCE
then {
    load-balance per-packet;
}

[email protected]> show configuration routing-options forwarding-table
export BALANCE;

Let’s run the same test as above again:

!!!!!!!!!!!!!!!!!!!!!!

I’m simply not losing any packets at all. The difference between BFD alone and BFD with link-protection is most pronounced on much larger topologies. Remember, FRR is a router making a quick local repair to get packets from A to B while an alternative regular route is calculated.

You can see that enabling FRR is a piece of cake. To verify you need to dig a little deeper. First let’s see the FRR coverage on R3:

[email protected]> show ospf backup coverage
Topology default coverage:

Node Coverage:

Area             Covered  Total  Percent
                   Nodes  Nodes  Covered
0.0.0.0                2      3   66.67%

Route Coverage:

Path Type  Covered   Total  Percent
            Routes  Routes  Covered
Intra            5      11   45.45%
Inter            0       0  100.00%
Ext1             0       0  100.00%
Ext2             0       0  100.00%
All              5      11   45.45%
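The percentages are simply covered divided by total. The short Python sketch below reproduces the figures in this output; the function name is hypothetical, and treating an empty category as 100% only mirrors what the output shows for the Inter, Ext1 and Ext2 rows.

```python
def coverage_percent(covered, total):
    """Backup coverage as a percentage, rounded to two decimal places.

    An empty category (total of 0) is reported as fully covered, which
    matches the 0 / 0 rows in the output.
    """
    if total == 0:
        return 100.0
    return round(100.0 * covered / total, 2)


if __name__ == "__main__":
    print(coverage_percent(2, 3))    # node coverage in area 0.0.0.0
    print(coverage_percent(5, 11))   # intra-area route coverage
    print(coverage_percent(0, 0))    # empty category
```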

Not every prefix can be covered, as coverage is quite topology dependent. If we look at the detail for 5.5.5.5 specifically:

[email protected]> show ospf backup spf detail | find 5.5.5.5
5.5.5.5
  Self to Destination Metric: 101
  Parent Node: 10.0.8.10
  Primary next-hop: fe-0/1/4.24 via 10.0.24.4
  Backup next-hop: fe-0/1/5.35 via 10.0.35.4
  Backup Neighbor: 4.4.4.4
    Neighbor to Destination Metric: 1, Neighbor to Self Metric: 1
    Self to Neighbor Metric: 100, Backup preference: 0x0
    Eligible, Reason: Contributes backup next-hop

Here we see that fe-0/1/4.24 is the primary and fe-0/1/5.35 is the backup. The backup is also eligible. If we take a look at the route itself:

[email protected]> show route 5.5.5.5

inet.0: 24 destinations, 25 routes (24 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

5.5.5.5/32         *[OSPF/10] 00:03:15, metric 101
                    > to 10.0.24.4 via fe-0/1/4.24
                      to 10.0.35.4 via fe-0/1/5.35

Both next hops are there, but only the first (marked with >) will be used until it fails.

Finally we can take a look at the FIB entry:

[email protected]> show route forwarding-table destination 5.5.5.5
Routing table: default.inet
Internet:
Destination        Type RtRef Next hop           Type Index NhRef Netif
5.5.5.5/32         user     1                    ulst 262142     5
                              10.0.24.4          ucst  1303     2 fe-0/1/4.24
                              10.0.35.4          ucst  1304     2 fe-0/1/5.35

The backup next hop is already programmed and ready to take over as soon as the primary fails.