Odd behaviour of down interfaces and OSPF on Brocade NetIron

I ran across an odd problem on a Brocade XMR recently. I had created a static route and redistributed it into OSPF, but ended up with a traffic loop. I managed to figure out why this was happening, but the behaviour I saw should not occur. Note that I’ve changed the first octet to 10 in all the addresses below.

Consider this network:

R1, R2, and R3 are on OSPF area 0. R4 has the prefix 10.46.204.0/29 sitting behind it. R3 has a static route to this prefix via R4 and that route is getting redistributed into OSPF. R4 has a static default to R3.

My PC is connected to the core network, so let’s do a traceroute:

H:\>tracert -d 10.46.204.1

Tracing route to 10.46.204.1 over a maximum of 30 hops

  1    <1 ms    <1 ms     5 ms  10.196.226.126
  2    <1 ms    <1 ms    <1 ms  10.255.1.1
  3     1 ms     1 ms     1 ms  10.255.0.20
  4     1 ms     1 ms     1 ms  10.71.16.198
  5     1 ms     1 ms     1 ms  10.248.31.29
  6     1 ms     1 ms     1 ms  10.248.31.10
  7     1 ms     1 ms     1 ms  10.248.31.2
  8     8 ms     1 ms     1 ms  10.248.31.61
  9     2 ms     1 ms     1 ms  10.248.31.17
 10     2 ms     1 ms     1 ms  10.248.31.16
 11     2 ms     2 ms     2 ms  10.248.31.17
 12     2 ms    12 ms     2 ms  10.248.31.16
 13     2 ms     2 ms     2 ms  10.248.31.17
 14     2 ms    15 ms     4 ms  10.248.31.16
 15     3 ms     3 ms    14 ms  10.248.31.17
 16    14 ms     4 ms     3 ms  10.248.31.16
 17     3 ms    14 ms     3 ms  10.248.31.17
 18     4 ms     3 ms    14 ms  10.248.31.16
 19    14 ms     4 ms     3 ms  10.248.31.17
 20     4 ms    13 ms     5 ms  10.248.31.16
 21     5 ms     4 ms    14 ms  10.248.31.17
 22    14 ms     5 ms     4 ms  10.248.31.16
 23    15 ms     5 ms     4 ms  10.248.31.17
 24     4 ms    14 ms     6 ms  10.248.31.16
 25     6 ms    17 ms     5 ms  10.248.31.17
 26     6 ms    16 ms     6 ms  10.248.31.16
 27     6 ms    16 ms     6 ms  10.248.31.17
 28     6 ms    16 ms     6 ms  10.248.31.16
 29     7 ms    17 ms     6 ms  10.248.31.17
 30     6 ms    16 ms     7 ms  10.248.31.16

Trace complete.

Clearly we have a problem.

Troubleshooting

Let's look at each of the three routers' view of that route.

R1:

[email protected]#sh ip route 10.46.204.1
Type Codes - B:BGP D:Connected I:ISIS O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP  Codes - i:iBGP e:eBGP
ISIS Codes - L1:Level-1 L2:Level-2
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2 s:Sham Link
STATIC Codes - d:DHCPv6
        Destination        Gateway         Port          Cost          Type Uptime src-vrf
1       10.46.204.0/29     10.248.31.17    ve 62         110/10        O2   34m19s -

R2:

[email protected]#sh ip route 10.46.204.1
Type Codes - B:BGP D:Connected I:ISIS O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP  Codes - i:iBGP e:eBGP
ISIS Codes - L1:Level-1 L2:Level-2
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2 s:Sham Link
STATIC Codes - d:DHCPv6
        Destination        Gateway         Port          Cost          Type Uptime src-vrf
1       0.0.0.0/0          10.248.31.16    ve 62         110/210       O1   1h7m   -

R3:

[email protected]#sh ip route 10.46.204.1
Type Codes - B:BGP D:Connected I:ISIS O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP  Codes - i:iBGP e:eBGP
ISIS Codes - L1:Level-1 L2:Level-2
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2 s:Sham Link
STATIC Codes - d:DHCPv6
        Destination        Gateway         Port          Cost          Type Uptime src-vrf
1       10.46.204.0/29     10.248.31.22    eth 2/1       1/1           S    1d2h   -

R2 is clearly wrong. It doesn't have the route in its table and is therefore following the default route back to R1.
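To make the loop concrete, here is a minimal Python sketch of longest-prefix matching over just the entries shown in the outputs above. It assumes 10.248.31.17 is R2 and 10.248.31.16 is R1, which is what the gateways suggest; the `TABLES` dictionary and function names are made up for illustration and have nothing to do with how the routers store routes.

```python
import ipaddress

# Routing tables reduced to the entries shown in the outputs above.
# Assumption: 10.248.31.17 belongs to R2 and 10.248.31.16 to R1.
TABLES = {
    "R1": {"10.46.204.0/29": "R2"},  # O2 route via 10.248.31.17
    "R2": {"0.0.0.0/0": "R1"},       # only the default matches, via 10.248.31.16
}

def next_hop(router, destination):
    """Return the next-hop router chosen by longest-prefix match, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in TABLES[router].items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific (longest) prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def trace(start, destination, max_hops=6):
    """Follow next hops from `start` and return the routers visited."""
    path, router = [start], start
    for _ in range(max_hops):
        router = next_hop(router, destination)
        if router is None:
            break
        path.append(router)
    return path

print(trace("R1", "10.46.204.1"))
```

Because R2 only matches the default route, the path just alternates between R1 and R2 until the hop limit is reached, which is the pattern in the traceroute from hop 9 onwards.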

Let's check the OSPF LSA for this prefix:

[email protected]#sh ip ospf database external-link-state link-state-id 10.46.204.0
Ospf ext link-state by link-state ID 10.46.204.0 are in the following:

Type-5 AS External Link States

Index Age  LS ID           Router          Netmask  Metric   Flag Fwd Address   SyncState
444   1387 10.46.204.0     10.196.224.61  fffffff8 0000000a 0000 10.248.31.22   Done
  LSA Header:  age: 1387, options: 0x02, seq-nbr: 0x80000067, length: 36
  NetworkMask: 255.255.255.248
  TOS 0:  metric_type: 2, metric: 10
          forwarding_address: 10.248.31.22
          external_route_tag: 0
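
The summary line prints the netmask and metric in hex. As a quick sanity check, this small Python sketch converts those two values and should agree with the decoded fields underneath (255.255.255.248 and a metric of 10). The variable names are only for illustration.

```python
import ipaddress

# Hex fields copied from the summary line of the LSA output.
netmask_hex = "fffffff8"
metric_hex = "0000000a"

# Convert the 32-bit mask to dotted-quad form, then to a prefix length.
netmask = ipaddress.ip_address(int(netmask_hex, 16))
prefixlen = ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen
metric = int(metric_hex, 16)

print(netmask, prefixlen, metric)
```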

I don't see anything wrong here. The metric is valid. The forwarding address is also reachable. Let's check the route to the forwarding address on R2:

[email protected]#sh ip route 10.248.31.22
Type Codes - B:BGP D:Connected I:ISIS O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP  Codes - i:iBGP e:eBGP
ISIS Codes - L1:Level-1 L2:Level-2
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2 s:Sham Link
STATIC Codes - d:DHCPv6
        Destination        Gateway         Port          Cost          Type Uptime src-vrf
1       10.248.31.22/31    10.248.31.21    ve 5          110/200       O    1h59m  -

It's reachable over the directly connected interface which is valid. So what's going on?

After taking a look around I saw that 10.248.31.22 was configured on a local interface on R2. However, that interface was down. If an interface is down, the router should not treat its address as active. In fact, the show ip route output above shows that the router knows 10.248.31.22 is a remote address.

This is the local interface on R2 in question:

[email protected]#sh run int eth 2/1
interface ethernet 2/1
 port-name R3 eth2/1
 enable
 route-only
 ip ospf area 0
 ip ospf network point-to-point
 ip address 10.248.31.22/31

[email protected]#sh int eth 2/1 | include protocol
GigabitEthernet2/1 is down, line protocol is down

I tested the exact same config in both Junos and IOS, and neither system has this problem. It doesn't matter if the forwarding address is a locally configured address as long as that address is not active.
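
I can't see inside the XMR, so the following is only a guess at the difference, written as a small Python sketch. It assumes the router refuses a type 5 LSA whose forwarding address it considers its own, and that the only question is whether "its own" means configured or configured and up. The `Interface` class, both functions and the `require_up` flag are hypothetical and are not taken from any vendor's code or documentation.

```python
from dataclasses import dataclass

@dataclass
class Interface:
    name: str
    address: str  # configured IP address, without the mask
    up: bool      # line protocol state

def is_local(address, interfaces, require_up):
    """True if `address` counts as one of the router's own addresses."""
    return any(
        i.address == address and (i.up or not require_up)
        for i in interfaces
    )

def accepts_external(fwd_address, interfaces, require_up):
    """Hypothetical check: reject an LSA whose forwarding address is local."""
    return not is_local(fwd_address, interfaces, require_up)

# R2's interface from the output above: configured with the address, but down.
r2 = [Interface("eth 2/1", "10.248.31.22", up=False)]

# Only active addresses count as local (what I expected).
print(accepts_external("10.248.31.22", r2, require_up=True))
# Any configured address counts as local (what the XMR appeared to do).
print(accepts_external("10.248.31.22", r2, require_up=False))
```

With `require_up=True` the LSA is accepted because the down interface is ignored. With `require_up=False` it is rejected, which matches the missing route on R2.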

OSPF forwarding address

Why does OSPF set the forwarding address anyway? Most type 5 LSAs will have a forwarding address of 0.0.0.0, which effectively means that packets for the destination should be forwarded to the ASBR that originated the type 5. An ASBR sets the forwarding address to a non-zero value only in some very specific circumstances, which are described on this site: http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a008009405a.shtml?referring_site=bodynav

At least on IOS, this is what Cisco is saying:

The forwarding address is set to 0.0.0.0 if the ASBR redistributes routes and OSPF is not enabled on the next hop interface for those routes. This is true in the figure if Router 1 does not have OSPF enabled on the Ethernet interface.

These conditions set the forwarding address field to a non-zero address:

OSPF is enabled on the ASBR's next hop interface AND

ASBR's next hop interface is non-passive under OSPF AND

ASBR's next hop interface is not point-to-point AND

ASBR's next hop interface is not point-to-multipoint AND

ASBR's next hop interface address falls under the network range specified in the router ospf command.

Any other conditions besides these set the forwarding address to 0.0.0.0.
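
The conditions quoted above can be written as a single check. This is a minimal Python sketch of the IOS rules as Cisco states them, not a claim about how IOS or NetIron implements them. The `NextHopInterface` class, the function name, the 10.248.31.23 interface address and the 10.248.31.0/24 network range are all assumptions made up for the example.

```python
from dataclasses import dataclass
import ipaddress

@dataclass
class NextHopInterface:
    address: str       # ASBR's own address on the next-hop interface
    ospf_enabled: bool
    passive: bool
    network_type: str  # e.g. "broadcast", "point-to-point", "point-to-multipoint"

def forwarding_address(iface, next_hop, ospf_networks):
    """Return the forwarding address an ASBR would set under the quoted rules."""
    addr = ipaddress.ip_address(iface.address)
    in_range = any(addr in ipaddress.ip_network(n) for n in ospf_networks)
    if (
        iface.ospf_enabled
        and not iface.passive
        and iface.network_type not in ("point-to-point", "point-to-multipoint")
        and in_range
    ):
        return next_hop  # non-zero: the external route's next hop
    return "0.0.0.0"     # any other combination

networks = ["10.248.31.0/24"]  # assumed range for the example
broadcast = NextHopInterface("10.248.31.23", True, False, "broadcast")
passive = NextHopInterface("10.248.31.23", True, True, "broadcast")

print(forwarding_address(broadcast, "10.248.31.22", networks))
print(forwarding_address(passive, "10.248.31.22", networks))
```

Under these rules only the first call returns the next hop; a passive interface falls through to 0.0.0.0.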

R3's link to R4 in my topology does indeed have OSPF enabled, but the interface is passive. Brocade doesn't seem to have thorough documentation on this, and its behaviour is evidently slightly different to Cisco's. Let's remove OSPF from the interface and check the type 5 LSA again:

[email protected]#conf t
[email protected](config)#int eth 2/1
[email protected](config-if-e1000-2/1)#no ip ospf area 0
[email protected]#sh ip ospf database external-link-state link-state-id 10.46.204.0
Ospf ext link-state by link-state ID 10.46.204.0 are in the following:

Type-5 AS External Link States

Index Age  LS ID           Router          Netmask  Metric   Flag Fwd Address   SyncState
444   15   10.46.204.0     10.196.224.61  fffffff8 0000000a 0000 0.0.0.0        Done
  LSA Header:  age: 15, options: 0x02, seq-nbr: 0x80000069, length: 36
  NetworkMask: 255.255.255.248
  TOS 0:  metric_type: 2, metric: 10
          forwarding_address: 0.0.0.0
          external_route_tag: 0

The forwarding address has changed, which should mean the route is now installed:

[email protected]#sh ip route 10.46.204.0
Type Codes - B:BGP D:Connected I:ISIS O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP  Codes - i:iBGP e:eBGP
ISIS Codes - L1:Level-1 L2:Level-2
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2 s:Sham Link
STATIC Codes - d:DHCPv6
        Destination        Gateway         Port          Cost          Type Uptime src-vrf
1       10.46.204.0/29     10.248.31.21    ve 5          110/10        O2   0m48s  -

Conclusion

What should matter is the running state of a device, not just its configuration. Whether a forwarding address is also configured on a local interface should be irrelevant if that interface is not active. Here the interface configured with the identical address was not up, and I even hard shut it, but that made no difference.

Either way, it's an interesting quirk to know.

© 2009-2019 Darren O'Connor All Rights Reserved