Category Archives: CCIE

When a vlan is not a vlan

What is a vlan? What is a vlan-id? Are they the same thing?

Generally yes, but in the ISP world a vlan-id can also be a circuit identifier. While your view of a vlan might be a single broadcast domain, you’ll soon see that multiple vlan IDs can share the same single broadcast domain, or the same vlan-id could be in a completely different broadcast domain.

The Problem

I’ve written about this before. Carriers, at least in the UK, are offering more and more aggregated links to Service Providers. Each circuit to customer sites is aggregated over a single high-bandwidth link to your PE router. This cuts down on ports, cables, and man hours to plug them in.

Old way:

[Image: carrier old]

New way:

[Image: carrier new]
How are the p2p circuits aggregated over the core high-bandwidth link? Each p2p link is separated by a vlan tag on the PoP side. So we could say that any packet coming out of the core PE with vlan 2000 goes to site 1, while packets with vlan 3000 go to site 2. What happens if site 1 and site 2 are going to the same customer? What if you are providing a VPLS service to them? It’s essential to note that the vlan tag imposed by the carrier is used simply to determine what packet goes to which circuit. As we control the MPLS core, it’s ultimately up to us to decide which packet belongs in which broadcast domain, and that is regardless of the vlan id used by the carrier.

Relevant Initial Core Config

I’ll use the following topology:
[Image: vlans core]

R1, R2, and R3 are the core of the network. R1 is a Brocade Netiron running MPLS. R2 is a Cisco me3600x running MPLS. R3 is an me3600x running bridge-groups with no MPLS.

CE1, CE2, and CE3 are all customer routers.

R1 – Brocade XMR

interface ethernet 2/4
 port-name TO-R2
 enable
 route-only
 ip ospf area 0
 ip ospf network point-to-point
 ip address 10.10.10.10/24
!
router mpls
 policy
  traffic-eng ospf area 0

  mpls-interface e2/4

 lsp R1-R2
  to 192.168.224.4
  adaptive
  enable

R2 – Cisco me3600x running MPLS

mpls traffic-eng tunnels
!
router ospf 1
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0
!
interface GigabitEthernet0/1
 description TO-R1
 no switchport
 ip address 10.10.10.11 255.255.255.0
 ip ospf network point-to-point
 ip ospf 1 area 0
 mpls traffic-eng tunnels
!
interface Tunnel0
 ip unnumbered Loopback0
 tunnel mode mpls traffic-eng
 tunnel destination 192.168.224.61
 tunnel mpls traffic-eng autoroute announce
 tunnel mpls traffic-eng path-option 5 dynamic
 tunnel mpls traffic-eng record-route

There is no IP or MPLS configuration on R3 as it’s not running MPLS. I’ll show how the bridge-group is configured when I get to that part.

CPE Config

I’ll be using vlan 3000 to get to CE1, vlan 2000 to get to CE2, and double-tag vlan 3500,2500 to get to CE3. Each CE has its WAN interface in the same subnet as the others. I’ll also enable OSPF on their loopbacks and WAN links.

CE1

This is a Juniper EX3200:

root@CE1> show configuration interfaces ge-0/0/0
vlan-tagging;
unit 3000 {
    vlan-id 3000;
    family inet {
        address 1.1.1.1/24;
    }
}

root@CE1> show configuration interfaces lo0.0
family inet {
    address 10.10.10.10/32;
}

root@CE1> show configuration protocols ospf
area 0.0.0.0 {
    interface ge-0/0/0.3000;
    interface lo0.0;
}

CE2

This is a Cisco 3750G:

interface Loopback0
 ip address 20.20.20.20 255.255.255.255
 ip ospf 1 area 0
!
interface Vlan2000
 ip address 1.1.1.2 255.255.255.0
 ip ospf 1 area 0
!
interface GigabitEthernet1/0/1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2000
 switchport mode trunk

CE3

This is a Cisco 1841:

interface Loopback0
 ip address 30.30.30.30 255.255.255.255
 ip ospf 1 area 0
!
interface FastEthernet0/0.32
 encapsulation dot1Q 3500 second-dot1q 2500
 ip address 1.1.1.3 255.255.255.0
 ip ospf 1 area 0

VPLS Config

As you can see, each CPE will be using a different vlan tag. One site is even sending a double-tagged frame. They all need to be in the same broadcast domain. No problem as we are simply going to use the vlan tag to determine the service.

R2

Gi0/2 will create an LDP-signalled VPLS VC to R1 (aka manual set up). Interface gi0/2 vlan 2000 will be part of VPLS id 501:

ethernet evc TEST-EVC
 uni count 20
!
l2vpn vfi context TEST-VPLS
 vpn id 501
 member 192.168.224.61 encapsulation mpls
!
interface GigabitEthernet0/2
 switchport trunk allowed vlan none
 switchport mode trunk
 mtu 9800
 service instance 1 ethernet TEST-EVC
  encapsulation dot1q 2000
  rewrite ingress tag pop 1 symmetric
  bridge-domain 501
 !
interface Vlan501
 no ip address
 member vfi TEST-VPLS

What’s important to note here is that the me3600x still uses bridge-groups for VPLS, but it’s not exactly the same as just using bridge-groups by itself. You’ll see this soon enough when we configure R3.

R1

R1 will create a VPLS to R2. Vlan 3000 on interface 2/5 will be part of the same VPLS:

router mpls
 vpls TEST-VPLS 501
  vpls-peer 192.168.224.4
  vpls-mtu 1500
  vlan 3000
   tagged ethe 2/5

At this point R1 and R2 have the VPLS set up between them. Each CE is using different vlans on their WAN, but they are in fact on the same broadcast domain:

CE2#sh ip ospf neighbor

Neighbor ID     Pri   State           Dead Time   Address         Interface
1.1.1.1         128   FULL/DR         00:00:39    1.1.1.1         Vlan2000

CE2#ping 1.1.1.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/5/17 ms

CE2#ping 10.10.10.10 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.10.10, timeout is 2 seconds:
Packet sent with a source address of 20.20.20.20
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/9 ms

The vlan-id used on the CPE was merely used to push the frame into the correct VPLS. The VPLS itself is the broadcast domain; the vlan tag is irrelevant as it’s stripped on inbound into the PE router. You CAN, however, ensure that the PE router does NOT strip the vlan tag. This has interesting use cases when you purposely want to separate on vlan id within the VPLS. I wrote more on this over here so please give it a read. Both the Brocade and Cisco default to VC mode 5 when setting up a VPLS.
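To make the "tag as circuit identifier" idea concrete, here is a minimal Python sketch of what each PE is doing conceptually. The table and the `classify` helper are my own illustration built from the configs above, not anything the routers expose; the port names are just labels.

```python
# Hypothetical attachment-circuit table taken from the R1 and R2 configs:
# (PE, port, outer vlan tag) -> VPLS id. Names are illustrative only.
ATTACHMENT_CIRCUITS = {
    ("R1", "ethe 2/5", 3000): 501,  # towards CE1
    ("R2", "Gi0/2", 2000): 501,     # towards CE2
}


def classify(pe, port, frame):
    """Return (vpls_id, frame without the service tag), or None if no match."""
    tags = frame["tags"]
    if not tags:
        return None
    vpls_id = ATTACHMENT_CIRCUITS.get((pe, port, tags[0]))
    if vpls_id is None:
        return None
    # The tag only selected the service, so it is stripped before the
    # frame enters the VPLS.
    stripped = {"tags": tags[1:], "payload": frame["payload"]}
    return vpls_id, stripped


from_ce1 = classify("R1", "ethe 2/5", {"tags": [3000], "payload": "ospf hello"})
from_ce2 = classify("R2", "Gi0/2", {"tags": [2000], "payload": "ospf hello"})
print(from_ce1)
print(from_ce2)
```

Both frames end up in VPLS 501 with no vlan tag left on them, which is why CE1 and CE2 see each other as directly connected neighbours even though their WAN vlans differ.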

Bridge Group Config

I’m going to set up R3 so that it only uses bridge-groups. No routing or MPLS involved. Bridge-groups work very similarly to VPLS, though on a single box. Traffic can be pushed from a bridge-group into a VPLS if needed. The bridge-group determines the broadcast domain. I can have multiple different vlans in the same bridge-group.

For R3, gi0/1 is the interface pointing towards the core, while gi0/2 is pointing towards the customer. I’ll use different vlan ids on each, but they will be in the same bridge-group:

ethernet evc TEST
!
vlan 501
 name TEST-CE
!
interface GigabitEthernet0/1
 switchport trunk allowed vlan none
 switchport mode trunk
 service instance 1 ethernet TEST
  encapsulation dot1q 501
  rewrite ingress tag pop 1 symmetric
  bridge-domain 501
 !
interface GigabitEthernet0/2
 switchport trunk allowed vlan none
 switchport mode trunk
 service instance 1 ethernet TEST
  encapsulation dot1q 3500 second-dot1q 2500
  rewrite ingress tag pop 2 symmetric
  bridge-domain 501

I’m not going into detail, but I will cover the basics. When gi0/2 receives a double-tagged frame that matches 3500,2500 inbound, the me3600x will pop both tags off and the resulting frame will be part of bridge-group 501. Symmetric means that when a frame leaves gi0/2, it will re-add vlans 3500,2500 on top of the frame. As gi0/1 is also in bridge-group 501, the customer frame will be forwarded out that port, and it will have a single vlan tag of 501 pushed on top.
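The pop/push behaviour can be sketched in a few lines of Python. This is a toy model under my own assumptions (the `ServiceInstance` class and `forward` function are hypothetical, and real hardware does far more), but it mirrors the two service instances configured on R3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceInstance:
    port: str
    encapsulation: tuple  # tags to match, outermost first
    pop: int              # tags popped on ingress (rewrite ... symmetric)
    bridge_domain: int

    def ingress(self, tags):
        """Return the tags left after the ingress pop, or None if no match."""
        n = len(self.encapsulation)
        if tuple(tags[:n]) != self.encapsulation:
            return None
        return list(tags[self.pop:])

    def egress(self, tags):
        """Symmetric rewrite: push back the tags that ingress would pop."""
        return list(self.encapsulation[:self.pop]) + list(tags)


# The two service instances from the R3 config above.
R3 = [
    ServiceInstance("Gi0/1", (501,), 1, 501),
    ServiceInstance("Gi0/2", (3500, 2500), 2, 501),
]


def forward(in_port, tags):
    """Bridge a frame within its bridge-domain; return {out_port: tags}."""
    src = next(si for si in R3 if si.port == in_port)
    inner = src.ingress(tags)
    if inner is None:
        return {}
    return {
        si.port: si.egress(inner)
        for si in R3
        if si.bridge_domain == src.bridge_domain and si.port != in_port
    }


print(forward("Gi0/2", [3500, 2500]))  # frame arriving with CE3's tags
print(forward("Gi0/1", [501]))         # frame arriving from R1
```

A frame entering gi0/2 with tags 3500,2500 leaves gi0/1 with the single tag 501, and the reverse direction gets 3500,2500 pushed back on. A frame on gi0/2 that doesn't match the encapsulation goes nowhere.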

At this point gi0/1 is connected to R1 eth2/3. For this customer I would be expecting a single tag of 501 coming inbound, and so I’ll place that vlan id into the VPLS from above:

 vpls TEST-VPLS 501
  vlan 501
   tagged ethe 2/3

Now all three CE routers should be fully adjacent:

CE3#sh ip ospf neighbor

Neighbor ID     Pri   State           Dead Time   Address         Interface
1.1.1.1         128   FULL/DR         00:00:35    1.1.1.1         FastEthernet0/0.32
1.1.1.2           1   FULL/DROTHER    00:00:37    1.1.1.2         FastEthernet0/0.32

CE3#ping 10.10.10.10 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.10.10, timeout is 2 seconds:
Packet sent with a source address of 30.30.30.30
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
CE3#ping 20.20.20.20 so lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 20.20.20.20, timeout is 2 seconds:
Packet sent with a source address of 30.30.30.30
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/12 ms

Conclusions:

vlan tags have multiple uses. In most networks they inform the switches which vlan, and therefore broadcast domain, a frame is part of. They can also be circuit identifiers showing which VPLS/Circuit the frame belongs to. They can also be both at the same time, depending on the VPLS VC type you’re using.

The above network is extremely simplified. Care must be taken when forwarding certain layer2 control frames. Most are sent untagged out tagged interfaces. Cisco’s per-vlan spanning tree (PVST+ and Rapid PVST+) tags each vlan’s BPDU with that same vlan-id. If you’re using vlan 2000 on one side and vlan 3000 on the other, and the BPDU gets through, one side will shut down their WAN link due to receiving a BPDU with a vlan tag that doesn’t match the BPDU data inside the frame.

My own view on the whole ‘CCIE is getting less important’ debate

There have been numerous debates recently about how important the CCIE is going forward. My views seem to differ from a lot of others so allow me to get on my own soapbox.

What I am sick of hearing is how a CCIE is just a ‘cli jockey’ – I have not, and have never, considered myself to be that. I’m also sick of hearing that you’re either this or you’re that. People are far more complicated than being either one thing or another.

Different people have different reasons for pursuing the CCIE. I did it because I wanted to prove to myself I could do it. Some people do it to get more money, others to get a job, others for who knows what reason. The CCIE is not an end in itself, it’s merely something you can do if you want.

Does having a CCIE mean that all I’m good at is punching some commands into a router? No, and I find it extremely insulting for anyone to think such a thing. I took a view of learning how things work and fit together. The CLI is there simply to allow you to get to your end goal. The actual commands for me were the easiest part of the entire lab exam. Learning how things fit together and how protocols operate is the real business side of the CCIE in my opinion. This is perhaps why I was able to do the CCIE SP so quickly after doing my JNCIE-SP. The commands, defaults, and capabilities of each are the only differences. The technology is the same.

The same goes for programming really. You have something you want to do. You could use a load of different languages to get there. The language you use is irrelevant to the underlying problem you are trying to solve.

If you think that the CCIE is merely there to teach you to be a ‘cli jockey’ then I feel pretty sorry for you.

The thought that a programmer is inherently better at designing a network than a network engineer is, is laughable. The very thought that you are either one or the other is also laughable. The thought that a programmer is a better problem solver because they are able to break down big problems into smaller ones is also incorrect. We’ve all been doing this for ages. The very act of troubleshooting and designing large networks is all about this. You don’t just draw one big cloud and say there’s your network. The devil’s in the smaller details and how each small part operates in the bigger picture to give you your end goal. This could be applied to so many things.

For me, learning itself is one of my favourite processes. Learning anything. Heck if I could somehow spend 6 months every year doing nothing but attending lectures and studying I would do it at the drop of a hat. It’s what I love to do. CCIE before, Python next. Then what? Maybe Physics? Who knows. Will it help me with my career? Probably not, but I’m pretty sure I’ll enjoy it nonetheless!

CCIE Service Provider – My Thoughts

I’m happy to report that my first SP lab attempt in Brussels on Monday the 10th was a success! I must admit, like the JNCIE-SP I found it rather easy. MPLS/BGP is something I work with on a daily basis and so I wasn’t expecting anything out of the ordinary. Here I document my process and thoughts.

Sunday 9th March

My wife and I board the Eurostar bound for Brussels. Our train departs nice and early at 08:58. The weather for today is sunny and hot for this time of year and everything so far is good.

About 20 minutes outside Brussels, the train stops. After 10 more minutes there is an announcement that there is a power issue and the train can’t move. They try a number of things including restarting the power system on the train, but no good. They eventually blame Brussels Rail for not giving power, but it doesn’t move us anywhere. As it’s a hot and sunny day, and Eurostar has no open windows, the temperature rapidly rises. Some poor lady in our carriage passes out eventually. After 2.5 hours of sitting in the heat, two diesel coaches are dispatched to tow us into Brussels.

Problems happen, I understand that. But I was very annoyed that no refreshments were given out to anyone considering it was baking in that train.

The diesel coaches are quite slow, but we finally pull into Brussels:



Tip #1 – Always ensure you plan to arrive well in advance. If this happened in the middle of the night I would’ve probably missed my lab!

Once in Brussels, my wife and I head off to get some water and food. We then buy tickets for the short ride to Diegem. Most CCIE candidates doing their exam in Brussels tend to use the NH hotel. As I’m paying for this all myself I go cheaper and get the Ibis hotel which is about a 15 minute walk from Cisco. This has a few benefits. First, it’s cheaper. Second, the hotel isn’t actually that bad. Third, the 15 minute walk in the morning allows me to clear my mind and get my blood flowing. Better than dragging yourself the 100 or so meters from NH hotel to Cisco.

Tip #2 – You don’t have to go for the close fancy hotel. You need a bed, shower, and food. And hopefully something quiet. If it’s a kilometre or so away from the testing centre, use the walk to breathe fresh air and clear your mind.

I take it easy in the hotel. We’ve arrived just over three hours later than I planned, but as it’s still early afternoon there is no problem. We have a basic dinner, nothing too fancy, and I plan for an early night.

I did fall asleep quite quickly, but my rest wasn’t the best. I woke up a number of times and struggled to get a good night’s sleep.


Odd, as I’ve never had issues the night before a lab. I did get a good night’s sleep the night before, so I’m hoping it won’t affect me too badly.

Monday 10th March

I get up, have a coffee, have some cereal. No big heavy breakfast for me thanks. I’ve got a couple of snacks to take with me. I’m allowed to take snacks into the testing room in Diegem, but the mobile lab in London forbade me from doing so. ALWAYS check before!

I ended up leaving the hotel about 10 minutes later than I wanted. Cisco open the doors at 08:00 sharp and the exam starts at 08:15. Generally candidates stand on the small bridge outside the building. By the time I got there at 08:02 the doors were already open and everybody was inside waiting for the proctor.

Just before 08:15 the proctor comes in and tells us to follow him. We go into the lift to the 3rd or 4th floor, I don’t remember exactly which. He tells us to sign in and sit at our assigned desks. I believe I was the only SP candidate of the day.

The proctor informs us when lunch will be and what time we should finish. He also explains the rules and so on. After that we log in and begin. There is a big difference between the R&S and SP tracks in that the R&S has two sections, both of which you need to pass. You have two hours to fix 10 tickets and so you don’t really have time to mess around before starting. The SP has no defined TS section so my strategy was different. Everything is still on-screen, but you do have paper and multiple colours of pens and highlighters. I spent the first 30 minutes reading every section beginning to end so I knew what I was getting myself into. I then copied the overall diagram on paper and used coloured pens to write notes about what I was going to do where.

Tip #4 – Read everything and plan your attack for the configuration section as quickly as you can. It doesn’t have to be perfect first time!

Once all the above is done I open the questions again and start with the first section and just crack on. At this point I do verify, but my verification is minimal. I want to do as much as I can before lunch which should then leave me plenty of time to verify.

Around 12:05 the proctor informs us that we are going to break for lunch in about 5 minutes. I had just finished a big section and didn’t want to start another big section with 5 minutes so I go back to my initial drawing and make a few alterations and fixes.

We are told to stop and get a voucher from the proctor. We are told we are to sit together, speak only in English, and not talk about the exam. This time I made sure I was in front of the queue. Last time it took 25 minutes for me to get food which left me with 5 minutes to eat everything! We get down to the cafeteria and I decide to go for steak. Why not? I’ve had a light breakfast and I’m quite hungry. Besides I need to make sure this lunch is worth the exam fee in case the worst happens ;) – They load my plate with fries and some salad.

I finish the steak quickly and only eat a few fries. I don’t want to fall asleep upstairs after all. After 30 minutes we head back upstairs and the proctor informs us we can carry on as soon as we get to our desks.

About 5 minutes after we are back, one candidate leaves. Not sure if he finished really quickly or just gave up. Either way he is gone. 30 minutes later the guy sitting in front of me gets up with an odd look on his face. I don’t know if he is about to have a nervous breakdown, but he doesn’t look too happy. I’m pretty sure I heard him curse under his breath a number of times. Another candidate has about 12 cans of coke on his desk, and not the diet kind either. Not sure how that can be healthy.

It was at this point that I was checking something on a router and I lost access to it. Trying to telnet back into it failed. Checking the interface status on its neighbour, the interface was down. Looked like a router crash… I inform the proctor who confirms it. He loads up the device tab and power cycles the affected router. That immediately reboots ALL of my IOS devices (which wasn’t supposed to happen!) – I save my configs pretty much every single time I configure anything on the router, so once all the routers are back everything is exactly how it was 10 minutes ago.

Tip #5 – Save often. Seriously.

About 90 minutes after we get back from lunch I’ve answered all the questions. I’ve got a few hours to verify which is exactly what I wanted.

Tip #6 – Nothing is configured until it’s verified! It’s essential that you leave AT LEAST one hour to verify. The more the better!

When verifying, you need to read the question again thoroughly. Think about what they are asking. If in doubt, ask the proctor for clarification. Think about how Cisco are going to verify your answer and check your work the same way. Be brutal with yourself. Something either is or is not, there are no maybes here. TCL scripting really comes into its own here, at least on the IOS devices. I’m not lying when I say I probably ran my TCL scripts over 50 times. Testing every conceivable failure scenario, checking I had full reachability, and ensuring no route churn. IOS-XR is a little more difficult as TCL isn’t supported on it, so I did manual testing wherever I had to.

I noticed at least three problems which were all fixed. Verify again, and again, and again.

About 30 minutes before the end of time, I was all verified out. I decided to call it a day. Finish lab, hand paper back to proctor and make my way back. The long walk allowed me to think it over one last time. I felt pretty confident about it, but of course you can’t think anything until you get the result.

It’s at this point where you now check your email every two minutes. I waited 2.5 painful days for my R&S result, and a number of weeks for my JNCIE-SP result. That doesn’t stop you from checking though!

The Result

Soon after 18:00 local time, I get a Cisco email… That was fast, I wonder if it’s good news?! – I log into the CCIE portal to get the result and that’s when I see it:
[Image: Capture]

Cue Celebrations!

So now the question is, what next?

OSPF as the PE-CE routing protocols deep dive – Part 3 of 3 – Loop Prevention

Read part 1
Read part 2
Read part 3

 
When customer sites are single-homed, there is no possibility of a loop forming, unless of course your customer decides to set up a bunch of GRE tunnels and run OSPF over that, but I digress. If a site is multi-homed, or two sites have a back-door between them, it’s essential that routes going from BGP into OSPF do not go back into BGP.

Let’s create a slightly different diagram for this one. R3 is now also a PE router:
[Image: loop ospf]

The loop prevention used ultimately depends on whether a prefix comes in as internal or external. If a sham-link is configured and all OSPF routes are intra-area, no loop prevention is needed. Standard SPF is run and everything is fine. This is because everything is seen in area 0, and SPF can run with full knowledge of the entire area.

As soon as type3s and type5s are used, OSPF becomes a little more distance vector like. ABRs/ASBRs originate new LSAs and other OSPF routers believe what is told to them. This makes it possible for loops to appear when mutual redistribution is occurring.

The down bit

Let’s go back to RFC 4577, specifically section 4.2.5.1

When a type 3 LSA is sent from a PE router to a CE router, the DN bit [OSPF-DN] in the LSA Options field MUST be set. This is used to ensure that if any CE router sends this type 3 LSA to a PE router, the PE router will not redistribute it further.

When a PE router needs to distribute to a CE router a route that comes from a site outside the latter’s OSPF domain, the PE router presents itself as an ASBR (Autonomous System Border Router), and distributes the route in a type 5 LSA. The DN bit [OSPF-DN] MUST be set in these LSAs to ensure that they will be ignored by any other PE routers that receive them.

There are deployed implementations that do not set the DN bit, but instead use OSPF route tagging to ensure that a type 5 LSA generated by a PE router will be ignored by any other PE router that may receive it. A special OSPF route tag, which we will call the VPN Route Tag (see Section 4.2.5.2), is used for this purpose. To ensure backward compatibility, all implementations adhering to this specification MUST by default support the VPN Route Tag procedures specified in Sections 4.2.5.2, 4.2.8.1, and 4.2.8.2. When it is no longer necessary to use the VPN Route Tag in a particular deployment, its use (both sending and receiving) may be disabled by configuration.

Essentially, if an LSA arrives at a PE with the down bit set, that will never be redistributed into BGP. This prevents the route from leaking in from one PE back into another PE.
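As a rough illustration of that rule, here is a small Python sketch. The function names are mine and purely hypothetical; the only hard fact assumed is that the DN bit is the high-order bit (0x80) of the LSA Options field, as defined in RFC 4576.

```python
DN_BIT = 0x80  # high-order bit of the LSA Options field (RFC 4576)
DC_BIT = 0x20  # demand-circuits bit, the "DC" seen in show output


def pe_originate(options: int) -> int:
    """A PE sets the DN bit on LSAs it sends towards a CE."""
    return options | DN_BIT


def usable(options: int, receiver_is_pe: bool) -> bool:
    """CEs use the LSA as normal; a PE ignores anything with DN set,
    so it can never be redistributed back into BGP."""
    return not (receiver_is_pe and options & DN_BIT)


lsa_options = pe_originate(DC_BIT)  # e.g. a PE originating towards its CE
print(hex(lsa_options))
print(usable(lsa_options, receiver_is_pe=False))  # a CE receiving it
print(usable(lsa_options, receiver_is_pe=True))   # another PE receiving it
```

The LSA still sits in every router's database, since flooding is unaffected. The DN bit only changes what a PE does with it.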

Down Bit – IOS

R7 is advertising its loopback address. No sham-links are used and so R4 will originate a type3 LSA to R6:

R6#show ip ospf database summary 7.7.7.7  adv-router 4.4.4.4

            OSPF Router with ID (6.6.6.6) (Process ID 1)

                Summary Net Link States (Area 0)

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 441
  Options: (No TOS-capability, DC, Downward)
  LS Type: Summary Links(Network)
  Link State ID: 7.7.7.7 (summary Network Number)
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000003
  Checksum: 0x5636
  Length: 28
  Network Mask: /32
        MTID: 0         Metric: 2

Options state ‘Downward’ – This LSA is flooded to R6 -> R5 -> R3. R3, another PE, will have the LSA (all databases need to match remember) but it will not use the LSA. The routing bit will not be set, and it will not redistribute that into BGP either:

R3#  show ip ospf database summary 7.7.7.7  adv-router 4.4.4.4

            OSPF Router with ID (10.0.35.3) (Process ID 1)

                Summary Net Link States (Area 0)

  LS age: 597
  Options: (No TOS-capability, DC, Downward)
  LS Type: Summary Links(Network)
  Link State ID: 7.7.7.7 (summary Network Number)
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000003
  Checksum: 0x5636
  Length: 28
  Network Mask: /32
        MTID: 0         Metric: 2

The same happens vice-versa. Any LSA originated by R3 to R5, will be received but not used by R4.
[Image: loop ospf2]

Down Bit – IOS-XR

No change in IOS-XR behaviour. You need to be sure your domain-ids match to get a type3 between IOS and IOS-XR:

R6#sh ip ospf database summary 7.7.7.7 adv-router 4.4.4.4

            OSPF Router with ID (6.6.6.6) (Process ID 1)

                Summary Net Link States (Area 0)

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 20
  Options: (No TOS-capability, DC, Downward)
  LS Type: Summary Links(Network)
  Link State ID: 7.7.7.7 (summary Network Number)
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000001
  Checksum: 0x5A34
  Length: 28
  Network Mask: /32
        MTID: 0         Metric: 2

Down bit set on the type3.

Route tags – IOS

Let’s go back to the RFC to see what this is all about. Section 4.2.5.2

If a particular VRF in a PE is associated with an instance of OSPF, then by default it MUST be configured with a special OSPF route tag value, which we call the VPN Route Tag. By default, this route tag MUST be included in the Type 5 LSAs that the PE originates (as the result of receiving a BGP-distributed VPN-IPv4 route, see Section 4.2.8) and sends to any of the attached CEs.

The configuration and inclusion of the VPN Route Tag is required for backward compatibility with deployed implementations that do not set the DN bit in type 5 LSAs. The inclusion of the VPN Route Tag may be disabled by configuration if it has been determined that it is no longer needed for backward compatibility.

The value of the VPN Route Tag is arbitrary but must be distinct from any OSPF Route Tag being used within the OSPF domain. Its value MUST therefore be configurable. If the Autonomous System number of the VPN backbone is two bytes long, the default value SHOULD be an automatically computed tag based on that Autonomous System number

If the Autonomous System number is four bytes long, then a Route Tag value MUST be configured, and it MUST be distinct from any Route Tag used within the VPN itself.

If a PE router needs to use OSPF to distribute to a CE router a route that comes from a site outside the CE router’s OSPF domain, the PE router SHOULD present itself to the CE router as an Autonomous System Border Router (ASBR) and SHOULD report such routes as AS-external routes. That is, these PE routers originate Type 5 LSAs reporting the extra-domain routes as AS-external routes. Each such Type 5 LSA MUST contain an OSPF route tag whose value is that of the VPN Route Tag. This tag identifies the route as having come from a PE router. The VPN Route Tag MUST be used to ensure that a Type 5 LSA originated by a PE router is not redistributed through the OSPF area to another PE router.

Note that it says OSPF should set a route-tag when the implementation doesn’t support setting the down bit in type5 LSAs. Also note that the previous RFC quote did say an implementation could set the down bit in type5s if desired. At this point I’ve stopped advertising R7’s loopback directly into OSPF and simply redistributed the loopback. This ensures that the LSA is external.

Usually when an ASBR originates a type5, that type5 remains unchanged in the domain, i.e. the originating router is the same. However, according to the quote above, the PE needs to originate a new type5 to the attached CE. This we see on R6:

R6#show ip ospf database external 7.7.7.7  adv-router 4.4.4.4

            OSPF Router with ID (6.6.6.6) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 38
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 7.7.7.7 (External Network Number )
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000001
  Checksum: 0x77C7
  Length: 36
  Network Mask: /32
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 3489661028

Notice no down bit. Also note the originator of this type5 is R4 itself. Finally, the route has an external route tag of 3489661028.
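That tag value isn't random. RFC 4577 section 4.2.5.2 lays out the automatic tag for a two-byte AS as the bits 1101 (Automatic, Complete, PathLength 01), then twelve zero bits of ArbitraryTag, then the 16-bit AS number. A quick Python sketch of that layout follows; the function names are my own.

```python
def vpn_route_tag(asn: int) -> int:
    """Automatic VPN Route Tag for a 2-byte AS, per RFC 4577 s4.2.5.2."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("automatic tag is only defined for 2-byte AS numbers")
    # Automatic=1, Complete=1, PathLength=01, ArbitraryTag=0, then the AS.
    return (0b1101 << 28) | asn


def decode(tag: int) -> dict:
    """Split a 32-bit route tag into the fields of the automatic format."""
    return {
        "automatic": (tag >> 31) & 0x1,
        "complete": (tag >> 30) & 0x1,
        "path_length": (tag >> 28) & 0x3,
        "arbitrary": (tag >> 16) & 0xFFF,
        "asn": tag & 0xFFFF,
    }


print(decode(3489661028))
print(vpn_route_tag(100))
```

Decoding 3489661028 gives an AS field of 100 with the leading bits 1101, which is what you'd expect if the tag was derived automatically from a backbone AS of 100.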

Much like the down bit, if a PE router receives an external LSA with a domain tag that matches its own, that LSA will not be used or redistributed.
[Image: loop ospf31]

R3#show ip ospf 1 database external 7.7.7.7 adv-router 4.4.4.4

            OSPF Router with ID (10.0.35.3) (Process ID 1)

                Type-5 AS External Link States

  LS age: 744
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 7.7.7.7 (External Network Number )
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000001
  Checksum: 0x77C7
  Length: 36
  Network Mask: /32
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 3489661028

No routing bit set, no redistribution happening.

Route tags – IOS-XR

R6#sh ip ospf database external 7.7.7.7 adv-router 4.4.4.4

            OSPF Router with ID (6.6.6.6) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 11
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 7.7.7.7 (External Network Number )
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000001
  Checksum: 0xEFCE
  Length: 36
  Network Mask: /32
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 3489661028

IOS-XR and IOS have the same behaviour.

IOS – 32bit AS number – Route-tag

The RFC states that when using 16bit AS numbers, the domain tag is automatically derived. When using a 32bit AS number, it should be manually configured. You are able to manually set this even when using a 16bit number with the domain-tag command. You can see above that when using a 16bit number it was automatic. Let’s move to a 32bit number and see what we see.
A quick change of the BGP sessions:

R4#sh run | sec router bgp
router bgp 4294967295
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 4294967295
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 3.3.3.3 remote-as 4294967295
 neighbor 3.3.3.3 update-source Loopback0

Take a look at the type5 on R6. The domain-tag matches the 32bit AS number directly. This is not 100% conforming to the RFC, which states it should be manually set:

R6#sh ip ospf database external 7.7.7.7 adv-router 4.4.4.4

            OSPF Router with ID (6.6.6.6) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 76
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 7.7.7.7 (External Network Number )
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000001
  Checksum: 0x2C48
  Length: 36
  Network Mask: /32
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 4294967295

Of course, R3 will not use that LSA as its domain-tag matches.

Considering the domain-tag matches the AS number, it stands to reason that any inter-AS VPN using OSPF would be susceptible to routing loops, as each SP will have a different domain-tag. To avoid this, one of them could manually set it to match the other.
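The tag comparison can be sketched in a few lines of Python. This is only a model of the check described above, not how any router implements it. The function name is mine, and the second provider's AS number (200) is purely hypothetical.

```python
AUTO_TAG_PREFIX = 0xD0000000  # automatic tag layout for 16-bit AS numbers


def pe_uses_external_lsa(lsa_route_tag: int, local_domain_tag: int) -> bool:
    """A PE ignores a type-5 LSA whose route tag equals its own domain-tag."""
    return lsa_route_tag != local_domain_tag


# Same provider: the LSA carries the PE's own tag, so it is ignored.
SP_A_TAG = AUTO_TAG_PREFIX | 100
# Hypothetical second provider in an inter-AS VPN, with a different AS.
SP_B_TAG = AUTO_TAG_PREFIX | 200

if __name__ == "__main__":
    print(pe_uses_external_lsa(SP_A_TAG, SP_A_TAG))  # False, loop prevented
    print(pe_uses_external_lsa(SP_A_TAG, SP_B_TAG))  # True, tag does not protect
    # Manually setting SP B's domain-tag to match SP A restores the check.
    print(pe_uses_external_lsa(SP_A_TAG, SP_A_TAG))  # False
```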

32bit AS number – Route-tag – IOS-XR

IOS-XR’s 32bit external behaviour is identical to IOS:

R6#sh ip ospf database external 7.7.7.7 adv-router 4.4.4.4

            OSPF Router with ID (6.6.6.6) (Process ID 1)

                Type-5 AS External Link States

  Routing Bit Set on this LSA in topology Base with MTID 0
  LS age: 76
  Options: (No TOS-capability, DC)
  LS Type: AS External Link
  Link State ID: 7.7.7.7 (External Network Number )
  Advertising Router: 4.4.4.4
  LS Seq Number: 80000001
  Checksum: 0xA44F
  Length: 36
  Network Mask: /32
        Metric Type: 2 (Larger than any link state path)
        MTID: 0
        Metric: 20
        Forward Address: 0.0.0.0
        External Route Tag: 4294967295

Once again, IOS and IOS-XR have the same behaviour.

Notes

  • Unlike parts 1 and 2 of this blog, IOS and IOS-XR finally show identical behaviour when it comes to loop prevention.

Embedded traffic capture on Junos and IOS-XE

IOS-XE and Junos both give you the ability to sniff packets directly on the device itself. This is pretty handy for troubleshooting without having to send an engineer to site with a laptop, potentially with downtime.

Both are very flexible, so I won’t go over every single option possible on both. Rather I’ll just go over a basic capture and view on both platforms. For this post I’ll use a simple topology with an LACP interface between them to show how to get around a limitation or two:
[Figure: Cisco 3850 and Juniper SRX connected by an LACP aggregated link]

I’ve enabled OSPF over the aggregated interface.

IOS-XE Setup

I want to view the OSPF hello packets over the port-channel. IOS-XE will not allow you to specify a port-channel interface, but you can specify a range. I’ll simply use a range of interfaces currently in the port-channel. Note that this is done in privileged exec mode and not in configuration mode:

C3850#monitor capture NEW_CAP interface range gi1/0/1 , gi2/0/1 both
C3850#monitor capture NEW_CAP match any
C3850#monitor capture NEW_CAP file location flash:CAP1.pcap

You can push the capture through an ACL to match specific traffic, and there are plenty of other options to change if needed. On a 3850 stack, the output needs to go to the current active switch’s flash or USB.

Without configuring any other options, take a look at the defaults used:

C3850#show monitor capture NEW_CAP

Status Information for Capture NEW_CAP
  Target Type:
   Interface: GigabitEthernet1/0/1, Direction: both
   Interface: GigabitEthernet2/0/1, Direction: both
   Status : Inactive
  Filter Details:
    Capture all packets
  Buffer Details:
   Buffer Type: LINEAR (default)
  File Details:
   Associated file name: flash:CAP1.pcap
  Limit Details:
   Number of Packets to capture: 0 (no limit)
   Packet Capture duration: 0 (no limit)
   Packet Size to capture: 0 (no limit)
   Packets per second: 0 (no limit)
   Packet sampling rate: 0 (no sampling)

There are no limits imposed anywhere. If you leave a capture running to flash, it could very easily fill it up. I’ll impose a limit of 60 seconds on this capture to make sure that doesn’t happen:

C3850#monitor capture NEW_CAP limit duration 60
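To get a feel for how quickly an unlimited capture could eat flash, here is a rough back-of-envelope sketch in Python. It assumes the file is written in the classic pcap format (24-byte file header plus a 16-byte header per packet). The traffic figures in the example are invented for illustration and are not measurements from this lab.

```python
PCAP_GLOBAL_HEADER = 24  # bytes, written once per file
PCAP_RECORD_HEADER = 16  # bytes, written before every captured packet


def estimate_pcap_bytes(packets_per_second: float,
                        avg_frame_bytes: int,
                        seconds: int) -> int:
    """Estimate the size of a classic pcap file for a steady traffic rate."""
    packets = int(packets_per_second * seconds)
    return PCAP_GLOBAL_HEADER + packets * (PCAP_RECORD_HEADER + avg_frame_bytes)


if __name__ == "__main__":
    # Hypothetical busy link: 10,000 packets/s of 500-byte frames for 60 seconds.
    size = estimate_pcap_bytes(10_000, 500, 60)
    print(size)                        # 309600024
    print(round(size / 1_000_000, 1))  # 309.6 (MB)
```

Even with the 60 second limit, a busy data-plane interface could produce a file of several hundred megabytes, so a packet count or size limit may be worth adding as well.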

IOS-XE Capture

Let’s start the capture:

C3850#monitor capture NEW_CAP start
*Feb 26 08:23:48.854 GMT: %BUFCAP-6-ENABLE: Capture Point NEW_CAP enabled.

I can either run this for 60 seconds, or stop it manually:

C3850#monitor capture NEW_CAP stop
*Feb 26 08:24:19.584 GMT: %BUFCAP-6-DISABLE: Capture Point NEW_CAP disabled.

Very simple.

IOS-XE – View Captures

IOS-XE has a terse output:

C3850#show monitor capture file flash:CAP1.pcap
  1   0.000000     10.0.0.2 -> 224.0.0.5    OSPF Hello Packet
  2   7.868018     10.0.0.2 -> 224.0.0.5    OSPF Hello Packet
  3  15.429030     10.0.0.2 -> 224.0.0.5    OSPF Hello Packet
  4  23.035002     10.0.0.2 -> 224.0.0.5    OSPF Hello Packet

You can also see the entire detail:

C3850#show monitor capture file flash:CAP1.pcap  detailed
Frame 1: 94 bytes on wire (752 bits), 94 bytes captured (752 bits)
    Arrival Time: Feb 26, 2014 08:23:53.939938000 UTC
    Epoch Time: 1393403033.939938000 seconds
    [Time delta from previous captured frame: 0.000000000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 0.000000000 seconds]
    Frame Number: 1
    Frame Length: 94 bytes (752 bits)
    Capture Length: 94 bytes (752 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ip:ospf]
Ethernet II, Src: 3c:61:04:d9:73:80 (3c:61:04:d9:73:80), Dst: 01:00:5e:00:00:05 (01:00:5e:00:00:05)
    Destination: 01:00:5e:00:00:05 (01:00:5e:00:00:05)
        Address: 01:00:5e:00:00:05 (01:00:5e:00:00:05)
        .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
    Source: 3c:61:04:d9:73:80 (3c:61:04:d9:73:80)
        Address: 3c:61:04:d9:73:80 (3c:61:04:d9:73:80)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
    Type: IP (0x0800)
Internet Protocol, Src: 10.0.0.2 (10.0.0.2), Dst: 224.0.0.5 (224.0.0.5)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0xc0 (DSCP 0x30: Class Selector 6; ECN: 0x00)
        1100 00.. = Differentiated Services Codepoint: Class Selector 6 (0x30)
        .... ..0. = ECN-Capable Transport (ECT): 0

etc....
etc....

It’s also possible to capture directly to the screen:

C3850#monitor capture NEW_CAP start display detailed
A file by the same capture file name already exists, overwrite?[confirm]
Frame 1: 94 bytes on wire (752 bits), 94 bytes captured (752 bits)
    Arrival Time: Feb 26, 2014 08:31:20.753958000 UTC
    Epoch Time: 1393403480.753958000 seconds
    [Time delta from previous captured frame: 0.000000000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 0.000000000 seconds]
    Frame Number: 1
    Frame Length: 94 bytes (752 bits)
    Capture Length: 94 bytes (752 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ip:ospf]
Ethernet II, Src: 3c:61:04:d9:73:80 (3c:61:04:d9:73:80), Dst: 01:00:5e:00:00:05

IOS-XE Notes

  • The capture configuration is not saved in global config, but the config is still there. In order to remove your monitor session you need to explicitly delete it from privileged exec mode:
C3850#no monitor capture NEW_CAP
  • Embedded wireshark can capture data-plane traffic, as well as control-plane traffic.

Junos Capture

Start up a shell:

darreno@SRX110> start shell user root
Password:

Junos has tcpdump built-in. For this part I’ll write a file to the tmp folder which we can then view later:

root@SRX110% tcpdump -i ae0.0 -w /tmp/CAP2.pcap
Address resolution is ON. Use <no-resolve> to avoid any reverse lookup delay.
Address resolution timeout is 4s.
Listening on ae0.0, capture size 96 bytes

Junos – view captures

We can use tcpdump to view the files we created:

root@SRX110% tcpdump -qn -r /tmp/CAP2.pcap
23:51:58.856149 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:51:58.991515  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:08.531670  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:08.744550 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:17.460023 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:17.640020  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:25.978974 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:26.888403  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:33.517479 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:36.858979  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:42.147688 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:46.407409  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:49.663809 Out IP truncated-ip - 20 bytes missing! 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60
23:52:55.448971  In IP 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60
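The "truncated-ip" messages on the outbound packets look consistent with the 96 byte capture size reported when the capture was started, although that is my assumption rather than something I have confirmed. If you copy the file off the box, a small Python sketch like the one below can read the snap length from the file header. It only handles the classic pcap format with microsecond timestamps, and the function name is my own.

```python
import struct


def read_pcap_snaplen(header: bytes) -> int:
    """Return the snap length stored in a classic pcap global header."""
    if len(header) < 24:
        raise ValueError("need the first 24 bytes of the file")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic (microsecond) pcap file")
    # version major, version minor, timezone offset, sigfigs, snaplen, link type
    fields = struct.unpack(endian + "HHiIII", header[4:24])
    return fields[4]


if __name__ == "__main__":
    # Synthetic header with a snap length of 96 so the sketch runs anywhere.
    # On a real file: read_pcap_snaplen(open("CAP2.pcap", "rb").read(24))
    sample = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 96, 1)
    print(read_pcap_snaplen(sample))  # 96
```

If the snap length is the cause, capturing with a larger size should make the messages go away.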

If you wanted to quickly see traffic going over an interface without saving a file, you can do it directly from the cli:

darreno@SRX110> monitor traffic interface ae0.0 detail
Address resolution is ON. Use <no-resolve> to avoid any reverse lookup delay.
Address resolution timeout is 4s.
Listening on ae0.0, capture size 1514 bytes

Reverse lookup for 10.0.0.1 failed (check DNS reachability).
Other reverse lookup failures will not be reported.
Use <no-resolve> to avoid reverse lookups on IP addresses.

23:56:24.203372  In IP (tos 0xc0, ttl   1, id 65445, offset 0, flags [none], proto: OSPF (89), length: 80) 10.0.0.1 > 224.0.0.5: OSPFv2, Hello, length 60 [len 48]
        Router-ID 192.168.255.100, Backbone Area, Authentication Type: none (0)
        Options [External, LLS]
          Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 1
          Neighbor List:
            10.0.0.2
          LLS: checksum: 0xfff6, length: 3
            Extended Options (1), length: 4
              Options: 0x00000001 [LSDB resync]
23:56:24.527779 Out IP (tos 0xc0, ttl   1, id 62974, offset 0, flags [none], proto: OSPF (89), length: 80) 10.0.0.2 > 224.0.0.5: OSPFv2, Hello, length 60 [len 48]
        Router-ID 10.0.0.2, Backbone Area, Authentication Type: none (0)
        Options [External, LLS]
          Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.0, Priority 128
          Neighbor List:
            192.168.255.100
          LLS: checksum: 0xfff6, length: 3
            Extended Options (1), length: 4
              Options: 0x00000001 [LSDB resync]

Junos notes

  • Traffic captured on Junos is control-plane traffic only. It cannot capture data-plane traffic