Python paths and Cron logging

I created two new Twitter accounts yesterday, and the number of followers they've picked up in such a short time is great to see. Feel free to follow them here – @bgp4_table and @bgp6_table

The accounts are updated by a Python script, which is run via a cron job once every six hours.

I noticed that when I ran the script manually, it worked fine. When the cron job ran it, nothing happened. With no console output to inspect, I was left wondering what the issue was.

Cron log

The first thing I needed to do was configure cron to log its output. I've done it like so in my crontab:

0 0,6,12,18 * * * /home/scripts/tweet.py > /home/scripts/tweet.log 2>&1

This directs both stream 1 and stream 2 to the log file I created. In case you're not aware, standard stream 0 is input (stdin), stream 1 is output (stdout), and stream 2 is errors (stderr). The redirection above ensures I'm logging both output, if any, and errors, if any.
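One thing to note: a single > truncates the file on every run, so the log only ever holds the latest run's output. If you'd rather keep a history across runs, >> appends instead:

0 0,6,12,18 * * * /home/scripts/tweet.py >> /home/scripts/tweet.log 2>&1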

Python paths

On the next cron run, my log was created and it was plain to see:

$ less tweet.log
Traceback (most recent call last):
  File "/home/scripts/tweet.py", line 10, in 
    with open("v4_count") as f:
IOError: [Errno 2] No such file or directory: 'v4_count'

As part of my script, I save the last values in text files located in the same path as the script. When the script runs again, it reads the last value, gets the new value, then works out a delta to display. It then writes the latest value into that file for the next run. This is an example of one of those reads:

# Pull last delta
with open("v4_count") as f:
    old_v4 = int(f.readline())

This worked when I ran the script manually, but only because I was always running it from within the same folder, so Python found and opened the file relative to that folder. When cron called the script, the working directory was somewhere else, where v4_count did not exist.

The simple fix for this was to change all references to the full path:

# Pull last delta
with open("/home/scripts/v4_count") as f:
    old_v4 = int(f.readline())

On the next run, no more problems :)
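As an aside, an alternative fix (just a sketch) is to build the path from the script's own location via __file__, so the script works no matter which directory cron calls it from:

#!/usr/bin/python
# Sketch: resolve data files relative to this script's own directory,
# so the working directory no longer matters.

import os

base_dir = os.path.dirname(os.path.abspath(__file__))

# Pull last delta
with open(os.path.join(base_dir, "v4_count")) as f:
    old_v4 = int(f.readline())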

Using bird to pull global BGP route counts

For an electronics project I'm working on, I wanted a way to check the size of the global routing table, for both IPv4 and IPv6, every five minutes. I did not want to log into anyone else's router or hammer a looking glass, as polling every five minutes may be considered abuse.

So I decided to spin up a bird routing instance. I just wanted to receive the routes, not actually install them into the kernel on my Linux box. From there I'd be able to check the size of the table being sent over.

Nat Morris helped me out by sending me full tables over a multihop BGP session.

Installing bird is trivial. As I only want to run BGP, and not install routes into the kernel, my configuration is pretty simple. Note there is no kernel protocol defined, so the received routes stay in bird's own table and never touch the Linux routing table.

/etc/bird.conf:

log syslog all;

router id x.x.x.x;

protocol device { }

protocol bgp {
        local as xxxxx;
        neighbor x.x.x.x as xxxxx;
        multihop;
        password "xxxxx";
}

/etc/bird6.conf:

log syslog all;

router id x.x.x.x;

protocol device { }

protocol bgp {
        local as xxxxx;
        neighbor x:x:x:x:x::x as xxxxx;
        source address x:x:x:x:x:x::x;
        multihop;
        password "xxxxx";
}

In order to get the figures I need, I’d usually have to log into the console of the daemon like so:

$ birdc
BIRD 1.3.7 ready.
bird> show protocols all bgp1
name     proto    table    state  since       info
bgp1     BGP      master   up     19:06       Established
  Preference:     100
  Input filter:   ACCEPT
  Output filter:  REJECT
  Routes:         511014 imported, 0 exported, 511014 preferred
  Route change stats:     received   rejected   filtered    ignored   accepted
    Import updates:         523721          0          0        826     522895
    Import withdraws:         1313          0        ---          0       1313
    Export updates:         522895     522895          0        ---          0
    Export withdraws:         1313        ---        ---        ---          0
  BGP state:          Established
    Neighbor address: x.x.x.x
    Neighbor AS:      xxxxx
    Neighbor ID:      x.x.x.x
    Neighbor caps:    refresh AS4
    Session:          external multihop AS4
    Source address:   x.x.x.x
    Hold timer:       142/180
    Keepalive timer:  33/60

I could write a script to log in and scrape the required information with regular expressions, but there had to be an easier way. It turns out you can pass a command directly to bird without logging in first:

$ birdc 'show protocols all bgp1'
BIRD 1.3.7 ready.
name     proto    table    state  since       info
bgp1     BGP      master   up     19:06       Established
  Preference:     100
  Input filter:   ACCEPT
  Output filter:  REJECT
  Routes:         511025 imported, 0 exported, 511025 preferred
  Route change stats:     received   rejected   filtered    ignored   accepted
    Import updates:         523924          0          0        839     523085
    Import withdraws:         1329          0        ---          0       1329
    Export updates:         523085     523085          0        ---          0
    Export withdraws:         1329        ---        ---        ---          0
  BGP state:          Established
    Neighbor address: x.x.x.x
    Neighbor AS:      xxxxx
    Neighbor ID:      x.x.x.x.x
    Neighbor caps:    refresh AS4
    Session:          external multihop AS4
    Source address:   x.x.x.x
    Hold timer:       161/180
    Keepalive timer:  24/60

Still too much information, but we can use grep!

$ birdc 'show protocols all bgp1' | grep 'Routes'
  Routes:         510977 imported, 0 exported, 510977 preferred

Better, but I really just want that route count. The fields are nicely whitespace-delimited, so awk to the rescue. Easily done:

$ birdc 'show protocols all bgp1' | grep 'Routes' | awk '{print $2}'
510976

I can now cron a script that will pull those values once every 5 minutes and generate an XML file which you can see right here.
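As a rough sketch, the cron script behind that page could look something like the following. The protocol name bgp1, the birdc6 client for the IPv6 daemon, and the output path and XML layout are all assumptions for illustration:

#!/usr/bin/python
# Sketch: pull the imported route counts from bird and bird6,
# then write them out as a small XML file.

import subprocess

def route_count(client, proto):
    output = subprocess.check_output([client, 'show protocols all ' + proto])
    for line in output.splitlines():
        if 'Routes' in line:
            return line.split()[1]   # second field is the imported count

v4 = route_count('birdc', 'bgp1')
v6 = route_count('birdc6', 'bgp1')

with open('/var/www/bgp_counts.xml', 'w') as f:
    f.write('<bgp><v4>%s</v4><v6>%s</v6></bgp>\n' % (v4, v6))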

Feel free to query that page and use it for your own projects. Just be aware there is NO SLA on it :)

When and when not to multithread

At the end of my last post on Python multithreading, I said my example was not the best. Let me expand some more on this.

While testing the code in the previous post, I noticed that certain code was actually slower when run across multiple threads. Python threads are also not tied to a CPU: the Global Interpreter Lock means only one thread executes Python bytecode at a time. If you have a bigger application and want work genuinely spread across multiple CPUs, you are in fact looking for multiprocessing (there's a sketch of that after the timing results below).

Consider the following code. It simply counts from 0 to 99,999, doubles each number, then prints it to the screen. First I'll do this as a single-threaded app, then multithreaded, and time them both.

Single-thread

#!/usr/bin/python

for i in range (100000):
    i *= 2
    print i

Multi-thread

#!/usr/bin/python

import threading
lock = threading.Lock()

def thread_test(i):
    i *= 2
    with lock:
        print i

threads = []
for i in range (100000):
    t = threading.Thread(target = thread_test, args = (i,))
    threads.append(t)
    t.start()

I’ll now time and run the command. I’ll run each command three times and take the average of all three:

time ./single.py

Averaged over the runs, the single thread does this in 0.411 seconds, while the multithreaded app takes a full 16.409 seconds. The cost of creating 100,000 threads completely swamps the trivial amount of work each one does.
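As an aside, if this doubling job genuinely needed to be spread across CPUs, multiprocessing is the tool. A minimal sketch, where the pool size of four is an arbitrary choice:

#!/usr/bin/python
# Sketch: the same doubling job spread over worker processes.
# Each process has its own interpreter, so there is no GIL contention.

from multiprocessing import Pool

def double(i):
    return i * 2

if __name__ == '__main__':
    pool = Pool(4)                      # four worker processes
    for result in pool.map(double, range(100000)):
        print result
    pool.close()
    pool.join()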

Now I’ll do a test in which multithreading will make a big difference. I have a list of 500 random urls. I want to log into each, then get them to display the page contents. Not all urls respond, and I’ve also given a three second timeout to fetching any page.

The single thread app is coded like so:

#!/usr/bin/python

import urllib2

with open("urls.txt", "r") as f:
    urls = f.read().splitlines()  # drops the trailing newline readlines() would leave on each URL

for url in urls:
    request = urllib2.Request(url)
    try:
        response = urllib2.urlopen(request, timeout = 3)
        page = response.read()
        print page
    except:
        print "Unable to download"

This takes a full 11 minutes and 40 seconds to run.

Converted to multithread:

#!/usr/bin/python
 
import urllib2
import threading

lock = threading.Lock()

def thread_test(url):
    try:
        response = urllib2.urlopen(url, timeout = 3)
        page = response.read()
        with lock:
            print page
    except:
        with lock:
            print "Unable to download"

with open("urls.txt", "r") as f:
    urls = f.read().splitlines()  # drop the trailing newlines
threads = []

for url in urls:
    request = urllib2.Request(url)
    t = threading.Thread(target = thread_test, args = (request,))
    threads.append(t)
    t.start()

This time the average over 3 runs is only 1 minute and 40 seconds.

I am, however, still locking around output to the screen. This may be bad practice, but let's assume I don't really care about seeing the output. Maybe I just want to throw some commands somewhere, or do something simple like a ping. If I didn't print at all, and therefore never locked, how quickly could this actually run?

#!/usr/bin/python
 
import urllib2
import threading

def thread_test(url):
    try:
        response = urllib2.urlopen(url, timeout = 3)
        page = response.read()
    except:
        pass

with open("urls.txt", "r") as f:
    urls = f.read().splitlines()  # drop the trailing newlines
threads = []

for url in urls:
    request = urllib2.Request(url)
    t = threading.Thread(target = thread_test, args = (request,))
    threads.append(t)
    t.start()

This completes in 1 minute and 13 seconds. Not as big a saving as I hoped for, but it does tell us one thing: Python is not running ALL the threads at exactly the same time. If it were, the maximum run time would be just over three seconds, as that's the timeout.

I’ll load up Wireshark and run the test again. I should see how many threads are sending HTTP GETs at the same time. When I start the threads, I can see 26 threads all starting within a second of each other. Only a full 7 seconds later do others start:
[Screenshot: packet capture showing the first 26 GET requests starting within a second]

After that, I see more threads being added as others end. The timing seems random later as each page has a different response time.
[Screenshot: packet capture showing new threads starting as earlier ones finish]

This seems to be an OS-imposed limit.
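If you'd rather control that concurrency yourself than leave it to the OS, a bounded semaphore is one option. A rough sketch, where the limit of 20 is an arbitrary choice:

#!/usr/bin/python
# Sketch: cap the number of simultaneous fetches with a semaphore.

import urllib2
import threading

sem = threading.BoundedSemaphore(20)   # at most 20 fetches in flight

def thread_test(url):
    with sem:
        try:
            urllib2.urlopen(url, timeout=3).read()
        except:
            pass

with open("urls.txt", "r") as f:
    urls = f.read().splitlines()

threads = []
for url in urls:
    t = threading.Thread(target=thread_test, args=(url,))
    threads.append(t)
    t.start()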

Conclusions

  • Multithreading in Python has certain benefits only in specific cases.
  • Mainly when you are waiting on data from many different sources; there is no need to query them one at a time.
  • Python’s initial single thread is rather efficient all by itself.

I’m certainly not going to rewrite all my current code to use multithreading. Going back to my original OSPF checker, it certainly would be good to check pull information off multiple devices at the same time, but the rest of the app I’d still keep as a single thread.

Basic Python Multithreading

The first ‘proper’ Python app I made logged onto a list of devices and pulled out their OSPF state. This worked perfectly well. The app correctly works out whether it can log into a device or not, and waits a few seconds to ensure a device actually responds.

The issue is that if I have a list of, say, 1000 devices, and 500 of them don't respond, the time spent waiting rapidly adds up as the app looks at each one in turn. Would it not be better for the app to log into multiple devices in parallel? That would drastically reduce the runtime.

Basic Threading

Consider the following code:

#!/usr/bin/python

import threading

def thread_test():
    print "I am a thread"
    return

threads = []
for i in range(4):
    t = threading.Thread(target = thread_test)
    threads.append(t)
    t.start()

A function called thread_test is defined. I then spawn four threads, each of which runs that function. I should therefore see four lines printed:

$ ./thread.py
I am a thread
I am a thread
I am a thread
I am a thread

Of course getting them all to do exactly the same thing is a bit boring. I may have a list of items I want to print. Let’s pass each item as an argument and print them out:

#!/usr/bin/python

import threading

list_of_items = ["cat", "banana", "house", "phone"]

def thread_test(item):
    print "I am a " + item
    return

threads = []
for word in list_of_items:
    t = threading.Thread(target = thread_test, args = (word,))
    threads.append(t)
    t.start()

$ ./thread.py
I am a cat
I am a banana
I am a house
I am a phone

If you’ve run this code, you may notice that sometimes your output gets a bit garbled:

$ ./thread.py
I am a catI am a banana

I am a house
I am a phone

$ ./thread.py
I am a cat
 I am a banana
I am a house
I am a phone

All four threads are trying to write to the screen at the same time. Whether outputting to the screen or writing to a file, the result can look rather messy, especially as the device and thread count goes up.

Locks

I can use locks to prevent this. Each thread can go about its business, but when one needs to write to the screen or to a file, I ensure only a single thread does so at a time. As an example I'll iterate over a range of 100. All of the threads will create their data in memory at pretty much the same time, but only one at a time will print and write to the file. I'll also ensure the application closes the file only after all threads have completed.

#!/usr/bin/python

import threading

lock = threading.Lock()

def thread_test(num):
    phrase = "I am number " + str(num)
    lock.acquire()
    print phrase
    f.write(phrase + "\n")
    lock.release()

threads = []
f = open("text.txt", 'w')
for i in range (100):
    t = threading.Thread(target = thread_test, args = (i,))
    threads.append(t)
    t.start()

# Busy-wait until the main thread is the only one left, then close the file
while threading.activeCount() > 1:
    pass
else:
    f.close()

Any code between the lock's acquire and release can only be run by one thread at a time. The example above isn't a great showcase, but consider that requesting data from a remote device and waiting on the reply can take a few seconds. If all of that waiting happens in parallel, and only the writes to the file are serialised, things speed up immensely.

Update – 15/12/14

Ben Cardy mentioned a great shortcut in the comments below that most readers might miss if they don't read them all. For that reason I'll put it up here. My code above acquires and releases the lock explicitly. There is a simpler way: if you use the with lock: statement, any code indented under it is automatically wrapped in the acquire and release calls. This is nice as you don't have to remember to release the lock. Another benefit is that the with block will release the lock even if the thread throws an exception.

The last code above could be rewritten like so:

#!/usr/bin/python
 
import threading
 
lock = threading.Lock()
 
def thread_test(num):
    phrase = "I am number " + str(num)
    with lock:
        print phrase
        f.write(phrase + "\n")

 
threads = []
f = open("text.txt", 'w')
for i in range (100):
    t = threading.Thread(target = thread_test, args = (i,))
    threads.append(t)
    t.start()
 
while threading.activeCount() > 1:
    pass
else:
    f.close()
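One more aside: that activeCount() loop busy-waits, burning CPU while it polls. The more common approach, just a sketch here, is to join each thread, which blocks until that thread has finished:

#!/usr/bin/python
# Sketch: wait for threads with join() instead of polling activeCount().

import threading

def thread_test(num):
    print "I am number " + str(num)

threads = []
for i in range(10):
    t = threading.Thread(target=thread_test, args=(i,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()          # blocks until this thread has finished

print "All threads done"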

DHCP Snooping – Filter those broadcasts!

I had a specific requirement recently and wanted to test its behaviour. The feature in question is DHCP snooping. Let's quickly go over the DHCP process at a high level to see how it works:

DHCP

Let’s take the following simple diagram to show what’s going on. We have a switch with two hosts connected. We also have a DHCP server. I’m using generic names as I’ll be testing this on different switches. Assume all devices are in the same vlan.
[Diagram: Host1, Host2 and a DHCP server connected to the same switch]

Host1 has just booted and needs an IP address. It'll send a DHCPDISCOVER packet, which is a broadcast. This broadcast gets sent to all ports in the vlan:
[Diagram: the DHCPDISCOVER broadcast flooded out of all ports]

The DHCP server will then send a DHCPOFFER to Host1. It does this via unicast, with Host1's MAC as the layer 2 destination:
[Diagram: the DHCPOFFER unicast back to Host1]

Host1 then sends a DHCPREQUEST via broadcast. Why broadcast? Because it may have received offers from multiple DHCP servers, and it is essentially telling all of them which one's offer it is accepting.
[Diagram: the DHCPREQUEST broadcast]

Finally, the DHCP server acknowledges that Host1 has accepted its offered IP with a DHCPACK. This is unicast again:
[Diagram: the DHCPACK unicast to Host1]

Now, depending on BOOTP options, the offer and/or the ack might actually be broadcast. The behaviour is also slightly different when DHCP helpers are in use, but we are mainly concerned with the DHCPDISCOVER and DHCPREQUEST packets, which are always broadcast.

DHCP Snooping

In the above example, there was nothing stopping Host2 from handing out IP addresses via DHCP itself. This might be malicious activity, or simply a mistake: a misconfigured device, or a device plugged in where it should not be.

DHCP snooping was created to prevent this from happening. Its main concern is making sure that DHCPOFFERs only come in via trusted ports. In our example, port 1, connected to the DHCP server, should be a trusted port. Ports 2 and 3, connected to Host1 and Host2 respectively, should never see DHCPOFFER packets on ingress. But here is the kicker: a DHCPOFFER is a response to an event, that event is a DHCPDISCOVER, and that DHCPDISCOVER is a broadcast.

It stands to reason that if a DHCPOFFER can never ingress ports 2 and 3, those ports should never have DHCPDISCOVER packets replicated to them in the first place, even though those packets are broadcast. All other broadcasts should go through, but these specific DHCP ones should not.

So is this what we actually see in the real world? I'll test it on the devices I have available and see how each behaves.
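As an aside, if you don't have spare hosts handy to generate DHCP traffic, a packet library such as Scapy can craft a DHCPDISCOVER on demand. A rough sketch (the interface name eth1 is an assumption, and it needs root):

#!/usr/bin/python
# Sketch: craft and send a DHCPDISCOVER with Scapy (run as root).
# The interface name eth1 is an assumption; adjust to suit.

from scapy.all import Ether, IP, UDP, BOOTP, DHCP, get_if_hwaddr, sendp

iface = "eth1"
mac = get_if_hwaddr(iface)

discover = (Ether(src=mac, dst="ff:ff:ff:ff:ff:ff") /
            IP(src="0.0.0.0", dst="255.255.255.255") /
            UDP(sport=68, dport=67) /
            BOOTP(chaddr=mac.replace(":", "").decode("hex")) /
            DHCP(options=[("message-type", "discover"), "end"]))

sendp(discover, iface=iface)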

Cisco Catalyst IOS

My config is as follows:

ip dhcp snooping vlan 1-4094                                                                           
ip dhcp snooping 

interface FastEthernet0/1                                                                              
 switchport access vlan 10                                                                             
 switchport mode access                                                                                
!                                                                                                      
interface FastEthernet0/2                                                                              
 switchport access vlan 10                                                                             
 switchport mode access        
!
interface FastEthernet0/24                                                                             
 switchport access vlan 10                                                                             
 switchport mode access                                                                                
 ip dhcp snooping trust     

DHCP snooping is enabled, with fa0/24 as the trusted port going towards my DHCP server.

I have host1 and host2 connected with the following MAC addresses:

  • 78:2b:cb:e4:e3:88
  • 00:26:5a:ef:85:33

I’ll now listen on fa0/24. I should see both DHCPDISCOVER broadcasts coming though:

$ sudo tcpdump -i eth1 -n port 67 and port 68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:45:45.204815 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 78:2b:cb:e4:e3:88, length 300
11:45:48.733826 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:26:5a:ef:85:33, length 300

That’s exactly what I see.

If I now move the capture point over to fa0/2, I hope to see no broadcasts at all, which would mean the device is not replicating those broadcasts out of untrusted ports:

$ sudo tcpdump -i eth1 -n port 67 and port 68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes

Silence. That’s just what I wanted to see.

Juniper EX

Config is as follows:

root@ex2200-12> show configuration interfaces ge-0/0/2           
unit 0 {
    family ethernet-switching {
        port-mode access;
        vlan {
            members vlan_test;
        }
    }
}

{master:0}
root@ex2200-12> show configuration interfaces ge-0/0/3           
unit 0 {
    family ethernet-switching {
        port-mode access;
        vlan {
            members vlan_test;
        }
    }
}

{master:0}
root@ex2200-12> show configuration interfaces ge-0/0/4           
unit 0 {
    family ethernet-switching {
        port-mode access;
        vlan {
            members vlan_test;
        }
    }
}


root@ex2200-12> show configuration ethernet-switching-options    
secure-access-port {
    interface ge-0/0/4.0 {
        dhcp-trusted;
    }
    vlan all {
        examine-dhcp;
    }
}

ge-0/0/4 is now my trusted DHCP server port. If I listen on that port, I should see both devices' broadcasts:

$ sudo tcpdump -i eth1 -n port 67 and port 68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes

11:58:02.539119 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 78:2b:cb:e4:e3:88, length 300
11:58:05.809947 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:26:5a:ef:85:33, length 300

What about when listening on the untrusted port?

$ sudo tcpdump -i eth1 -n port 67 and port 68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:58:55.342651 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 78:2b:cb:e4:e3:88, length 300

The broadcast comes through. There is only one MAC address here because I had to disconnect one of the hosts in order to capture on that port.

Conclusions

  • IOS switches filter the initial DHCPDISCOVER broadcast packets. Junos switches do not.
  • Both devices DO drop DHCPOFFER packets coming in on untrusted ports.
  • Cisco is a bit more intelligent in its behaviour.

Not filtering the initial broadcast doesn't break DHCP snooping, but it is completely unnecessary. Why send a request out of a port on which you would filter the reply? I've seen a switch reload and suddenly every device on it tries to get its IP back; every device then receives all of those broadcasts when only the trusted port should. Filtering means fewer broadcasts on the network, and it also prevents badly configured devices from replying to packets they should never have received.