Monday, May 14, 2012

Playing with POSIX pipes in Python

Recently I was faced with an external program I wanted to call from my script that writes its output only to a file, not to stdout. Since I had to call this program many times in parallel, I decided to fake its output files with POSIX FIFOs (named pipes).
Unfortunately, Python's API for FIFOs stays pretty close to the POSIX API, so it feels a bit un-Pythonic. This post illustrates my approach to getting around this limitation.

Workload

In order to simulate my workload, I came up with the following simple script called pipetest.py that takes an output file name and then writes some text into that file.

#!/usr/bin/env python

import sys

def main():
    pipename = sys.argv[1]
    with open(pipename, 'w') as p:
        p.write("Ceci n'est pas une pipe!\n")

if __name__ == "__main__":
    main()

The Code

In my test, this "file" will be a FIFO created by my wrapper code. The implementation of the wrapper code is as follows. I will go over it in detail further down in this post:

#!/usr/bin/env python

import tempfile
import os
from os import path
import shutil
import subprocess

class TemporaryPipe(object):
    def __init__(self, pipename="pipe"):
        self.pipename = pipename
        self.tempdir = None

    def __enter__(self):
        self.tempdir = tempfile.mkdtemp()
        pipe_path = path.join(self.tempdir, self.pipename)
        os.mkfifo(pipe_path)
        return pipe_path

    def __exit__(self, type, value, traceback):
        if self.tempdir is not None:
            shutil.rmtree(self.tempdir)

def call_helper():
    with TemporaryPipe() as p:
        script = "./pipetest.py"
        subprocess.Popen(script + " " + p, shell=True)
        with open(p, 'r') as r:
            text = r.read()
        return text.strip()

def main():
    call_helper()

if __name__ == "__main__":
    main()

Code in Detail

So let's look at the code in more detail. It relies on a handful of modules from the Python standard library and works with Python 2.6 and up.

tempfile is used to get a temporary directory for me to create the FIFO in.
os has the os.mkfifo() call.
os.path handles the path crunching required.
shutil is used to remove the temporary directory after use.
subprocess is used to run the workload script.

TemporaryPipe class

Next comes the nifty part, a context manager object handling the creation and removal of the temporary FIFO pipe. Let's look at the class in detail.

class TemporaryPipe(object):
    def __init__(self, pipename="pipe"):
        self.pipename = pipename
        self.tempdir = None

The class definition and the constructor don't really hide anything interesting, though it's worth noting that self.tempdir is set to None. That will make the clean-up easier further down.

__enter__

    def __enter__(self):
        self.tempdir = tempfile.mkdtemp()
        pipe_path = path.join(self.tempdir, self.pipename)
        os.mkfifo(pipe_path)
        return pipe_path

The __enter__(self) function is the set-up code for the context manager. Here, a temporary directory is created. Afterwards, os.mkfifo() creates the FIFO. Finally, the pipe's path is returned.

__exit__

    def __exit__(self, type, value, traceback):
        if self.tempdir is not None:
            shutil.rmtree(self.tempdir)

The __exit__(self, type, value, traceback) function is called whenever the context manager's block is exited, whether normally or through an exception. Thus, it's the ideal place to run the clean-up, in our case removing the temporary directory and the pipe contained within it. shutil.rmtree() takes care of this just fine. If mkdtemp() failed, there is nothing to clean up: self.tempdir is still None, and Python doesn't call __exit__() at all when __enter__() raises. Our clean-up doesn't require any extra knowledge of the things we're cleaning up, so we're free to ignore the type, value and traceback parameters.
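One corner case the class above leaves open: if os.mkfifo() fails after mkdtemp() has succeeded, the temporary directory stays behind. A minimal sketch of a variant that covers this, written as a generator with contextlib.contextmanager. The name temporary_pipe is made up for this example and is not part of the original code:

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temporary_pipe(pipename="pipe"):
    """Yield the path of a FIFO inside a fresh temporary directory."""
    tempdir = tempfile.mkdtemp()
    try:
        pipe_path = os.path.join(tempdir, pipename)
        os.mkfifo(pipe_path)
        yield pipe_path
    finally:
        # Runs even if os.mkfifo() or the body of the with block raised.
        shutil.rmtree(tempdir)
```

It is used the same way as the class: with temporary_pipe() as p: ...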

The call_helper Function

def call_helper():
    with TemporaryPipe() as p:
        script = "./pipetest.py"
        subprocess.Popen(script + " " + p, shell=True)
        with open(p, 'r') as r:
            text = r.read()
        return text.strip()

Because TemporaryPipe is a context manager, it's usable in a with statement. This means that inside the with TemporaryPipe() as p block, there is a temporary directory containing a FIFO pipe. Because __enter__() returns the pipe's path, that path is assigned to p within the block.
subprocess.Popen() is now used to start the workload script, going via a shell (shell=True). Strictly speaking the shell isn't needed to evaluate the hashbang (#!) line, since the kernel does that for any executable script, and it probably isn't the smartest idea performance-wise, but this is proof-of-concept code after all.
After the workload script has been started (Popen() returns immediately and does not wait for it to finish), another with statement opens the FIFO for reading using the pipe's path. This open() blocks until the script opens the FIFO for writing. The text is read and the trailing newline stripped. Finally, the return statement returns the text and, by leaving the outer with block, causes TemporaryPipe's __exit__() to clean up.
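As a rough sketch of how the same call could look without the shell, and with the child process reaped afterwards: the command is passed as an argument list, and child.wait() is called once the FIFO has been read. To keep the example runnable on its own, the WRITER string stands in for pipetest.py and the FIFO handling is inlined; both, along with the name call_helper_no_shell, are assumptions made for this example.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Stand-in for pipetest.py so the sketch runs on its own (an assumption).
WRITER = """
import sys
with open(sys.argv[1], 'w') as p:
    p.write("Ceci n'est pas une pipe!\\n")
"""


def call_helper_no_shell():
    tempdir = tempfile.mkdtemp()
    try:
        pipe_path = os.path.join(tempdir, "pipe")
        os.mkfifo(pipe_path)
        # Argument list instead of a command string: no shell involved.
        child = subprocess.Popen([sys.executable, "-c", WRITER, pipe_path])
        # open() blocks until the child has opened the FIFO for writing.
        with open(pipe_path, "r") as r:
            text = r.read()
        # Reap the child so it does not linger as a zombie.
        child.wait()
        return text.strip()
    finally:
        shutil.rmtree(tempdir)
```

One thing to keep in mind: if the child dies before it opens the FIFO, the open() in the parent blocks forever, so real code would probably want some kind of timeout.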

Conclusions

I'm pretty content with the way the call_helper() function reads. The complexity of setting up and then cleaning up the FIFO is hidden away in the TemporaryPipe class. I spent a bit of time coming up with this, so I thought I'd share this solution with other people. Now I just need to add this to my utility library and write tests for it.
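Since the original motivation was calling the program many times in parallel, here is a minimal sketch of how that might look. It assumes Python 3 (concurrent.futures is not in the 2.6 standard library). WRITER is a hypothetical stand-in for the external program, and run_one and run_many are names made up for this example.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the external program:
# writes its second argument to the file named by its first argument.
WRITER = """
import sys
with open(sys.argv[1], 'w') as out:
    out.write(sys.argv[2] + "\\n")
"""


def run_one(message):
    """Run one child against its own temporary FIFO and return its output."""
    tempdir = tempfile.mkdtemp()
    try:
        pipe_path = os.path.join(tempdir, "pipe")
        os.mkfifo(pipe_path)
        child = subprocess.Popen(
            [sys.executable, "-c", WRITER, pipe_path, message])
        with open(pipe_path, "r") as r:
            text = r.read()
        child.wait()
        return text.strip()
    finally:
        shutil.rmtree(tempdir)


def run_many(messages, workers=4):
    # Threads are enough here: each one mostly blocks on its FIFO and child.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() returns results in the order of the inputs.
        return list(pool.map(run_one, messages))
```

Each call gets its own temporary directory, so the FIFOs of parallel runs can't collide even though they all share the file name "pipe".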

Saturday, March 31, 2012

Samba4 DNS sprint, day 5 summary

Another long and only partially successful day is behind me, and my allocated time for this sprint is over. I said "partially successful", because I did not manage to get GSS-TSIG working. This is mostly due to the fact that I don't understand how to hook it up to GENSEC/gss on the Samba side. The API is a bit confusing to the uninitiated. What I did get done was to get to a point where incoming TKEY messages are parsed and checked, and pretty much handled correctly. We currently bail out of there with a BADKEY error, pretending the client's key didn't work. If someone with a reasonable grasp of GENSEC would explain what I need to do there to get the GSSAPI blob from the client authenticated, I would expect GSS-TSIG is very, very close.

Because it's the end of the week, let me take a look at the high and low points of this sprint:

High point: On Tuesday morning, I finally got forwarding sorted out. Ever since Tuesday, all DNS requests on my dev machine were handled by my local samba server.
Low point: I wasted most of Tuesday trying to debug my HMAC-MD5 signing code. Debugging crypto is hard, because the only debug tool available is "stare at the code and think very hard". This might be the weapon of choice of the kernel community, but certainly not my preferred way of doing things.
High point: On Wednesday morning, I managed to fix signing of TSIG requests.
Low point: This got me to work on TSIG some more instead of moving on to GSS-TSIG, and ultimately failed because signing of TSIG replies doesn't work correctly yet. Another day wasted.
Low point: After reading up on TKEY and GSS-TSIG, I realized that I didn't really understand what I had to do in Samba to get this sorted out. This ended up being a major stumbling block; in fact, I'm still stuck there.
High point: While trying to find a useful test for TKEY, I set up a Win7 client for my domain, and after a tiny fix to get PTR records handled in the update code, that machine would correctly register forward and reverse zones (without crypto, but also without complaining), and was perfectly happy using Samba's DNS service for its needs.

So to sum up, forwarding turned out to be a neater feature than I initially expected it to be, and allows me to run samba as my main name server for the local network. On the negative side, all that fancy crypto stuff isn't working yet. I do feel that none of these is really far off anymore. Maybe another pair or two of eyes would help there. I've updated the Samba Wiki DNS page to reflect the current status.

Friday, March 30, 2012

Samba DNS sprint, day 4 summary

I'm still a bit stuck with TKEY/TSIG, unfortunately. While looking at the GSS-TSIG implementation we have in libaddns, I realized that I could simplify my time handling. That ended up fixing my TSIG issues from yesterday. That is, I can now correctly generate the client/request side of an HMAC-MD5 TSIG. The server side still seems broken: I can't get dig to accept my reply signature, and if I query BIND, the server reply differs from what I would calculate for it. Oh well.
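For orientation, here is a rough Python sketch of which bytes go into the MAC according to a reading of RFC 2845, section 3.4. It is not the Samba code. The function names and the fudge default of 300 seconds are assumptions for illustration, and message is assumed to already be the DNS message without the TSIG record (original ID, ARCOUNT not counting the TSIG RR). The point worth noting is that the request and the reply are digested differently: a reply prepends the request's MAC, preceded by its 16-bit length.

```python
import hashlib
import hmac
import struct


def encode_name(name):
    """Encode a domain name in lower-cased, uncompressed DNS wire format."""
    wire = b""
    for label in name.lower().split("."):
        if label:  # skip the empty label after a trailing dot
            raw = label.encode("ascii")
            wire += struct.pack("!B", len(raw)) + raw
    return wire + b"\x00"


def tsig_variables(key_name, time_signed, fudge=300, error=0, other=b""):
    """The TSIG variables that are digested after the message itself."""
    return (
        encode_name(key_name)
        + struct.pack("!HI", 255, 0)  # CLASS ANY, TTL 0
        + encode_name("hmac-md5.sig-alg.reg.int")  # algorithm name
        # Time signed is a 48-bit value: upper 16 bits, then lower 32 bits.
        + struct.pack("!HI", time_signed >> 32, time_signed & 0xFFFFFFFF)
        + struct.pack("!HHH", fudge, error, len(other))
        + other
    )


def tsig_mac(key, message, key_name, time_signed, request_mac=None):
    """HMAC-MD5 over [request MAC +] message + TSIG variables."""
    data = b""
    if request_mac is not None:
        # Replies only: the request MAC with its 16-bit length in front.
        data += struct.pack("!H", len(request_mac)) + request_mac
    data += message + tsig_variables(key_name, time_signed)
    return hmac.new(key, data, hashlib.md5).digest()
```

A request would be signed with tsig_mac(key, query, name, now), and the matching reply with tsig_mac(key, reply, name, now, request_mac=query_mac).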

I've looked at plain TKEY, but for now it doesn't really seem worth the effort. So I've decided to work on GSS-TSIG directly instead. I don't really know how to deal with the GENSEC side of this, though, so it's a bit hard to keep the momentum going. I'm beginning to fear that I won't get this implemented this week. Not because any part of it is particularly hard, but because there are tons of little things that each take a couple of minutes. And of course sitting in front of the computer alone, lone-ranger style, isn't the most fun way to develop software.

For tomorrow, I hope to get a bit more done than today. I'll be working on a little GSS-TSIG test utility based on libaddns that I can use to test my server implementation. That should at least allow me to figure out what's going on at specific steps. I still might need some help on the GENSEC side.

Thursday, March 29, 2012

Samba DNS sprint, day 3 summary

Some progress on the TSIG front, but I'm stuck with the exact signing method for a packet. For some reason dig and I disagree on what the HMAC-MD5 of a specific query should be. The RFC is a bit vague, and the BIND code for that area seems to be in assembler. (Ok, it's C, but their coding conventions differ so much from ours that I would probably have to spend a week getting my brain to adjust to that.)

So I'm not continuing on hmac-md5 support, but will instead look at GSS-TSIG directly today. That's the must-have feature, and the whole week would be wasted if I didn't get that in.

TL;DR: HMAC-MD5-TSIG stupid, working on GSS-TSIG now.

Tuesday, March 27, 2012

Samba4 DNS sprint, day 2 summary

I actually spent my time working out some smaller kinks in the DNS server that I ran into while using it as the only DNS server on my development machine. I also started restructuring my DNS processing code a bit so I can handle TSIGs in a sensible way. I've got dig set up to send TSIGs with an all-0 HMAC key, so for tomorrow I should be ready to go.

Oh, and I pushed my DNS forwarder work to master, and it passed autobuild. Life is good.

Samba4 DNS sprint, day 2

Ok, so I cheated a bit and kept poking at the DNS forwarder code some more yesterday after posting my summary. I didn't quite get anywhere final before I went to bed, but this morning, while waiting for my coffee to run through the machine, I got this thing set up. I can now forward requests the internal server doesn't feel responsible for to another DNS server and get the reply back to the client. :) It's not quite production-ready code, but it sure works well enough to switch the DNS settings on my development machine to use Samba DNS.

That makes today TSIG-day. Time to re-read RFC2845 and see if I can get this implemented in my test client.

Monday, March 26, 2012

Samba4 DNS sprint, day 1 summary

Ok, of course this didn't go as planned. It took longer than expected to figure out how best to test my DNS library, which by itself seems to work ok, but is also only a thin wrapper around tdgram, so it doesn't do anything fancy yet.

I played with getting some code into the server, but I think I'm not quite doing the right thing there yet. I've set myself a deadline of 11:00 tomorrow. If I haven't got it by then, I'm back to TSIG et al.

All in all, I notice that with all the python programming I've been doing recently, my C-fu has rusted a bit. I hope today will prove to be the WD-40 I needed to get going again. :)

Oh well, enough for today, more Samba DNS work will come tomorrow.

Samba4 DNS sprint, day 1

Samba has its own small DNS server built in, but it's still lacking a couple of very nice-to-have features. This week, I'll be trying to get as many of those in as possible. There are two big parts here. One is getting forwarder support, so we can query other name servers on behalf of our clients. The other big item is getting signed updates to work so Windows clients can sign their dynamic update requests. My battle plan for this week is:

Have a quick stab at a really simple forwarder library, but fall back to running dnsmasq with forwarding set up if I don't get anywhere by early afternoon today
Implement shared secret TSIG updates, to get the TSIG logic sorted out
Implement TKEY exchanges as specified in RFC2930, to set up the TKEY handling infrastructure
Make GSS-TSIG work as a possible signing method, so Windows is happy finally
More work on the forwarder library if needed/I have the time

Let's see how far I'll get. I'll post another update in the evening with what I accomplished today.

Friday, March 16, 2012

Running Samba's autobuild.py

Samba has a lot of tests, and we like to run them often. In order to easily do that, we've got a script that checks out a bunch of repositories and runs all tests in them, in parallel and independent of each other. It lives in the source tree at script/autobuild.py.

Here are my notes for running autobuild.py on a local machine. First, set up an in-memory file system. autobuild.py and the tests run by it touch a lot of files, so keeping them off a spinning disk speeds things up a lot.

# create the memdisk location
mkdir /memdisk

# default size is half your ram, use -o size=SIZE
# to change that if needed
mount -t tmpfs tmpfs /memdisk

# now create an image file, samba's tests don't like plain tmpfs
# Needs to be bigger than 3 gig
dd if=/dev/zero of=/memdisk/build.img bs=1MiB count=4000
losetup /dev/loop0 /memdisk/build.img


# format as ext2, no need to do journalling
# it's gone when the machine fails anyway
mkfs.ext2 /dev/loop0

# mount
mkdir /memdisk/kai
mount /dev/loop0 /memdisk/kai
chown -R kai:kai /memdisk/kai

And now, I can just run ./script/autobuild.py and get a coffee while all the tests are run.