Monday, March 29, 2010

Route all traffic from certain programs through a PPTP VPN tunnel on FreeBSD

The problem

You’d like to route certain traffic over a VPN tunnel, but you only know which program is generating the traffic, not which ports it will be using or which IP ranges it will be connecting to. The only way to crack this nut seems to be to set the default route to the VPN tunnel, but this will send all traffic through the tunnel, which may be a problem if the tunnel is slow and laggy, e.g. if the endpoint happens to be the IPREDator service in Sweden. The common solution seems to be “run the program in a virtual machine,” but that seems like overkill. What to do?

The solution

If you’re on FreeBSD, you’re in luck: since 7.1, FreeBSD has supported multiple routing tables (FIBs). You can set up a secondary routing table that goes through the VPN tunnel by default, and bind any program you wish to that specific routing table.

Kernel support

Your kernel must be rebuilt with support for at least 2 FIBs; the default kernel only has support for one. Add

options    ROUTETABLES=2

to your kernel configuration and rebuild.

MPD configuration

The PPP client of choice is the netgraph-based Multilink PPP Daemon (mpd5 in ports). The relevant section of mpd.conf looks like this:

pptp_client:
        create bundle static B1
        set iface up-script /usr/local/etc/mpd5/pptp-up.sh
        set iface down-script /usr/local/etc/mpd5/pptp-down.sh
        set ipcp ranges 0.0.0.0/0 0.0.0.0/0

        set bundle enable compression
        set ccp yes mppc
        set mppc yes e40
        set mppc yes e128
        set bundle enable crypt-reqd
        set mppc yes stateless

        create link static L1 pptp
        set link action bundle B1
        set auth authname ***********
        set auth password ***********
        set link max-redial 0
        set link mtu 1460
        set link keep-alive 20 75
        set pptp peer vpn.ipredator.se
        open

Note the lack of a route command. The IPREDator folks have set up their VPN service in a slightly problematic way: the endpoint of the tunnel is also the remote router. The best route to this network node is therefore the direct link over the PPP tunnel, and the kernel will attempt to send the PPP packets over the tunnel itself, like a truck trying to drive up its own tailpipe. Any time this happens, MPD will kill the link. To avoid this condition, you’ll have to set up the routes manually in the up- and down-scripts, which MPD runs whenever the link is brought up or torn down:

interface=$1
proto=$2
tun_ip=$3
tun_endpoint=$4

tun_iface=ng0
eth_iface=em0

eth_gateway=`route get default -iface $eth_iface | awk '/gateway/ { print $2}'`

route delete $tun_endpoint
route add $tun_endpoint $eth_gateway

# make this the default route for the secondary FIB
setfib -1 route flush
setfib -1 route add $tun_endpoint $eth_gateway
setfib -1 route add default $tun_endpoint

The trick is to delete the direct link and force traffic to the tunnel endpoint to run over the physical ethernet interface. The last part of the script does the same trick with the secondary FIB and then adds a default route pointing to the tunnel.
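
To see why the host route matters, here is a toy model of the two routing tables in Python 3. It is only a sketch: the addresses are documentation placeholders, `lookup` and `route` are hypothetical helpers, and a real kernel FIB holds more than destination and next hop. The most specific matching route wins, which is exactly what sends tunnel packets into the tunnel when the endpoint's host route points at ng0.

```python
import ipaddress

def route(network, next_hop):
    """Build one routing table entry (hypothetical helper)."""
    return (ipaddress.ip_network(network), next_hop)

def lookup(table, destination):
    """Return the next hop of the most specific route matching destination."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if dst in net]
    if not matches:
        return None  # no route at all: traffic goes nowhere
    # Longest prefix wins, as in a real routing table.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Placeholder addresses, for illustration only.
ETH_GATEWAY = "192.0.2.1"      # router on the physical LAN
TUN_ENDPOINT = "203.0.113.10"  # stand-in for the VPN server
SOMEWHERE = "198.51.100.7"     # any other host on the internet

# What you get without the up-script: the endpoint is reached "directly"
# over ng0, so the tunnel's own packets are routed into the tunnel.
naive = [route("0.0.0.0/0", ETH_GATEWAY),
         route(TUN_ENDPOINT + "/32", "ng0")]

# What the up-script sets up.
fibs = {
    0: [route("0.0.0.0/0", ETH_GATEWAY),
        route(TUN_ENDPOINT + "/32", ETH_GATEWAY)],
    1: [route(TUN_ENDPOINT + "/32", ETH_GATEWAY),
        route("0.0.0.0/0", TUN_ENDPOINT)],
}

if __name__ == "__main__":
    print("naive, to endpoint:", lookup(naive, TUN_ENDPOINT))
    print("FIB 0, to endpoint:", lookup(fibs[0], TUN_ENDPOINT))
    print("FIB 1, to endpoint:", lookup(fibs[1], TUN_ENDPOINT))
    print("FIB 1, elsewhere:  ", lookup(fibs[1], SOMEWHERE))
```

In this model, ordinary traffic in FIB 0 still leaves via the LAN gateway, while everything in FIB 1 except the tunnel's own packets heads for the tunnel endpoint. An empty table returns None, which matches a flushed secondary FIB where traffic simply stops.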

In the down script, you just have to delete the route to the tunnel endpoint. I also flush the routing table in the secondary FIB, since I want traffic to halt completely if the tunnel isn’t available.

interface=$1
proto=$2
tun_ip=$3
tun_endpoint=$4

eth_iface=em0

route delete $tun_endpoint

# kill all routing on the secondary FIB
setfib -1 route flush

Use

You can run a program in the tunneled environment via setfib -1 $cmd. Hooray!
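
If you launch programs from a script rather than a shell, the same thing can be done from Python 3. This is a minimal sketch: `fib_command` and `run_in_fib` are hypothetical helper names, and it uses the plain numeric form `setfib 1`, which as far as I know behaves the same as the `setfib -1` spelling used above.

```python
import subprocess

def fib_command(fib, argv):
    """Prefix argv with setfib so the program uses routing table `fib`."""
    if fib < 0:
        raise ValueError("FIB number must be non-negative")
    return ["setfib", str(fib)] + list(argv)

def run_in_fib(fib, argv):
    """Run argv under the given FIB and return its exit status.

    Only works on a FreeBSD system whose kernel has enough FIBs.
    """
    return subprocess.call(fib_command(fib, argv))

if __name__ == "__main__":
    # Only print the command line here; nothing is executed.
    print(" ".join(fib_command(1, ["fetch", "http://example.org/"])))
```

Calling `run_in_fib(1, ["fetch", "http://example.org/"])` on the FreeBSD box would then send that one download through the tunnel.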

Sunday, March 28, 2010

Rebroadcast RealAudio streams as mp3 via HTTP

The problem

iTunes + Remote app for iPod touch + AirPort Express = best kitchen radio ever. Most of my favorite public radio stations offer MP3 streams over HTTP. My wife, however, has a soft spot in her heart for certain stations that insist on broadcasting RealAudio over RTSP. How to get these to stream to the AirPort Express in the kitchen?

The solution

The ingredients:

  • Mplayer built with support for RealAudio/COOK (an exercise for the reader)
  • LAME mp3 encoder
  • Python!

The rudimentary solution

Mplayer, when built properly, can stream and decode RealAudio over RTSP. In the simplest form, this looks something like:

url="rtsp://stream2.rbb-online.de/encoder/antenne-live.ra"
mplayer -cache 48 -vo null -vc null -ao pcm:waveheader:file=foo.wav "${url}"

This is nice, but ideally we’d like to transcode a stream on the fly. Named pipes to the rescue!

url="rtsp://stream2.rbb-online.de/encoder/antenne-live.ra"
mkfifo dump.pipe
mkfifo transcode.pipe
mplayer -cache 48 -vo null -vc null -ao pcm:waveheader:file=dump.pipe "${url}" &
lame -r dump.pipe transcode.pipe

The transcoded MP3 can be read from transcode.pipe. Now, to wrap the whole kit and caboodle up in Python.
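
If named pipes are new to you, the following Python 3 sketch shows the mechanism on its own, without mplayer or lame. A writer thread stands in for the producer and the main thread for the consumer; `relay_through_fifo` is a hypothetical name and the payload is just filler bytes.

```python
import os
import shutil
import tempfile
import threading

def relay_through_fifo(payload):
    """Write payload into a named pipe from one thread and read it back."""
    tmpdir = tempfile.mkdtemp()
    fifo = os.path.join(tmpdir, "demo.pipe")
    os.mkfifo(fifo)

    def producer():
        # Opening for writing blocks until a reader opens the other end.
        with open(fifo, "wb") as out:
            out.write(payload)

    writer = threading.Thread(target=producer)
    writer.start()
    try:
        with open(fifo, "rb") as src:
            # read() returns once the writer has closed its end (EOF).
            data = src.read()
    finally:
        writer.join()
        shutil.rmtree(tmpdir)
    return data

if __name__ == "__main__":
    echoed = relay_through_fifo(b"pretend this is PCM audio\n" * 100)
    print(len(echoed), "bytes made it through the pipe")
```

Neither side ever touches a regular file: the data only exists in the kernel's pipe buffer, which is why a stream of unbounded length can pass through.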

The full solution

This all gets wrapped up in a Python HTTP request handler based on SimpleHTTPRequestHandler. The handler forks instances of mplayer and lame and creates the named pipes on the fly. When the client closes the stream, the child processes are cleaned up automatically.

from subprocess import *
from SimpleHTTPServer import SimpleHTTPRequestHandler
from BaseHTTPServer import HTTPServer
import os,tempfile,signal,urlparse

class StreamHandler(SimpleHTTPRequestHandler):
  dumper = None
  default_url = "rtsp://stream2.rbb-online.de/encoder/antenne-live.ra"

  def parse_qs(self,query_string):
    # split on the first '=' only, so values may themselves contain '='
    values = [pair.split('=',1) for pair in query_string.split('&')]
    return dict(values)

  def do_GET(self):
    """Serve a GET request."""
    f = self.send_head(True)
    if f:
      try:
        self.copyfile(f, self.wfile)
      except Exception:
        # the client hung up mid-stream; fall through to cleanup
        pass
      finally:
        f.close()
    self.cleanup()

  def __del__(self):
    self.cleanup()

  def cleanup(self):
    if self.dumper is not None:
      self.log_message('Stream terminated, cleaning up...')
      # mplayer forks a child, so we should go all Agamemnon via SIGINT 
      os.kill(self.dumper.pid,signal.SIGINT)
      os.kill(self.transcoder.pid,signal.SIGINT)
      # reap dead children
      self.dumper.wait()
      self.transcoder.wait()
      # remove tempfiles
      os.remove(self.dump_pipe)
      os.remove(self.trans_pipe)
      os.rmdir(self.tmpdir)
      self.fnull.close()
      # unset instance variables
      self.dumper = None
      self.transcoder = None
      self.dump_pipe = None
      self.trans_pipe = None
      self.tmpdir = None

  def send_head(self,is_get=False):
    if is_get:
      f = open(self.make_stream(),'rb')
    else:
      f = open(os.devnull,'r')
    # send the headers for both GET and HEAD; note the absence of Content-Length
    self.send_response(200)
    self.send_header("Content-type", "application/octet-stream")
    self.end_headers()
    return f

  def make_stream(self):
    qs = urlparse.urlparse(self.path).query
    if len(qs) > 0:
      query = self.parse_qs(qs)
    else:
      query = dict()
    url = query.get('url',self.default_url)
    self.log_message('Making a stream for %s',url)
    self.tmpdir = tempfile.mkdtemp()
    self.dump_pipe = os.path.join(self.tmpdir,'dump.pipe')
    self.trans_pipe = os.path.join(self.tmpdir,'transcode.pipe')
    os.mkfifo(self.dump_pipe)
    os.mkfifo(self.trans_pipe)
    self.fnull = open(os.devnull, 'w')
    self.dumper = Popen(['mplayer','-cache','64','-vc','null','-vo','null','-ao','pcm:file=%s' % self.dump_pipe,url],stdout=self.fnull,stderr=self.fnull)
    self.transcoder = Popen(['lame','-r','--preset','fast','standard',self.dump_pipe,self.trans_pipe],stdout=self.fnull,stderr=self.fnull)
    return self.trans_pipe

def run():
  httpd = HTTPServer(('',8001), StreamHandler)
  sa = httpd.socket.getsockname()
  print "Serving HTTP on", sa[0], "port", sa[1], "..."
  httpd.serve_forever()

if __name__ == "__main__":
  run()

To add a transcoded radio stream, make an M3U playlist like this:

#EXTM3U
#EXTINF:0,Antenne Brandenburg Frankenstream
http://localhost:8001/?url=rtsp://stream2.rbb-online.de/encoder/antenne-live.ra

and drag it into iTunes. Voila, (fake) RealAudio/RTSP support in iTunes!
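
With more than a couple of stations, writing these playlists by hand gets old. Here is a small Python 3 sketch that produces the same three lines; `m3u_entry` is a hypothetical helper, and the host and port defaults simply mirror the server above. Like the example playlist, it passes the stream URL through unescaped, so it assumes the URL contains no `&` characters.

```python
def m3u_entry(title, stream_url, host="localhost", port=8001):
    """Return the text of a one-station M3U playlist for the rebroadcaster."""
    lines = [
        "#EXTM3U",
        "#EXTINF:0,%s" % title,
        # The stream URL is appended as-is, matching the hand-written playlist.
        "http://%s:%d/?url=%s" % (host, port, stream_url),
    ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(m3u_entry("Antenne Brandenburg Frankenstream",
                    "rtsp://stream2.rbb-online.de/encoder/antenne-live.ra"),
          end="")
```

Write the returned text to a file ending in .m3u and drag that into iTunes as before.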

Subtleties

There are a few potential stumbling blocks to be aware of:

  • When streaming unlimited data over HTTP, you should not set the Content-Length header, as this will cause the client to close the connection after reading a set number of bytes rather than streaming forever.
  • When streaming and transcoding, Mplayer forks a child process to handle some of the work. If the parent is killed with SIGINT, it kills the child before exiting; if it is killed with SIGKILL, it exits immediately and the child is left behind, so a stray Mplayer process piles up every time the stream is stopped and restarted.
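
The signal handling can be tried out in isolation. The sketch below (Python 3, with `sleep` standing in for mplayer and `stop_child` as a hypothetical helper) asks a child to exit with SIGINT and reaps it. The SIGKILL fallback after a grace period is my own addition, not something the handler above does, and it carries exactly the drawback described in the second point.

```python
import signal
import subprocess

def stop_child(proc, grace=5.0):
    """Ask proc to exit via SIGINT; fall back to SIGKILL after `grace` seconds.

    Returns the child's exit status (negative if it died from a signal).
    """
    if proc.poll() is None:
        proc.send_signal(signal.SIGINT)
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            # Last resort: the child gets no chance to tidy up after itself.
            proc.kill()
    # Always wait, so the exited child is reaped from the process table.
    return proc.wait()

if __name__ == "__main__":
    child = subprocess.Popen(["sleep", "30"])
    status = stop_child(child, grace=2.0)
    print("child exited with status", status)
```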

Saturday, December 06, 2008

Convert H264 Matroska files to MP4

Have you ever noticed how Matroska (*.mkv) support on OS X is, um, bad? Yes, Matroska is a wonderful container format in theory, but all the demuxers out there (okay, those in VLC and Perian) seem to be dog-slow. The combination of HD decoding and MKV demuxing brings my first-generation Intel mini to its knees. It seems much happier with MP4 files. Thankfully, one can extract the video and audio from MKV files and re-mux them into MP4 without transcoding. You'll need the following packages from MacPorts:

mkvtoolnix, ffmpeg, mpeg4ip

You can grab yourself a coffee or 10 while those compile. Once done, the following script will convert MKV files to MP4 with a minimum of hassle:

#!/bin/bash
# Convert an H264 Matroska file to MP4
# hacked together by Jakob van Santen, December 2008

# strip the file extension (e.g. ".mkv") to get the base name
basename=${1%.*}

info=`mkvinfo "$1"`
if echo "$info" | grep -q AVC
then
    echo 'Found H.264 track'
else
    echo "I can only handle H.264 video. Bye!"
    exit 1
fi

if [[ "$info" =~ ([0-9]{2}\.[0-9]{1,3})\ fps ]]; then
    framerate=${BASH_REMATCH[1]}
    echo "Framerate = $framerate"
else
    echo "Couldn't get the framerate..bye!"
    exit 1
fi

mkvextract tracks "$1" 1:"$basename.h264"
ffmpeg -i "$1" -vn -acodec aac -ac 2 -ab 128k "$basename.aac"

mp4creator -c "$basename.h264" -rate=$framerate "$basename.mp4"
mp4creator -c "$basename.aac" -interleave -optimize "$basename.mp4"

rm "$basename.h264" "$basename.aac"
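
The framerate check is the fiddliest part of the script, so here is the same regular expression as a Python 3 sketch that can be tried against saved mkvinfo output. `find_framerate` is a hypothetical helper, and the sample line is only my approximation of what mkvinfo prints; the exact wording differs between versions.

```python
import re

# Same pattern as the shell script: two digits, a dot, one to three digits.
FPS_PATTERN = re.compile(r"([0-9]{2}\.[0-9]{1,3}) fps")

def find_framerate(mkvinfo_output):
    """Return the first framerate found in the text, or None."""
    match = FPS_PATTERN.search(mkvinfo_output)
    return match.group(1) if match else None

# Illustrative input, not captured from a real file.
SAMPLE = """|  + Track type: video
|  + Codec ID: V_MPEG4/ISO/AVC
|  + Default duration: 41.708ms (23.976 fps for a video track)
"""

if __name__ == "__main__":
    print("Framerate =", find_framerate(SAMPLE))
```

Like the shell version, this returns the value as text rather than a number, since it is only passed on to mp4creator, and it finds nothing when the rate is printed without a decimal point.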

Sunday, September 07, 2008

flickr: Download photos from a group pool in bulk

It is sometimes useful to download all photos from a group pool in one fell swoop. Rather than clicking through all the photos by hand in a web browser, we can use the Flickr API to grab the photos quickly. Using the flickrapi python package, this looks something like the following:

# 
#  flickr_groupdump.py
#  Download photos from the Flickr group pool in bulk
#  
#  Created by Jakob van Santen on 2008-09-07.
# 
import flickrapi, os, re, urllib

# api key and secret
api_key = 'your api key'
api_secret = 'your api key secret'
flickr_username = 'your flickr (yahoo) username'

# the url of the group pool to be dumped
group_url = 'http://www.flickr.com/groups/876344@N22/pool/'

# initialize and get authentication token
flickr = flickrapi.FlickrAPI(api_key,api_secret,username=flickr_username)
(token, frob) = flickr.get_token_part_one(perms='read')
if not token: raw_input("Press ENTER after you authorized this program")
flickr.get_token_part_two((token, frob))

# look up the group
group = flickr.urls_lookupGroup(url=group_url)
group_id = group.group[0]['id']
group_name = group.group[0].groupname[0].text

# get all the photos in the pool
page = 1
pages = page+1
group_list = []
while page <= pages:
    photos = flickr.groups_pools_getPhotos(group_id=group_id,extras='date_taken,original_format',page=page)
    page = int(photos.photos[0]['page'])
    pages = int(photos.photos[0]['pages'])
    print 'Got page', page, 'of', pages
    page += 1
    photolist = photos.photos[0].photo
    group_list += [p.attrib for p in photolist]
    
# classify the photo list by user
owners = {}
for photo in group_list:
    # setdefault creates the list on first sight, so the first photo is kept too
    owners.setdefault(photo['ownername'], []).append(photo)

# for each user who uploaded photos to the pool:
for owner_name in owners.keys():
    owners[owner_name].sort(lambda x,y: cmp(x['datetaken'],y['datetaken']))
    target = owners[owner_name]
    try:
        os.makedirs(group_name + '/' + owner_name)
    except OSError:
        pass  # the directory already exists
    # dump every photo in the pool to a file
    for index,photo in enumerate(target):
        existing_fname = filter(lambda fn: re.match("^%s .*" % photo['id'],fn),os.listdir(group_name + '/' + owner_name))
        if existing_fname == []: #photo doesn't yet exist, so download it
            sizes = flickr.photos_getSizes(photo_id=photo['id'])
            biggest = sizes.sizes[0].size[-1].attrib
            url = biggest['source']
            format = re.match(".*\.(\w{3})$",url).group(1)
            fname = group_name + '/' + owner_name + '/' + '%s %s (%s).%s' % (photo['id'],photo['title'],owner_name,format)
            def reporter(block_count,block_size,total_size):
                if block_count == 0:
                    reporter.datasize = total_size
            urllib.urlretrieve(url,fname,reporter)
            print os.path.basename(fname), 'downloaded', '%.3f MB' % (reporter.datasize/2.0**20)
        else: # rename the file for giggles
            new_fname = re.sub("^(%s) .* (\(%s\))(\.\w+)$" % (photo['id'],owner_name),r'\1 %s \2\3' % photo['title'],existing_fname[0])
            os.rename(group_name + '/' + owner_name + '/' + existing_fname[0], group_name + '/' + owner_name + '/' + new_fname)
            print os.path.basename(new_fname), 'exists'

You’ll need to get your own API key from flickr and insert it at the top of the script. Next, paste in the URL of the group photo pool. On the first run, you’ll have to authorize the key to access your flickr account if you haven’t already done so.

The script looks up the group based on its URL and builds a list of all photos in the pool. Then, it classifies the photos by owner. For each photo, the script fetches the largest available size and downloads it. Photos from each user are put in a different subfolder. The file name for each photo includes the photo ID from flickr, so if the photo already exists, it is skipped. If you run the script multiple times, it will only fetch the newly-added photos from the group pool.
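
The grouping and the skip-if-present check are easy to try without a Flickr account. The following Python 3 sketch runs them on made-up photo records; `group_by_owner` and `already_downloaded` are hypothetical helpers, and the dictionaries only carry the fields the script actually reads.

```python
import re
from collections import defaultdict

def group_by_owner(photos):
    """Map each owner name to their photos, oldest first."""
    owners = defaultdict(list)
    for photo in photos:
        owners[photo["ownername"]].append(photo)
    for owned in owners.values():
        owned.sort(key=lambda p: p["datetaken"])
    return dict(owners)

def already_downloaded(photo_id, filenames):
    """Return the file names that start with the photo ID and a space."""
    prefix = re.compile(r"^%s " % re.escape(photo_id))
    return [fn for fn in filenames if prefix.match(fn)]

# Made-up records for illustration.
PHOTOS = [
    {"id": "102", "ownername": "alice", "datetaken": "2008-08-02 10:00:00"},
    {"id": "101", "ownername": "alice", "datetaken": "2008-08-01 09:30:00"},
    {"id": "200", "ownername": "bob", "datetaken": "2008-07-15 18:45:00"},
]

if __name__ == "__main__":
    for owner, owned in sorted(group_by_owner(PHOTOS).items()):
        print(owner, [p["id"] for p in owned])
    print(already_downloaded("101", ["101 Sunset (alice).jpg", "1010 Other (alice).jpg"]))
```

The trailing space in the pattern matters: without it, photo 101 would also match a file belonging to photo 1010.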

The photo-gathering section can easily be modified to download photos from a particular user or set instead of a group.