Saturday, December 06, 2008

Convert H264 Matroska files to MP4

Have you ever noticed how Matroska (*.mkv) support on OS X is, um, bad? Yes, Matroska is a wonderful container format in theory, but all the demuxers out there (okay, those in VLC and Perian) seem to be dog-slow. The combination of HD decoding and MKV demuxing brings my first-generation Intel mini to its knees. It seems much happier with MP4 files. Thankfully, one can extract the video and audio from MKV files and re-mux them into MP4 without transcoding. You'll need the following packages from MacPorts:

mkvtoolnix, ffmpeg, mpeg4ip

You can grab yourself a coffee or 10 while those compile. Once done, the following script will convert MKV files to MP4 with a minimum of hassle:

#!/bin/bash
# Convert an H264 Matroska file to MP4
# hacked together by Jakob van Santen, December 2008

namelen=${#1}
let bnamelen=namelen-4
basename=${1:0:bnamelen}

info=`mkvinfo "$1"`
echo $info | grep AVC >/dev/null
if [ $? -eq 0 ];
then
    echo 'Found H.264 track'
else
    echo "I can only handle H.264 video. Bye!"
    exit 1
fi

if [[ "$info" =~ ([0-9]{2}\.[0-9]{1,3})\ fps ]]; then
    framerate=${BASH_REMATCH[1]}
    echo "Framerate = $framerate"
else
    echo "Couldn't get the framerate..bye!"
    exit 1
fi

mkvextract tracks "$1" 1:"$basename.h264"
ffmpeg -i "$1" -vn -acodec aac -ac 2 -ab 128 "$basename.aac"

mp4creator -c "$basename.h264" -rate=$framerate "$basename.mp4"
mp4creator -c "$basename.aac" -interleave -optimize "$basename.mp4"

rm "$basename.h264" "$basename.aac"
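
The frame rate check is the fiddliest part of the script, so here is a minimal Python sketch of the same pattern match, handy for trying the regular expression against your own mkvinfo output before running the whole conversion. The sample line is a made-up excerpt; real mkvinfo output varies between versions, and find_framerate is a name I invented for illustration.

```python
import re

# Same pattern as the shell script: two digits, a dot, one to three digits, then " fps".
FPS_PATTERN = re.compile(r"([0-9]{2}\.[0-9]{1,3}) fps")

def find_framerate(info_text):
    """Return the first frame rate found in mkvinfo-style text, or None."""
    match = FPS_PATTERN.search(info_text)
    return match.group(1) if match else None

# Hypothetical excerpt; the real output of mkvinfo may look different.
sample = "| + Default duration: 41.708ms (23.976 fps for a video track)"
print(find_framerate(sample))
```

If this returns None for your file, the shell script would bail out with the "Couldn't get the framerate" message as well.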

Sunday, September 07, 2008

flickr: Download photos from a group pool in bulk

It is sometimes useful to download all photos from a group pool in one fell swoop. Rather than clicking through all the photos by hand in a web browser, we can use the Flickr API to grab the photos quickly. Using the flickrapi python package, this looks something like the following:

# 
#  flickr_groupdump.py
#  Download photos from the Flickr group pool in bulk
#  
#  Created by Jakob van Santen on 2008-09-07.
# 
import flickrapi, os, re, urllib

# api key and secret
api_key = 'your api key'
api_secret = 'your api key secret'
flickr_username = 'your flickr (yahoo) username'

# the url of the group pool to be dumped
group_url = 'http://www.flickr.com/groups/876344@N22/pool/'

# initialize and get authentication token
flickr = flickrapi.FlickrAPI(api_key,api_secret,username=flickr_username)
(token, frob) = flickr.get_token_part_one(perms='read')
if not token: raw_input("Press ENTER after you authorized this program")
flickr.get_token_part_two((token, frob))

# look up the group
group = flickr.urls_lookupGroup(url=group_url)
group_id = group.group[0]['id']
group_name = group.group[0].groupname[0].text

# get all the photos in the pool
page = 1
pages = page+1
group_list = []
while page <= pages:
    photos = flickr.groups_pools_getPhotos(group_id=group_id,extras='date_taken,original_format',page=page)
    page = int(photos.photos[0]['page'])
    pages = int(photos.photos[0]['pages'])
    print 'Got page', page, 'of', pages
    page += 1
    photolist = photos.photos[0].photo
    group_list += [p.attrib for p in photolist]
    
# classify the photo list by user
owners = {}
for photo in group_list:
    o = photo['ownername']
    if owners.has_key(o):
        owners[o].append(photo)
    else:
        owners[o] = [photo]  # start the list with this photo rather than dropping it

# for each user who uploaded photos to the pool:
for owner_name in owners.keys():
    owners[owner_name].sort(lambda x,y: cmp(x['datetaken'],y['datetaken']))
    target = owners[owner_name]
    try:
        os.makedirs(group_name + '/' + owner_name)
    except OSError:
        pass  # the folder already exists
    # dump every photo in the pool to a file
    for index,photo in enumerate(target):
        existing_fname = filter(lambda fn: re.match("^%s .*" % photo['id'],fn),os.listdir(group_name + '/' + owner_name))
        if existing_fname == []: #photo doesn't yet exist, so download it
            sizes = flickr.photos_getSizes(photo_id=photo['id'])
            biggest = sizes.sizes[0].size[-1].attrib
            url = biggest['source']
            format = re.match(".*\.(\w{3})$",url).group(1)
            fname = group_name + '/' + owner_name + '/' + '%s %s (%s).%s' % (photo['id'],photo['title'],owner_name,format)
            def reporter(block_count,block_size,total_size):
                if block_count == 0:
                    reporter.datasize = total_size
            urllib.urlretrieve(url,fname,reporter)
            print os.path.basename(fname), 'downloaded', '%.3f MB' % (reporter.datasize/2.0**20)
        else: # rename the file for giggles
            new_fname = re.sub("^(%s) .* (\(%s\))(\.\w+)$" % (photo['id'],owner_name),r'\1 %s \2\3' % photo['title'],existing_fname[0])
            os.rename(group_name + '/' + owner_name + '/' + existing_fname[0], group_name + '/' + owner_name + '/' + new_fname)
            print os.path.basename(new_fname), 'exists'

You’ll need to get your own API key from flickr and insert it at the top of the script. Next, paste in the URL of the group photo pool. On the first run, you’ll have to authorize the key to access your flickr account if you haven’t already done so.

The script looks up the group based on its URL and builds a list of all photos in the pool. Then, it classifies the photos by owner. For each photo, the script fetches the largest available size and downloads it. Photos from each user are put in a different subfolder. The file name for each photo includes the photo ID from flickr, so if the photo already exists, it is skipped. If you run the script multiple times, it will only fetch the newly-added photos from the group pool.
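
The skip-if-present check boils down to a prefix test on file names. Here is a minimal sketch of that idea in modern Python, separate from the script above; already_downloaded is a hypothetical helper, and the photo IDs and file names are made up.

```python
import os
import tempfile

def already_downloaded(photo_id, folder):
    """True if some file in folder starts with '<photo_id> ' (the ID, then a space)."""
    prefix = "%s " % photo_id
    return any(name.startswith(prefix) for name in os.listdir(folder))

# Demonstration with a throwaway folder and invented photo IDs.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "12345 Sunset (alice).jpg"), "w").close()
    print(already_downloaded("12345", folder))
    print(already_downloaded("1234", folder))
```

The trailing space in the prefix matters: without it, ID 1234 would be mistaken for the already-downloaded 12345.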

The photo-gathering section can easily be modified to download photos from a particular user or set instead of a group.

Thursday, September 04, 2008

Altitude profiles with Google Earth

After a mildly strenuous, 8-day trek from Bavaria to Italy, it is desirable to take stock of one’s accomplishments by creating an altitude profile of the trip. If one does not possess a GPS track of the trip, the coordinates can be extracted using one’s excellent knowledge of the terrain and the sometimes frighteningly detailed imagery in Google Earth. Most of the time, one can actually see the footpath and trace it out without a lot of guesswork. This is less true when tracing a path over boulder fields and glaciers, but these tend to be in the minority, I hope.

(Aside: why, oh why does Yahoo Maps use pictures taken at sunrise in the dead of winter for its detail images of the Austrian Alps? Useless! This wouldn’t be a problem if it weren’t for the fact that Flickr uses Yahoo Maps for its geotagging interface. I want a D90 with a GPS logger. Now.)

So, on to the procedure. First, create a new path that traces your route. Save it as a KML file, and examine the resulting file. The path itself is contained in the <coordinates> tag as a series of longitude,latitude,altitude triples (e.g. 11.26463484323443,47.3865106283369,0). The most obvious thing to notice is that the “altitude” is always 0. Google Earth simply does not store elevation data in KML files. This is eminently fixable, as elevation data for the more densely populated areas of the world exist and are freely available, thanks to the Shuttle Radar Topography Mission. The SRTM data give us an elevation for land areas between 60 degrees north and 56 degrees south on a grid of about 3 arc seconds (90 m). Instead of downloading and parsing the data ourselves, we can query the copy hosted at GeoNames. The resulting altitudes can be inserted into the KML file, giving us complete data to work with. The idea came from the KML Altitude Filler, but it’s simple enough (and faster, really) to do it yourself. In Ruby, it would look something like this:

#!/usr/bin/env ruby
#
#  kml_altitude.rb
#  Add altitude data to coordinates in Google Earth files.
#  
#  Created by Jakob van Santen on 2008-09-04.
# 
require 'rubygems'
require 'hpricot'
require 'open-uri'

webservice = "http://ws.geonames.org/srtm3?lat=%s&lng=%s"
default_alt = "0"
null_alt = "-32768"

# magic fluff to deal with missing altitude points (happens occasionally in narrow passes and such)

def max(a,b)
  case a<=>b
    when -1 then b
    when 0 then a
    when 1 then a
  end
end

def search_for_valid_alt(arr,start_index)
  null_alt = "-32768"
  offset = 0
  hit = nil
  puts "searching for valid altitude around #{start_index}"
  while offset < max(start_index,arr.size-start_index)
    if arr[start_index-offset][2] != null_alt then
      hit = arr[start_index-offset][2]
      puts "valid alt #{hit} at #{start_index-offset}"
      break
    elsif arr[start_index+offset][2] != null_alt then
      hit = arr[start_index+offset][2]
      puts "valid alt #{hit} at #{start_index+offset}"
      break
    else
      offset += 1
      puts "offset=#{offset}"
    end
  end
  hit
end

fname = File.expand_path(ARGV[0])

kml = Hpricot.XML(open(fname))

coords = (kml/"coordinates")

coords.each do |single_coord|
  coord_arr = single_coord.inner_html.split(" ").collect {|c| c.split(",")}
  puts "Finding altitudes for #{coord_arr.size} points"
  coord_arr.collect! do |long,lat,alt|
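    # keep points that already carry an altitude (a bare `next` would turn them into nil)
    next [long,lat,alt] unless alt == default_alt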
    begin
    alt = open(webservice % [lat,long]) {|f| f.read}.chomp
    rescue Timeout::Error
      puts "timed out"
      retry
    end
    p [long,lat,alt]
    [long,lat,alt]
  end
  # deal with voids in the SRTM3 data
  if coord_arr.detect {|d| d[2] == null_alt} then
    coord_arr.each_with_index do |coord,i|
      next if coord[2] != null_alt
      coord_arr[i][2] = search_for_valid_alt(coord_arr,i)
    end
  end
  # stick the modified coordinates back into the <coordinates> element
  single_coord.inner_html = coord_arr.collect {|c| c.join(",")}.join(" ")
end

new_fname = fname.sub(/(.\w{3})$/,'-filled\1')
File.open(new_fname,'w') {|f| f.write(kml)}
puts "Wrote altitude data to #{new_fname}"

This script inserts altitudes in every <coordinates> element it finds in the KML file. Even after cleaning, there are still some voids in the SRTM3 data, which are returned as -32768. I only saw this happen once; the coordinates were in a fairly narrow pass. The search_for_valid_alt() method simply searches backward and forward along the path until it finds a valid altitude.
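
For comparison, here is a minimal Python sketch of the same backward-and-forward search, working on a plain list of altitudes. It is an illustration rather than a port of the Ruby method: the function names are mine, and the bounds check that stops the search from running off either end of the path is an addition.

```python
NULL_ALT = -32768  # the value returned for voids in the SRTM3 data

def nearest_valid_alt(alts, start):
    """Search outward from start for the closest altitude that is not a void."""
    for offset in range(1, len(alts)):
        # Look one step further back first, then one step further forward.
        for index in (start - offset, start + offset):
            # Skip indices that fall off either end of the path.
            if 0 <= index < len(alts) and alts[index] != NULL_ALT:
                return alts[index]
    return None  # every point on the path is a void

def fill_voids(alts):
    """Return a copy of alts with each void replaced by its nearest valid neighbour."""
    return [nearest_valid_alt(alts, i) if a == NULL_ALT else a
            for i, a in enumerate(alts)]

print(fill_voids([1200, NULL_ALT, NULL_ALT, 1350]))  # [1200, 1200, 1350, 1350]
```

Copying the neighbour's altitude is crude; interpolating between the valid points on either side would be smoother, but for a single void in a narrow pass it hardly matters.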

So, now we have an exact specification of each point on the globe. Now, we can simply calculate the distance between two points and plot it against the elevation. As it turns out, this is slightly more concise in Python (after figuring out how the heck PyXML works, that is):

# 
#  kml_profile.py
#  Create an altitude profile based on the coordinates in a Google Earth KML file
#  The file must contain one or more paths (in the proper order)
#  
#  Created by Jakob van Santen on 2008-09-05.
# 
import sys, os.path
import xml.dom.minidom
from numpy import *
from pylab import *
import scipy.interpolate

def cartesian(geocoords):
    long,lat,alt = geocoords
    r = alt + 6378135.0
    theta = pi*(90-lat)/180.0  # polar angle (colatitude), measured from the north pole
    phi = pi*long/180.0  # azimuth, i.e. the longitude in radians
    return [r*sin(theta)*cos(phi),r*sin(theta)*sin(phi),r*cos(theta)]

def distance(coord1,coord2):
    return sqrt(sum((array(cartesian(coord1))-array(cartesian(coord2)))**2))

def profile(coords):
    ctext = coords.firstChild.wholeText
    coord_arr = [[float(f) for f in el.split(",")] for el in ctext.split()]
    d = [distance(c,coord_arr[i]) for i,c in enumerate(coord_arr[1:len(coord_arr)])]
    linear = [0]
    def adder(x,y):
        # print x,y
        linear.append(x+y)
        return x+y

    reduce(adder, d, 0)

    z = [c[2] for c in coord_arr]

    # calculate the elevation gain using a smooth spline
    tck = scipy.interpolate.splrep(linear,z,s=3e2*len(linear))
    smoothz = scipy.interpolate.splev(linear,tck)
    up = 0
    down = 0
    for i,alt in enumerate(smoothz[1:len(smoothz)]):
        diff = alt-smoothz[i]
        if diff<0:
            down -= diff
        else:
            up += diff

    return [linear,z,smoothz,up,down]

def flatten_1(list):
    out = list[0]
    for el in list[1:]:
        out = concatenate((out,el),axis=0)  # the arrays are 1-D, so join along axis 0
    return out

fname = sys.argv[1]
kml = xml.dom.minidom.parse(fname)
coords = kml.getElementsByTagName("coordinates")    
# create an elevation profile for each <coordinates> block
results = map(profile,coords)

# concatenate altitude arrays for each path
z = flatten_1([res[1] for res in results])
smoothz = flatten_1([res[2] for res in results])
up = sum([res[3] for res in results])
down = sum([res[4] for res in results])
# create a linear distance scale for the combined path (each individual path starts from 0)
linear = results[0][0]
for subaxis in [res[0] for res in results[1:]]:
    shifted = [el+linear[-1] for el in subaxis]
    linear.extend(shifted)

# plot it   
plot(array(linear)/1000.0,z)
# plot(array(linear)/1000.0,smoothz)
grid()
title("Elevation profile for %s" % os.path.split(fname)[1])
xlabel('linear km')
ylabel('m above sea level')
figtext(0.15,0.75,'km: %.2f\nm up: %d\nm down:%d' % (linear[-1]/1000.0,int(up),int(down)))
show()

There are a lot of scipy/matplotlib/numpy dependencies in here, but that’s what I happen to have lying around. The meat of the script is in the profile() method, which takes a <coordinates> DOM element. It takes each coordinate pair, converts it from equatorial spherical coordinates (in degrees) to polar spherical coordinates (in radians), then to cartesian coordinates (about the center of the earth), then calculates the distance between the two points in meters. It then adds up the distances to create a linear distance scale and extracts the elevation from each point into a separate array. Voila, an elevation profile!
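
To make the coordinate conversion concrete, here is a minimal sketch in plain Python 3 with no numpy, so it can be tried on its own. It assumes a spherical earth with the radius used above, takes the polar angle from the latitude and the azimuth from the longitude, and measures the straight-line (chord) distance between two points. The example coordinates are made up.

```python
import math

EARTH_RADIUS_M = 6378135.0  # spherical-earth radius in meters (an approximation)

def cartesian(lon_deg, lat_deg, alt_m):
    """Convert longitude/latitude (degrees) and altitude (m) to x, y, z in meters."""
    r = EARTH_RADIUS_M + alt_m
    theta = math.radians(90.0 - lat_deg)  # polar angle, measured from the north pole
    phi = math.radians(lon_deg)           # azimuth
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def distance(p, q):
    """Straight-line distance in meters between two (lon, lat, alt) points."""
    a, b = cartesian(*p), cartesian(*q)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

# Two invented points one degree of latitude apart, both at sea level.
print(distance((11.0, 47.0, 0.0), (11.0, 48.0, 0.0)))
```

For the short hops between neighbouring path points, the chord and the arc along the surface are practically the same length, which is why the straight-line distance is good enough here.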

As an added bonus, it also calculates the elevation gain and loss over a path, using a cubic spline instead of the actual data points. When the path is traced exactly (e.g. including every curve and switchback), the elevation tends to fluctuate wildly by a few meters, resulting in up to a kilometer of “extra” elevation gain and loss over the course of a day.
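
A small sketch shows how much the jitter matters. It uses plain Python 3 and a three-point moving average as a crude stand-in for the smoothing spline (it is not the spline the script uses), and the elevation numbers are invented.

```python
def gain_and_loss(z):
    """Sum the upward and downward steps along a list of elevations."""
    up = down = 0.0
    for prev, cur in zip(z, z[1:]):
        diff = cur - prev
        if diff < 0:
            down -= diff
        else:
            up += diff
    return up, down

def moving_average(z, window=3):
    """Crude smoothing: average each point with its immediate neighbours."""
    half = window // 2
    out = []
    for i in range(len(z)):
        chunk = z[max(0, i - half):i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Made-up profile: a steady climb with a few meters of jitter on top.
noisy = [1000, 1006, 1002, 1009, 1005, 1012, 1008, 1015]
print(gain_and_loss(noisy))                  # (27.0, 12.0)
print(gain_and_loss(moving_average(noisy)))  # noticeably smaller on both counts
```

The net climb is only 15 m, yet the raw data report 27 m up and 12 m down. Stretch that over a full day of switchbacks and the phantom meters add up quickly.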

The other bits of the script just handle concatenating multiple paths, as I had my trek split up by days. Finally, we plot the final product, which will look something like this (if, for example, you hike from Mittenwald (Bavaria) to Luttach (South Tirol) via Wattens (Tirol)):

Elevation profile

Enjoy!

Monday, March 31, 2008

PDF imposition on OS X with Quartz and RubyCocoa

Since I banged my head against this for quite a bit before coming to a satisfactory solution, I figured I’d do a writeup.

Abstract:

Imposition is the process of arranging pages on a large sheet such that they appear in the proper order when cut and folded into a booklet. PDF, being a page-oriented format, lends itself particularly well to being shuffled. While a number of different pieces of software exist to do this (freeware, shareware, and as Adobe Acrobat plugins), none of them come with source code to be tinkered with. This article discusses the development of an imposition script using only the Mac OS X drawing API and the RubyCocoa bridge.

This has the advantage of a) being freely modifiable and b) working out of the box on Mac OS 10.5. It should run just fine on 10.3 and later, provided that the RubyCocoa bridge is installed. The script provides a framework for re-ordering and arranging the pages of a PDF for booklet printing. An example subclass for printing DIN A6 (landscape) booklets is provided. More complicated imposition schemes and some sort of user interface are left as an exercise to the reader.

Problem:

I want to print DIN A6-sized booklets of 8 pages each. To do this, I need to rearrange the pages of my source PDF and print them 4-up such that they appear in the proper order when the page is cut in half and folded. This is known as imposition.
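
To make the reordering concrete, here is a minimal Python sketch that computes the page order for 8-page booklets printed 4-up on both sides of a sheet. It is an illustration under my own assumptions: booklet_order is a made-up name, short documents are padded with blanks (None), and I have only checked the result against the 8-page order used in the first attempt further down.

```python
def booklet_order(page_count):
    """Page order for 8-page, 4-up booklets; None marks a blank slot."""
    pages = list(range(1, page_count + 1))
    while len(pages) % 8:
        pages.append(None)  # pad to a whole sheet (front and back)
    order = []
    while pages:
        # front of the sheet: last, first, third from last, third
        order += [pages.pop(-1), pages.pop(0), pages.pop(-2), pages.pop(1)]
        # back of the sheet: work inward from both ends of what is left
        order += [pages.pop(0), pages.pop(-1), pages.pop(0), pages.pop(-1)]
    return order

print(booklet_order(8))  # [8, 1, 6, 3, 2, 7, 4, 5]
```

Each sheet pairs pages from the two ends of the document, so that page 8 lands next to page 1 once the sheet is cut and folded.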

While there are a number of utilities that do this (CocoaBooklet and Cheap Impostor [or heck, even the pdfpages package for LaTeX] come to mind), I have waaaay too much time on my hands at the moment. Sometimes reinventing the wheel can provide a pleasant amount of mental exercise. So, I set out to script my way out of this.

Hypothesis:

Preview.app in 10.5 allows you to rearrange PDF pages via drag-and-drop. On 10.4 at least, every feature of Preview.app has a corresponding method in PDFKit. Therefore, it must be relatively easy to load up an instance of PDFDocument, shuffle the pages around, and spit it back out to a new file. Added bonus: with RubyCocoa now included in OS X by default, I can get up and running fairly quickly.

First Attempt:

It turns out that this is, in fact, relatively easy. The following code will spit out a PDF with pages in the proper order:

#!/usr/bin/env ruby

inpath = File.expand_path("~/Desktop/somefile.pdf")
outpath = File.expand_path("~/Desktop/somefile_imposed.pdf")

require 'osx/cocoa'
OSX.require_framework('Quartz')
include OSX

# create PDFDocument instance
pdf = PDFDocument.alloc.initWithURL(NSURL.fileURLWithPath(inpath))
# peel off the pages into our own array
pages = []
pdf.pageCount.times { pages << pdf.pageAtIndex(0); pdf.removePageAtIndex(0)}
# reinsert the pages in the desired order
[8,1,6,3,2,7,4,5].reverse.each {|old_page_no| pdf.insertPage_atIndex(pages[old_page_no-1],0)}
# spit out the rearranged PDF
pdf.writeToURL(NSURL.fileURLWithPath(outpath))
# open the rearranged PDF
system('open',outpath)

Printed 4-up with short-edge binding, the output will look something like this:

So far, so good. What if we want more control? The above was printed 4-up in “normal” order (rows left to right) using the OS X print dialog. What if we want a different order for some reason? Naturally, the easiest way to do this is through the print dialog. If possible, though, I wanted to do this programmatically. That way, went my logic, I could run the whole shebang in one step. As it turns out, this is a rather thorny problem, at least the way I went about it.

Second Attempt:

It seemed simple enough: rearrange the PDF as in the previous step, then run it through the printing system to create an N-up PDF. However, there’s a bit more involved to be able to draw content for the printing system, at least in Cocoa:

  1. Create an NSWindow somewhere offscreen to support the drawing operation
  2. Create an NSView subclass to display the content and attach it to the window
  3. Create a print job for the view and set the proper options (4-up, order, etc.)
  4. Run the print job and enjoy your tasty imposed PDF

The first hurdle came when I tried to drop a simple PDFView into this scheme. PDFView has a somewhat quirky pagination scheme, such that I couldn’t get NSPrintOperation printOperationWithView:Options: to work correctly. So, I put on my wheel-reinvention hat and implemented my own view based on NSImageView and NSPDFImageRep with a custom pagination scheme and all that jazz. So, that was up and running. Now, on to the print job. For the life of me, I couldn’t figure out how to set the print options to produce N-up pages. In fact, none of the options I set via NSPrintInfo seemed to stick. I began to despair. In the course of writing my NSImageView subclass, I set up a global variable ($debug=true) to switch my debugging statements on and off. In a remarkable stroke of serendipity, RubyCocoa turned this into an environment variable, and I started seeing debugging output from the cgpdftopdf CUPS filter, the component actually responsible for creating N-up PDF content.

Third Attempt:

We’ve already met the Ansatz for my third attempt. There is, unfortunately, zero documentation for the built-in CUPS filters, as they’re not intended to be used directly. Still, I set about trying to figure out how to call cgpdftopdf directly. The DEBUG environment variable helpfully spit out all the command-line arguments, and I faithfully copied them to my own script. No dice. I could get it to create new PDF files, but no amount of fiddling with the arguments got me closer to my goal. So, instead of blindly fiddling, I examined the cgpdftopdf binary in hopes of finding some hints as to which arguments it accepts. Nothing. Just a bunch of calls to functions beginning with CGPDFContext. Hmm. What if I really waste time and re-implement the portions of cgpdftopdf that create N-up content?

Fourth Attempt:

Quartz to the rescue! As it turns out, PDF drawing isn’t hard, it’s just hard in Cocoa. Luckily, RubyCocoa doesn’t just cover Cocoa/Objective-C, but also the plain-jane C portions of the Application Kit like Core Graphics (Quartz). I absolutely love using Ruby as a bridge language. Using C libraries without having to worry about typing or memory management or pointers is just happy.

Gushing aside, it turns out that a lot of the things I was trying to do towards the end of Attempt 2 are a lot easier in pure Quartz than wrapped in Cocoa. Sure, it lacks that object-oriented goodness, but layered drawing is inherently stateful and procedural anyhow. For example, instead of mucking about with offscreen windows and NSImageViews, I simply grab a CGContext for my drawing by calling context = CGPDFContextCreateWithURL(CFURLCreateWithFileSystemPath(nil,dest_path,KCFURLPOSIXPathStyle,0), page_rect, nil), where dest_path is some filesystem path and page_rect is a CGRect giving the page size. In place of a CGRect struct, I can pass an array of Numerics like [[0,0],[841.88,595.28]], which walks and quacks just like a CGRect (may duck typing be blessed). I then pass this context as the first parameter to all of my drawing functions, and Quartz builds my PDF for me. Most wonderful, however, is the utility function CGPDFPageGetDrawingTransform, which calculates the CGAffineTransform that puts a given page inside a given rectangle. If these sorts of goodies are exposed in the Cocoa drawing API, I couldn’t find them. It is of course entirely possible that they’re not, since Objective-C can mix in normal C with ease.
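
To show the kind of arithmetic that such a transform saves you, here is a rough Python sketch of fitting a page into a grid cell: scale it uniformly so it fits, then centre it. This is my own approximation for illustration, not Quartz's implementation, and both function names are invented. Rectangles use the same [[x,y],[width,height]] layout as above.

```python
def cell_rect(row, col, rows, cols, sheet_w, sheet_h):
    """Bounding rect ((x, y), (w, h)) of one cell in a rows x cols grid."""
    cell_w, cell_h = sheet_w / cols, sheet_h / rows
    return ((col * cell_w, row * cell_h), (cell_w, cell_h))

def fit_in_rect(page_w, page_h, rect):
    """Scale and centre a page inside rect, keeping its aspect ratio.

    Returns (scale, tx, ty): draw the page scaled by `scale`
    with its lower-left corner at (tx, ty).
    """
    (x, y), (w, h) = rect
    scale = min(w / page_w, h / page_h)
    tx = x + (w - page_w * scale) / 2.0
    ty = y + (h - page_h * scale) / 2.0
    return scale, tx, ty

# An 800 x 600 page placed in the upper-right cell of a 2 x 2 grid
# on a sheet of the same size (invented round numbers).
rect = cell_rect(1, 1, 2, 2, 800.0, 600.0)
print(fit_in_rect(800.0, 600.0, rect))  # (0.5, 400.0, 300.0)
```

Rotation is left out entirely, which is exactly the sort of fiddly detail I was happy to hand over to Quartz.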

So, all that’s left to do to N-up-ify my PDF is to calculate bounding rectangles for each sub-page, pass them to CGPDFPageGetDrawingTransform, tack the result on to the current transformation matrix, and draw the page. The final product:

#!/usr/bin/env ruby
#
#  impositor.rb
#
#  Created by Jakob van Santen on 2008-03-29.
#  Copyright (c) 2008 __MyCompanyName__. Some rights reserved.
#  This code is distributed under the terms of the 
#  Creative Commons Non-Commercial Share-Alike Attribution license.

require 'osx/cocoa'
OSX.require_framework('Quartz')
include OSX

class PDFImposition
  # ways of arranging the subpages on the page
  class NUPMode
    RowsLeftToRight = 0
    ColumnsLeftToRight = 1
    RowsRightToLeft = 2
    ColumnsRightToLeft = 3
    Normal = RowsLeftToRight
  end
  attr_accessor :nup_rows, :nup_columns, :nup_mode
  def initialize(source_path,dest_path)
    @source_pdf = CGPDFDocumentCreateWithURL(CFURLCreateWithFileSystemPath(nil,source_path,KCFURLPOSIXPathStyle,0))
    @page_rect = CGPDFPageGetBoxRect(CGPDFDocumentGetPage(@source_pdf,1),KCGPDFMediaBox)
    @context = CGPDFContextCreateWithURL(CFURLCreateWithFileSystemPath(nil,dest_path,KCFURLPOSIXPathStyle,0), @page_rect, nil)
    @nup_rows = 1
    @nup_columns = 1
    @nup_mode = NUPMode::Normal
    @rotation = 0
    @imposition_map = (1..CGPDFDocumentGetNumberOfPages(@source_pdf)).to_a 
  end
  # calculate a bounding rect for page n
  def rect_for_page(n)
    row,col = position_for_page(n)
    size = scaled_page_size
    [[col*size.width,row*size.height],[size.width,size.height]]
  end
  # should the pages be rotated?
  def rotate?
    page_aspect = (@page_rect.size.width.to_f/@page_rect.size.height)
    cell_aspect = page_aspect*(@nup_rows.to_f/@nup_columns)
    (page_aspect-1)/(cell_aspect-1) < 1 # only if the aspect ratio flips
  end
  # size of each subpage
  def scaled_page_size
    full_size = [@page_rect.size.width.to_f,@page_rect.size.height.to_f]
    full_size.reverse! if rotate?
    CGSize.new(full_size[0]/@nup_columns,full_size[1]/@nup_rows)
  end
  # position of each subpage in the grid (row index, column index as measured from the origin)
  # multiplying this by the page size yields the bounding rect for the page
  def position_for_page(n)
    index = n-1
    position = case @nup_mode
      when NUPMode::RowsLeftToRight
        [@nup_rows-((index/@nup_columns) % @nup_rows) - 1,index % @nup_columns]
      when NUPMode::ColumnsLeftToRight
        [@nup_rows - (index % @nup_rows) - 1,(index/@nup_rows) % @nup_columns]
      when NUPMode::RowsRightToLeft
        [@nup_rows-((index/@nup_columns) % @nup_rows) - 1,@nup_columns - (index % @nup_columns) - 1]
      when NUPMode::ColumnsRightToLeft
        [@nup_rows - (index % @nup_rows) - 1,@nup_columns - ((index/@nup_rows) % @nup_columns) - 1]
    end
    position
  end
  # override this method to provide an imposition scheme
  def imposition_map
    (1..CGPDFDocumentGetNumberOfPages(@source_pdf)).to_a.collect {|p| [p,0]}
  end
  def run
    per_page = @nup_rows*@nup_columns
    page_counter = 0

    imposition_map.each_with_index do |map_entry,index|

      page_no,angle = *map_entry

      if page_counter == 0 # start of page
        CGContextBeginPage(@context, @page_rect)
      end

      unless page_no.nil? # page_no = nil results in a blank page
        CGContextSaveGState(@context)
        page = CGPDFDocumentGetPage(@source_pdf,page_no)
        CGContextConcatCTM(@context,CGPDFPageGetDrawingTransform(page,KCGPDFMediaBox,rect_for_page(index+1),(rotate? ? -90 : 0)+angle,true))
        CGContextDrawPDFPage(@context, page)
        CGContextRestoreGState(@context)
      end
      # uncomment to draw a border
      # CGContextStrokeRectWithWidth(@context,rect_for_page(index+1),2.0)
      page_counter += 1
      if page_counter == per_page # end of a page
        CGContextEndPage(@context)
        page_counter = 0
      end
    end

    if page_counter != 0 # the last page is only partly filled and still open
      CGContextEndPage(@context)
    end
  end
end

# an example subclass for creating A6 (landscape) booklets
class Invite < PDFImposition
  def initialize(*args)
    super
    @nup_rows = 2
    @nup_columns = 2
    @nup_mode = NUPMode::RowsLeftToRight
  end
  def rect_for_page(n)
    rect = super
    # pad each subpage by 12 points
    # this could be modified to account for ``creep'' in thick signatures
    [rect[0].collect {|p| p + 12},rect[1].collect {|p| p - 24}]
  end
  def imposition_map
    per_page = @nup_rows*@nup_columns

    pages = (1..CGPDFDocumentGetNumberOfPages(@source_pdf)).to_a
    # if the page count is not a multiple of per_page, pad it out with nils
    pages << nil until pages.size % per_page == 0

    imap = []
    until pages.empty?
    # recto
     imap += [pages.delete_at(-1),pages.delete_at(0),pages.delete_at(-2),pages.delete_at(1)].collect {|p| [p,0]}
     break if pages.empty?
     # verso (upside-down in long-edge duplex printing)
     imap += [pages.delete_at(-2),pages.delete_at(1),pages.delete_at(-1),pages.delete_at(0)].collect {|p| [p,180]}
    end
    imap
  end
end

imp = Invite.new("/Users/superjakob/Desktop/einladung.pdf","/Users/superjakob/Desktop/ruby_cfout.pdf")
imp.run

In the end, I spent way too much time implementing something that could have been done by hand. Still, I now have a useable framework for implementing arbitrary imposition schemes. One could write a script that uses a subclass of PDFImposition (along with some extra housekeeping like deleting the spool file) and install it in the PDF menu of the print dialog. Come to think of it, that’s kind of what CocoaBooklet does. But does it come with source code?

The take-away

  • Your mother was right. You spend 90% of your time on the last 10% of functionality.
  • The OS X drawing API is pretty neat, once you drop down to an appropriate level. Never use NSViews when you have no intention of drawing to the screen.
  • This is almost taken for granted these days, but PDF support in OS X? bella!
  • Apple done right with the BridgeSupport project. Making the Application Kit available to unskilled monkeys like me is a Good Thing. I think.