mtnwestruby: Black-boxing with Ruby

Mountain West Ruby Conference: SBN
Tag: mtnwestruby
16 March 2007

James Britt – Black-boxing with Ruby

Reinventing the wheel is overrated. Why not reuse what other people have already implemented, even from other languages?

Trac: A good project tracking system. Requires too much use of the mouse.
What if: I could use Trac from the command line? Tracula.

Tracula is written in Ruby and uses Hpricot and Mechanize to pretend to be a web browser. Unfortunately, Tracula is brittle because the web page UI tends to change; the HTML is not an API that Trac publishes. Many features are missing from Tracula, but that’s OK: he writes functionality as he needs it and stubs out missing behavior with hardcoded values.

Tracula acts as a proxy, so code that uses Tracula doesn’t break just because Trac changes.
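
As a rough illustration of the proxy idea (not Tracula's actual code), here is a minimal sketch. The names TracProxy and CannedBackend, the method names, and the `<h2 class="summary">` markup are all assumptions made up for this example. Callers see a stable hash, page-layout knowledge lives in one method, and an unimplemented feature is stubbed with a hardcoded value.

```ruby
# Hypothetical sketch: TracProxy, its methods, and the HTML markup below
# are illustrative assumptions, not Tracula's real interface.
class TracProxy
  # backend: any object that responds to #fetch_ticket_html(id) and returns HTML.
  def initialize(backend)
    @backend = backend
  end

  # Callers depend on this hash shape, not on Trac's markup.
  def ticket(id)
    html = @backend.fetch_ticket_html(id)
    { id: id, summary: extract_summary(html), milestone: milestone_for(id) }
  end

  private

  # The only place that knows the page layout. If the HTML changes,
  # only this method needs fixing.
  def extract_summary(html)
    match = html.match(%r{<h2 class="summary">(.*?)</h2>}m)
    match ? match[1].strip : nil
  end

  # Not implemented yet: stubbed with a hardcoded value until it is needed.
  def milestone_for(_id)
    "unknown"
  end
end

# Stand-in backend that serves canned HTML instead of hitting a real site.
class CannedBackend
  def fetch_ticket_html(_id)
    %(<html><body><h2 class="summary"> Fix login bug </h2></body></html>)
  end
end

proxy = TracProxy.new(CannedBackend.new)
ticket = proxy.ticket(42)
puts ticket[:summary]
```

A real backend would fetch the page with an HTTP library such as Mechanize; swapping it in should not require changes to code that calls `ticket`.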

WordPress: A best-of-breed weblog system.
What if: I could have the great comment system for the Django Book?
www.djangobook.com. Uses Yahoo UI and Yahoo.EXT.
Comet is written in Ruby. Comments made by users are routed via Apache mod_proxy through Comet, which takes each comment and posts it to WordPress.

Lessons learned:

  • WordPress plugins are the way to go.
  • Repurposing other people’s code can bite you hard.
  • Most web sites offer APIs, including Trac. Use them, but still use a proxy layer so your software is insulated from changes in the API.
  • Proxies can add exception handling that is missing from APIs.
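
The last point might look something like the sketch below. SafeClient, ServiceError, and FlakyApi are hypothetical names invented for illustration; the wrapped object stands in for whatever API client you are using. The proxy retries a failed call a few times and then raises a single, predictable error type.

```ruby
require "timeout"

# Hypothetical names throughout: SafeClient, ServiceError, and FlakyApi
# are illustrative and not part of any real library.
class ServiceError < StandardError; end

class SafeClient
  def initialize(api, retries: 2)
    @api = api
    @retries = retries
  end

  # Forwards the call to the wrapped API, retries on low-level failures,
  # and translates them into one error type the caller can rescue.
  def call(method_name, *args)
    attempts = 0
    begin
      attempts += 1
      @api.public_send(method_name, *args)
    rescue IOError, SystemCallError, Timeout::Error => e
      retry if attempts <= @retries
      raise ServiceError, "#{method_name} failed after #{attempts} attempts: #{e.message}"
    end
  end
end

# Stand-in API that fails a fixed number of times before succeeding.
class FlakyApi
  def initialize(failures)
    @failures = failures
  end

  def ticket_count
    if @failures > 0
      @failures -= 1
      raise IOError, "connection reset"
    end
    7
  end
end

client = SafeClient.new(FlakyApi.new(1))
count = client.call(:ticket_count)
puts count
```

With the default of two retries, a call is attempted at most three times before ServiceError is raised.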

Techniques:

  • Screen scraping
  • DOM munging
  • Proxies, proxies, proxies!
  • Save off a copy of web pages while you are developing your HTML screen scraper, and hit those saved pages so that you don’t run up someone’s bandwidth bill.
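
One way to follow the last tip is a small on-disk page cache, sketched here with only the Ruby standard library. PageCache and the example URL are hypothetical. The fetcher is injectable so that the example (and your tests) can run without touching the network; the default fetcher uses Net::HTTP.

```ruby
require "digest"
require "fileutils"
require "net/http"
require "tmpdir"
require "uri"

# Hypothetical helper: fetches each URL once, then serves the saved copy.
class PageCache
  def initialize(dir, fetcher: ->(url) { Net::HTTP.get(URI(url)) })
    @dir = dir
    @fetcher = fetcher
    FileUtils.mkdir_p(@dir)
  end

  # Returns the saved page if one exists; otherwise fetches and saves it.
  def get(url)
    path = File.join(@dir, Digest::SHA256.hexdigest(url) + ".html")
    return File.read(path) if File.exist?(path)

    body = @fetcher.call(url)
    File.write(path, body)
    body
  end
end

# Fake fetcher that counts how often the "network" is hit.
calls = 0
fake_fetcher = lambda do |_url|
  calls += 1
  "<html><title>Ticket list</title></html>"
end

cache = PageCache.new(Dir.mktmpdir, fetcher: fake_fetcher)
first = cache.get("http://example.com/trac/report/1")
second = cache.get("http://example.com/trac/report/1")
puts calls
```

Point your scraper at `cache.get(url)` while developing, and delete the cache directory when you want fresh pages.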