I saw that [Git 1.7.0](http://git-scm.com/) has been released. For me, the most interesting feature is “sparse checkouts”, a feature that I use frequently in Subversion.
Working around patent threats
Andrew Tridgell, author of Samba, says the best way to defend against patents in open source software is to (1) learn how to read patents and (2) learn how to rigorously work around them.
* [http://lwn.net/Articles/370615/](http://lwn.net/Articles/370615/)
Open Source: Freedom from Anti-features
It’s good to remember that one benefit of open source software is freedom from anti-features. The wiki (the second link) has examples of anti-features. For example, I wasn’t aware of the Vista anti-feature that slows down network connections whenever it detects sound playing.
* [http://lwn.net/Articles/370615/](http://lwn.net/Articles/370615/)
* [http://wiki.mako.cc/Antifeatures](http://wiki.mako.cc/Antifeatures)
f-spot and sqlite
I recently tried using Linux [f-spot](http://f-spot.org/), with the intent to make it easier to browse, manipulate, manage and publish my photos. I wanted f-spot to manage my photo screen saver as well. f-spot seems to be good at importing photos, but getting photos removed is a bit more difficult.
I organize my photos by date in a directory structure such as “2010/2010.01.01 New Years Day”. The “2010” directory contains several subdirectories, each named with a date and a description. If, for some reason, I import photos into f-spot that I don’t want in its database, I know which directory the photos belong to. Unfortunately, f-spot doesn’t allow me to remove photos from its catalog by filename or file path. That’s okay though, because it stores its database using SQLite.
I figured this out by running `lsof -p pid-of-f-spot` and noticing an open file descriptor for “/home/jared/.config/f-spot/photos.db”. Then I ran `file ~/.config/f-spot/photos.db`, and it helpfully told me that it is a “[SQLite](http://www.sqlite.org/) 3.x database”.
After a bit of Google research, I figured out I could install a SQLite manager on my Fedora system with `yum install -y sqliteman`, then run `sqliteman ~/.config/f-spot/photos.db`. I was expecting a command-line client, but to my surprise, I found a pleasant graphical interface. It was simple to browse the table schema and to run queries to update and reshape the f-spot photo database. Note: I’d recommend making a backup copy of the database before altering it.
f-spot may not be everything I want it to be, but I managed to work past its limitations because it uses a well-known, open data storage format.
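The kind of cleanup query described above can also be scripted. Here is a minimal sketch using Python's built-in `sqlite3` module. The `photos` table, its `uri` column, and the example paths are assumptions made for illustration; inspect the real schema of `photos.db` (and work on a backup copy) before trying anything like this on an actual f-spot database.

```python
import sqlite3


def remove_photos_under(conn, directory):
    """Delete every row whose uri lies under `directory`; return the row count.

    Assumes a hypothetical table `photos` with a text column `uri`.
    """
    prefix = directory.rstrip("/") + "/"
    # Compare the leading characters of uri with the prefix. This avoids
    # LIKE, whose wildcards (% and _) could match unintended paths.
    cur = conn.execute(
        "DELETE FROM photos WHERE substr(uri, 1, ?) = ?",
        (len(prefix), prefix),
    )
    conn.commit()
    return cur.rowcount


# Demonstration against an in-memory database, not a real f-spot catalog.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, uri TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO photos (uri) VALUES (?)",
    [
        ("/home/jared/Photos/2010/2010.01.01 New Years Day/a.jpg",),
        ("/home/jared/Photos/2010/2010.01.01 New Years Day/b.jpg",),
        ("/home/jared/Photos/2010/2010.02.14 Dinner/c.jpg",),
    ],
)

removed = remove_photos_under(
    conn, "/home/jared/Photos/2010/2010.01.01 New Years Day"
)
remaining = [row[0] for row in conn.execute("SELECT uri FROM photos ORDER BY id")]
print(removed, remaining)
```

Matching on a directory prefix mirrors how the photos are organized on disk, so one call removes everything that was imported from a single dated folder.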
Minimizing tracing/instrumentation overhead, injectso
Reading these articles from lwn.net, [Minimizing instrumentation impacts](http://lwn.net/Articles/365833/) and [Debugging the Kernel using Ftrace](http://lwn.net/Articles/365835/), reminded me of [Microsoft detours](http://research.microsoft.com/en-us/projects/detours/) and [Linux injectso](http://c-skills.blogspot.com/) (updated to work with current glibc and kernels).
Modern bug trackers
Five years ago, I started a new job and encountered the [JIRA](http://www.atlassian.com/software/jira/) bug tracking system, after having been subject to pathetic bug tracking systems at previous companies. JIRA knocked their socks off in terms of ease-of-use and multi-platform support (it runs in a web browser). I’ve been a pleased JIRA user ever since. Recently, I stumbled on this article about what’s new in some of the best quality bug tracking systems on the market.
> Bug (issue) tracking systems have become a standard tool for any organization that develops software and have evolved greatly in the last years. InfoQ has conducted a virtual panel with people from JIRA, FogBugz, Basecamp and MantisBT about this evolution and the future developments in this field.
The virtual panel discusses integration with IDEs, project planning, story-boarding, and social networking integration.
[Read more…](http://www.infoq.com/articles/bug-trackers)
Safety from patent threats via membership in OIN?
Here’s an article that I think is worth reading. It details how the Open Invention Network (OIN) keeps open source software safe from patent threats. It also explains patent troll companies and their financial motives. It sounds worthwhile for companies that rely on OSS to become affiliated with OIN.
[http://lwn.net/Articles/353823/](http://lwn.net/Articles/353823/)
> Bergelt described Microsoft’s patent suit against TomTom as being a part of the software giant’s “totem strategy”. By getting various companies to settle patent suits over particular patents, Microsoft can erect (virtual) totem poles in Redmond, creating a “presumption of patent relevance”. According to Bergelt, Microsoft tends to attack those who try to create parity with it in some area, which TomTom did…. But, Microsoft was surprised to find that TomTom had allies in the form of OIN and others. Originally, Microsoft had asked for an “astronomical” sum to settle the suit, but after TomTom joined OIN and countersued Microsoft, the settlement number became much smaller.
OIN was started by six companies: Sony, IBM, NEC, Red Hat, Philips, and Novell.
Best technologies and productivity
I tend to wonder about the “best” technologies for a given problem. Recently, I’ve wondered why Wicket is reportedly better than JavaServer Faces (though I’m using neither). Perhaps it’s human nature to look for the Next Big Thing or for silver bullet solutions that supposedly increase productivity while offering robust features.
Here’s a [blog post](http://www.jroller.com/kenwdelong/entry/my_framework_is_more_productive) that ponders whether a new framework or a programming language can really offer better productivity benefits over an ocean full of alternatives. The author asserts that the real time cost on a project is not in writing code, but in the following activities:
* Communication
* Understanding preexisting code
* Debugging
* Refactoring
Tools or languages that make any of those activities easier are to be coveted. Java refactoring tools outshine those available for Grails. Java is easier to read and comprehend than terse bash scripting. Some frameworks/platforms make debugging easier than others.
Using rsync with SELinux
Last week, I needed to move /home from one Fedora computer to another, and I used rsync over ssh to move the data.
On the new system, I noticed that procmail didn’t seem to be working, and neither did Dovecot. Nor could Apache serve up my files. This had all been working on my previous Fedora system, which was running SELinux, as was my new system. What had happened?
I hadn’t told rsync to bring across the SELinux file contexts, which are stored in extended attributes. Here is the rsync option I should have used:
`-X, --xattrs`
I could have used ‘tar’ to move my home directory as well. In that case, I would have needed one of the following options: `--selinux` or `--xattrs`
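To make the rsync case concrete, here is a small sketch that assembles the command line with the extended-attributes option included. The host name `newhost` and the paths are hypothetical, and the helper only builds and prints the command rather than running it. Restoring SELinux contexts this way typically also requires running as root on the receiving side.

```python
import shlex


def rsync_command(source, destination, preserve_xattrs=True):
    """Build an rsync-over-ssh argument list (illustrative helper)."""
    args = ["rsync", "-a"]  # -a: archive mode (permissions, times, links)
    if preserve_xattrs:
        # -X / --xattrs carries extended attributes, which is where
        # SELinux file contexts are stored.
        args.append("-X")
    args += ["-e", "ssh", source, destination]
    return args


# "newhost" is a placeholder for the destination machine.
cmd = rsync_command("/home/", "newhost:/home/")
print(shlex.join(cmd))
```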
I resolved my SELinux issues using the excellent [SETroubleShoot](https://fedorahosted.org/setroubleshoot/), which explained what commands to run to restore the proper SELinux contexts on various files.
SELinux requires time to tune, and I use it because it enhances the security of my Linux system, which serves up content over HTTP (Apache), IMAP (Dovecot) and CIFS (Samba).
XML for documents, not for large data streams
I like XML, and I hate XML. XML is great because robust parsers already exist for nearly every programming language, thus saving work for programmers and reducing bugs. XML stinks because it’s not always the right tool for the job — it’s ugly, and it’s bulky. So when I read Michael E. Driscoll’s [comparison of documents (including XML) to trees and data to streams](http://dataspora.com/blog/the-rise-of-the-data-web/), it struck a chord with me:
> Trees are rooted and finite: you can’t chop up a tree and easily put it back together again. Streams can be split, sampled, and filtered. The divisibility of data streams lends itself to parallelism in a way that document trees do not. The stream paradigm conceives of data as extending infinitely forward in time. The Twitter data stream has no end: it ought have no end tag. Conceiving of data as streams moves us out of the realm of static objects and into the realm of signal processing.
He also [explains why XML shouldn’t be used for large data streams](http://dataspora.com/blog/xml-and-big-data/):
> XML is a poor language for data because it solves the wrong problems — those of documents — while leaving many of data’s unique issues unaddressed. But many promising alternatives exist — microformats like JSON, Thrift, and even SQLite’s file format.
I wouldn’t have thought of using SQLite’s file format — it has become somewhat ubiquitous. I admire Google ProtocolBuffers and Apache Thrift for offering open source, multi-language binary encoding for data. Now programmers won’t be as likely to reinvent the wheel, and they can rely on robust libraries.
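The tree-versus-stream distinction is easy to see in a few lines. The sketch below is only an illustration with made-up records: it uses newline-delimited JSON as one possible stream format (my choice for the example, not something the quoted posts prescribe) and compares it with a small XML document that has lost its end tag.

```python
import json
import xml.etree.ElementTree as ET


def iter_records(lines):
    """Yield one parsed record per non-empty line of a JSON stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# A stream: every line is a complete record, so any slice is still valid.
stream = [
    '{"user": "a", "length": 10}',
    '{"user": "b", "length": 140}',
    '{"user": "c", "length": 75}',
]

# Filtering works record by record, without loading the whole stream.
long_posts = [r["user"] for r in iter_records(stream) if r["length"] > 50]

# Splitting works too: the first two lines parse on their own.
first_chunk = list(iter_records(stream[:2]))

# A tree: drop the closing root tag and the document no longer parses.
doc = "<posts><post user='a'/><post user='b'/></posts>"
truncated = doc[: doc.index("</posts>")]
try:
    ET.fromstring(truncated)
    truncated_parses = True
except ET.ParseError:
    truncated_parses = False

print(long_posts, len(first_chunk), truncated_parses)
```

Because each line stands alone, chunks of the stream could be handed to separate workers, which is the parallelism argument in the first quote.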