Git Book, yap

The Pragmatic Bookshelf is releasing a [book on using Git](http://www.pragprog.com/titles/tsgit/pragmatic-version-control-using-git) for version control.

Steven Walter released a new command-line front end for Git called [yap](http://lwn.net/Articles/297285/). It’s supposed to make it easier to work not only with Git but also with Subversion repositories. It’s available from [http://repo.or.cz/w/yap.git](http://repo.or.cz/w/yap.git).

MySQL or PostgreSQL?

I’ve often wondered why people seem to prefer either MySQL or PostgreSQL. For the most part, I think it comes down to the following:

* Familiarity.
* Friends (a.k.a. support system) being more familiar with one over the other.
* Ease of getting started. Most web hosting providers support MySQL out of the box.
* Name recognition.
* Ease of support.

Here are some resources that could be useful for learning the pros and cons of each database:

* [What MySQL can learn from PostgreSQL](http://www.scribd.com/doc/2575733/The-future-of-MySQL-The-Project)
* [What can PostgreSQL learn from MySQL](http://www.postgresonline.com/journal/index.php?/archives/48-What-can-PostgreSQL-learn-from-MySQL.html) and the [accompanying presentation](http://www.commandprompt.com/files/mysql_learn.pdf)
* [MySQL quirks and limitations](http://use.perl.org/~Smylers/journal/34246)
* [Why PostgreSQL?](http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007)

Effective forms of communication

Have you ever wondered which forms of communication are the most and least effective for software engineers? See Scott Ambler’s [“Models of Communication” diagram in his essay](http://www.agilemodeling.com/essays/communication.htm). Face-to-face is the most effective and paper the least, with email, telephone, and video conferencing falling in between.

REST versus RPC

Have you considered the merits and applicability of RESTful web apps? Here are a few notes I’ve made.

There was quite a [discussion about RPC, REST, and message queuing](http://steve.vinoski.net/blog/2008/07/13/protocol-buffers-leaky-rpc). They are not the same thing: each suits a different scenario, and all are used in building distributed systems.

Wikipedia’s [explanation of REST](http://en.wikipedia.org/wiki/Representational_State_Transfer) is quite informative, especially its [examples](http://en.wikipedia.org/wiki/Representational_State_Transfer#Example) of RPC versus REST.

The poster “soabloke” says RPC “Promotes tightly coupled systems which are difficult to
scale and maintain. Other abstractions have been more successful in building
distributed systems. One such abstraction is message queueing where systems
communicate with each other by passing messages through a distributed queue.
REST is another completely different abstraction based around the concept of a
‘Resource’. Message queuing can be used to simulate RPC-type calls
(request/reply) and REST might commonly use a request/reply protocol (HTTP) but
they are fundamentally different from RPC as most people conceive it.”

The [REST FAQ](http://rest.blueoxen.net/cgi-bin/wiki.pl?RestFaq) says, “Most applications that self-identify as using ‘RPC’ do not conform to the REST. In particular,
most use a single URL to represent the end-point (dispatch point) instead of using a multitude of
URLs representing every interesting data object. Then they hide their data objects behind method
calls and parameters, making them unavailable to applications built of the Web. REST-based
services give addresses to every useful data object and use the resources themselves as the
targets for method calls (typically using HTTP methods)… REST is incompatible with
‘end-point’ RPC. Either you address data objects (REST) or you don’t.”

RPC: Remote Procedure Call assumes that people agree on which procedures they want to call. RPC is about the algorithms and code that operate on data, rather than about the data itself. It is usually fast and usually binary-encoded. It is fine for software designed and consumed by a single vendor.

REST: All data is addressed using URLs and encoded using a standard MIME type. Data that is made up of other data simply has URLs pointing to that other data. REST assumes that people won’t agree on what they want to do with the data, so it lets them fetch the data and act on it independently, without agreeing on procedures.
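To make the contrast concrete, here is a minimal Python sketch of how the same lookup might be addressed in each style. Nothing here touches the network; it only builds request descriptions. The host `example.com`, the `/rpc` and `/users/` paths, the `getUser` procedure name, and the `Request` type are all made up for illustration and do not come from any real service or library.

```python
import json
from dataclasses import dataclass

BASE = "http://example.com"  # hypothetical host, for illustration only


@dataclass
class Request:
    """A plain description of an HTTP request (nothing is actually sent)."""
    method: str
    url: str
    body: str = ""


def rpc_get_user(user_id):
    """RPC style: a single end-point URL; the procedure name and its
    arguments are hidden inside the request body."""
    payload = json.dumps({"method": "getUser", "params": [user_id]})
    return Request("POST", BASE + "/rpc", payload)


def rest_get_user(user_id):
    """REST style: the user is a resource with its own URL, and the
    HTTP method itself says what to do with it."""
    return Request("GET", f"{BASE}/users/{user_id}")


def rest_user_document(user_id, friend_ids):
    """Data made up of other data points to that data by URL."""
    return {
        "self": f"{BASE}/users/{user_id}",
        "friends": [f"{BASE}/users/{fid}" for fid in friend_ids],
    }
```

In the RPC version every call goes to the same URL and both sides must agree on `getUser`; in the REST version the only agreement needed is the URL and the representation that comes back.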

Kodak Printers are Flawed

I’ve previously mentioned that I bought a Kodak all-in-one printer. When it prints, it prints beautifully. Ours didn’t print well for months, and I finally got around to calling their tech support, which was excellent. They sent me a new print head, which resolved my problem. Unfortunately, I had to stay on the line with tech support for over an hour, and I spilled permanent ink on my pants (it doesn’t wash out like they said it would).

Apparently, Kodak printers have a fundamental flaw in the design of the print head, and I’ll need a new print head about once per year — see the comments on [Kodak’s blog](http://cathieburke.pluggedin.kodak.com/default.asp?item=488521). Here’s one of them:

> Posted By: ThriftyTechie (7/30/2008)
>
> Before I tell you about my problem, I must say that the Kodak AiO printer phone support is excellent. I’ve had several experiences with your phone support (unfortunately) and every person has been helpful. Kudos. The bad news. It looks like that you have a couple of fundamental engineering flaws in your printer. 1. The non-disposable print head is just not durable enough. 2. The ink cartridges are too small. More frequent ink swaps are a) annoying for the consumer and b) can not possibly good for dependability, durability of the machine. My 5100 printhead failed completely after about 14 months after several months of sub par smeared and greyed-out printing. Lots of ink cartridges wasted on calibrating and testing. I was high on the product’s claims (i.e., save money on ink!), but this product certainly did not live up to the hype.

My time is valuable, so rather than spend over an hour on the line with tech support, I’d rather buy a different printer.

Core dump

The American Fork City sewage and composting plant is not far from the office where I work, and when the wind blows in this direction, we can smell the human output of an entire city.

It’s not usually a problem, and when it is, we don’t smell it from inside the office. Today is an overpowering exception, and it makes my stomach churn. It’s never been this bad before.

Perl one-liners for email analysis

I thought it’d be interesting to know what times of day people were most likely to send me email. My email is stored in mbox format (I used Thunderbird and mutt for email), so I wrote a Perl one-liner to analyze it for me.

The first one-liner prints a histogram, in 80 columns, of activity per hour of the day. The second prints it in a form suitable for import into a spreadsheet.

Histogram:

    perl -nle '$sum[$1]++ if m/^Date: .* (\d\d):\d\d:\d\d/; END {foreach (@sum) { $max = $_ if $_ > $max }; $div = $max/80; foreach (@sum) { print $i++ . " " . ("#" x ($_ / $div)) . " ($_)";}}' /path/to/Inbox

0 #################################### (115)
1 ########################## (84)
2 ################### (62)
3 ################ (54)
4 ############ (40)
5 ######### (31)
6 ####### (23)
7 ######################## (79)
8 ####################################### (126)
9 ############################################### (152)
10 ######################################### (133)
11 ###################################### (124)
12 ############################################################### (202)
13 ############################################################## (200)
14 ############################################################ (192)
15 #################################################################### (218)
16 ######################################################################## (229)
17 ################################################################ (206)
18 ################################################## (160)
19 ############################### (101)
20 ##################################### (118)
21 ######################################## (129)
22 ######################################################### (183)
23 ######################################## (129)

Tabular data:

    perl -nle '$sum[$1] += 1 if m/^Date: \w{3}, \d+ \w{3} \d{4} (\d\d):\d\d:\d\d/; END {foreach (@sum) { print $i++ . "\t" . $_;} }' /path/to/Inbox
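For anyone who finds the one-liners hard to read, here is a rough Python sketch of the same hourly count. It is not a translation I have benchmarked against the Perl, just the same idea spelled out: match `Date:` lines, pull out the hour, and scale the bars to a fixed width. The function names and the `width` parameter are my own choices for this sketch.

```python
import re

# Same pattern as the histogram one-liner: the hour is the first field
# of the last " HH:MM:SS" group on a Date: line.
DATE_HOUR = re.compile(r"^Date: .* (\d\d):\d\d:\d\d")


def count_by_hour(lines):
    """Return a list of 24 counts, one per hour of the day."""
    counts = [0] * 24
    for line in lines:
        match = DATE_HOUR.match(line)
        if match:
            hour = int(match.group(1))
            if hour < 24:  # ignore malformed hours instead of crashing
                counts[hour] += 1
    return counts


def histogram(counts, width=80):
    """Render one 'hour bar (count)' line per hour, scaled so the
    busiest hour gets `width` '#' characters."""
    peak = max(counts) or 1  # avoid dividing by zero on an empty mailbox
    return [
        f"{hour} {'#' * (n * width // peak)} ({n})"
        for hour, n in enumerate(counts)
    ]
```

To run it on a mailbox, open the file with something like `open("/path/to/Inbox", errors="replace")`, pass the file object to `count_by_hour`, and print the lines returned by `histogram`. Unlike the Perl version, hours with no mail show up as `0` rather than as an empty count.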

While I was at it, I wanted to know what the most common timezone offsets were. Again, I wrote two separate one-liners. One prints a histogram, and the other doesn’t.

Histogram:

    perl -nle '$tz{$1} += 1 if m/^Date: .*([+-]\d{4})/; END {foreach (values %tz) {$max = $_ if $_ > $max }; $div = $max/80; foreach (sort(keys %tz)) { print "$_ " . ("#" x ($tz{$_}/$div)) . " ($tz{$_})"; }}' /path/to/Inbox

Non-histogram:

    perl -nle '$tz{$1} += 1 if m/^Date: .*([+-]\d{4})/; END {foreach (sort(keys %tz)) { print "$_ $tz{$_}"; }}' /path/to/Inbox

I subscribe to various email lists, and each has different characteristics. I was surprised to find that my family email box usage pattern was fairly spread out around the clock, except that it drops off significantly during dinner and during the wee hours of the morning. Evening hours are the most active.

I’ve taken the timezone one-liner and modified it to tell me the most common months of the year, or the most common days of the week for email to be sent. For all my email boxes, analyzed over the last few years, email is most active on weekdays, and drops off on weekends.

Mon ############################################################### (5630)
Tue ##################################################################### (6129)
Wed ######################################################################## (6372)
Thu ##################################################################### (6155)
Fri ############################################################ (5329)
Sat ############################## (2675)
Sun ########################## (2368)
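The modified one-liners for months and weekdays aren’t shown here, but the change amounts to swapping the regular expression. As a hedged illustration, this Python sketch counts whichever field a pattern captures. The weekday and month patterns are my guess at a reasonable form; they assume headers shaped like `Date: Tue, 2 Sep 2008 14:30:00 -0600`, the same shape the tabular one-liner above expects, and will miss headers that omit the day name.

```python
import re
from collections import Counter

# One capture group per field. Only the "tz" pattern is taken from the
# one-liners above; the other two are assumptions for this sketch.
FIELD_PATTERNS = {
    "weekday": re.compile(r"^Date: (\w{3}),"),
    "month": re.compile(r"^Date: \w{3}, \d+ (\w{3}) "),
    "tz": re.compile(r"^Date: .*([+-]\d{4})"),
}


def count_field(lines, field):
    """Count how often each captured value of `field` appears."""
    pattern = FIELD_PATTERNS[field]
    counts = Counter()
    for line in lines:
        match = pattern.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `count_field(open("/path/to/Inbox", errors="replace"), "weekday")` returns a `Counter`, and its `most_common()` method lists the busiest days first.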

I tried translating those one-liners into Ruby, but it wasn’t as compact, and doing it as a one-liner in Java just isn’t going to happen.

Perl 5 to 6

Moritz Lenz has written a series of informative blog posts about Perl 6, for Perl 5 programmers. Here’s a bit of his introduction:

> Perl 6 is underdocumented. That’s no surprise, because (apart from the specification) writing a compiler for Perl 6 seems to be much more urgent than writing documentation that targets the user.

> Unfortunately that means that it’s not easy to learn Perl 6, and that you have to have a profound interest in Perl 6 to actually find the motivation to learn it from the specification, IRC channels or from the test suite.

> This project, which I’ll preliminary call “Perl 5 to 6” (in lack of a better name) attempts to fill that gap with a series of short articles.

[Read more…](http://perlgeek.de/blog-en/perl-5-to-6/)

Google’s new web browser: Chrome

Google is [releasing](http://www.google.com/chrome) a beta web browser called “[Chrome](http://www.google.com/chrome)” tomorrow, and they’ve even got a [comic strip](http://www.google.com/googlebooks/chrome/) to explain the design choices they made, and how it’s supposed to make life better.

The browser is based on [WebKit](http://en.wikipedia.org/wiki/WebKit).
They aim to make JavaScript vastly faster with a new JavaScript virtual
machine called V8. At the same time, the Mozilla team is beefing up
Firefox 3.1 with a faster JavaScript engine called [TraceMonkey](http://www.pcmag.com/article2/0,2704,2328737,00.asp).

V8 and TraceMonkey reportedly race down the freeway while IE 7 and IE 8
are left puttering along at pedestrian speeds.