What’s happening with Version Control Systems?

I've long had an interest in version control systems (VCS), also known as source code management (SCM) systems -- beginning with RCS, SCCS and CVS. CVS was already showing its age when I started using it in 1998. When the company I worked for, Axent, was acquired by Symantec in 2000, we switched to using Perforce. At first, I thought Perforce was a step backwards from CVS. After using it heavily for a few months, it was clear that CVS and WinCVS didn't come close to the ease-of-use and features of Perforce and p4win. CVS was dreadfully slow compared to Perforce, which was lightning fast (and still is).

Perforce encourages third-party developers to develop add-ons for use with their software, which is almost as good as what you get with an active open-source project. Although Perforce is proprietary, it's about as open as any commercial project I've seen. It runs on many platforms, has conversion scripts to migrate CVS repositories to Perforce, etc. It's not cheap, unless you're working on an open-source project, in which case you can get free licenses to use it.

At some point, I heard about the Subversion project, which aimed to correct many of the deficiencies of CVS. Those were the pre-1.0 days, and it was interesting to watch the development of Subversion.

About the same time, Bitkeeper was in the news. It differed from CVS, Subversion and Perforce because it was a distributed version control system. The idea appealed to me because a developer could have version control for his/her private changes without having to check in to the main repository until they were ready. At that time, there weren't any mature open-source distributed version control systems to investigate.

I switched jobs late in 2004, and my new company was using Subversion. Overall, I have been very pleased with Subversion in day-to-day use. It's much better than CVS. We had some reliability problems with the Subversion server. It was running on Windows with the BDB database storage back-end. When it was switched to a Linux server with the FSFS back-end, it became much more reliable. My team uses TortoiseSVN -- an excellent user interface that integrates with Windows Explorer.

I've periodically kept tabs on version control systems. Many open-source variants have sprung up over the last few years: Mercurial, Bazaar-NG, Git/Cogito, Darcs, SVK, Arch and Monotone. Lately though, I haven't seen any great reviews on which ones are the most mature, or what the pros and cons of each are. So, I've done some Google research to figure it out, focusing primarily on the distributed variants.

The conclusion I've come to is that the developers of each version control system are learning from the developers of the other version control systems, and each project is improving. The Subversion developers are learning from the distributed version control developers. Recently, there was an SVN developer summit and they tried out Mercurial, which tells me that there's merit to the distributed approach.

If you're already using a modern version control system, the cost to switch may outweigh the benefit. Organizations seem to be able to cope with legacy tools like Visual SourceSafe and CVS, although better tools can make developers' lives easier.

Here's my own highly subjective comparison table. I've marked, in red, some of the things I think are noteworthy. I focused my efforts on the competitors that seem to have garnered the most community adoption. I've included one commercial system, Perforce. Each item is rated on a scale of 1 to 10, 10 being the best. (Update: There's a better table than mine at http://bazaar-vcs.org/RcsComparisons and various comparisons at Wikipedia)

Comparison of Source Code Management systems

| January 31, 2007 | Subversion | SVK | Git/Cogito | Mercurial | Bazaar-NG | Darcs | Perforce | Notes |
|---|---|---|---|---|---|---|---|---|
| Command-line name | svn | svk | git / cg | hg | bzr | darcs | p4 | |
| Cross-Platform | 10 | 9 | 6 | 10 | 10 | 9 | 10 | Windows, Linux, Mac, Solaris, etc. |
| Maturity | 9 | 6 | 8 | 7 | 5 | 8 | 10 | Maturity based on lifetime, and project flux in code |
| Maturity: GUI | 9 | 0 | 5 | 4 | 3 | 1 | 10 | |
| Disconnected/offline operation | 2 | 10 | 10 | 10 | 10 | 10 | 0 | Disconnected 1. editing of files, 2. branching, 3. merging, 4. history, etc. Especially handy when there's no network connectivity, such as when on an airplane. |
| Community Adoption | 10 | 2 | 8 | 7 | 5 | 2 | 1 | |
| Documentation Quality | 10 | 7 | 7 | 8 | 6 | 8 | 10 | |
| Storage Format: Robustness | 5 | 5 | 10 | 8 | 7 | 5 | 5 | Storage format least susceptible to corruption. |
| Storage Format: Not in flux | 1 | 1 | 10 | 8 | 1 | 1 | ? | |
| (re)Merging support | 0 | 9 | 9 | 9 | 9 | 10 | 4 | Remembers prior merges, cherry-picking, etc. |
| Repository Size | 1 | 9 | 10 | 9 | ? | ? | ? | |
| Speed | 2 | 7 | 10 | 8 | 6 | | 10 | |
| Scalability | 9 | 9 | 10 | 9 | 5 | 5 | 9 | |
| Commercial Backing | 10 | 5 | 10 | 10 | 10 | 5 | 10 | |
| Subversion Integration | 10 | 8 | 6 | 5 | 4 | 4 | ? | Tailor can be used to migrate changes between all systems |
| Totals: | 88 | 87 | 119 | 112 | 81 | 68 | 79 | |
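The totals row is a plain column sum, with unrated "?" cells counted as zero. Here is a minimal sketch for rechecking a column; sum_ratings is a hypothetical helper name of my choosing, not part of any tool mentioned here.

```shell
#!/bin/sh
# Sum a list of ratings, counting "?" (unrated) and blanks as zero.
# sum_ratings is a hypothetical helper name used only for this sketch.
sum_ratings() {
    total=0
    for rating in "$@"; do
        case "$rating" in
            ''|*[!0-9]*) ;;                      # skip "?" and anything non-numeric
            *) total=$((total + rating)) ;;
        esac
    done
    echo "$total"
}

# Git/Cogito column, top to bottom
sum_ratings 6 8 5 10 8 7 10 10 9 10 10 10 10 6     # prints 119
# Bazaar-NG column, which has one unrated "?" cell
sum_ratings 10 5 3 10 5 6 7 1 9 '?' 6 5 10 4       # prints 81
```

Treating "?" as zero penalizes systems I couldn't rate, so the totals are only a rough guide.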

If I were to pick a VCS today, it would probably be Git, followed by Mercurial. What follows are my unpolished notes and ideas.

Git/Cogito

Git is very scalable, and is the fastest open-source version control system available. Git has a wide community of professional engineers supporting it, and it has a bright future. There are graphical user interfaces available for Git such as gitk and qgit, although none of them are as mature as the user interfaces available for Subversion. Cogito is the easy-to-use command-line wrapper around git. See also the Cogito Wiki. According to Keith Packard of xorg fame, Git has the most robust/reliable repository storage format. Advantages of git and all distributed VCSes include 1. offline repository access, 2. private branches, 3. distributed backups including change history.
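To make the "private branches" and "offline" points concrete, here is a rough sketch using plain git commands (not Cogito). The temporary directory, the file name notes.txt, the branch name private-work and the demo identity are all made up for illustration. Nothing here touches a network.

```shell
#!/bin/sh
# Sketch: commit privately on a topic branch while offline, then merge
# the work back into the mainline when it is ready.
# All names below (notes.txt, private-work, Demo User) are hypothetical.
private_branch_demo() (
    set -e
    command -v git >/dev/null 2>&1 || { echo "git not installed" >&2; exit 0; }

    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git config user.name "Demo User"
    git config user.email "demo@example.com"

    # First commit on the mainline
    echo "first draft" > notes.txt
    git add notes.txt
    git commit -q -m "Initial commit"
    mainline=$(git rev-parse --abbrev-ref HEAD)

    # Private work: committed locally, visible to nobody else yet
    git checkout -q -b private-work
    echo "experimental change" >> notes.txt
    git commit -q -a -m "Private change, not yet shared"

    # When ready, merge the private branch back into the mainline
    git checkout -q "$mainline"
    git merge -q private-work

    # Count the commits now on the mainline
    git log --oneline | wc -l | tr -d ' '

    cd /
    rm -rf "$repo"
)

private_branch_demo    # prints 2 when git is installed
```

In a real project, the last step would be a push or pull between repositories rather than a local merge, but the private commits work the same way.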

For those wishing to use Git/Cogito on Windows, use Cygwin and select the git and/or cogito packages and read the information at http://git.or.cz/gitwiki/WindowsInstall. For those organizations wishing for excellent Windows Explorer integration, use git-cvsserver in combination with TortoiseCVS.

To install git and cogito on Fedora, run the following as root:
  yum install git cogito qgit

I've reluctantly decided that Git isn't as mature as Subversion, which shouldn't be surprising because Subversion has been around for longer. Git isn't the right fit for all projects. Git was designed for monolithic code bases, not for modular code bases, although work is in progress to allow it to support sub projects (similar to svn:externals). "Such flexibility is an implicit feature of centralized SCMs, but is much more difficult to implement in a distributed system like git. As a result, git currently lacks built-in subproject support, although gitweb does have a notion of subprojects."

There's a document that describes Common Mistakes made when using Git. Unfortunately, most of it isn't written yet -- there's only a loose outline.

Tutorials:

Tools -- See http://git.or.cz/gitwiki/InterfacesFrontendsAndTools

Mercurial

The OpenSolaris project decided between Bazaar-NG, Git and Mercurial. Mercurial was chosen primarily because 1. it was fast (although Git is faster), 2. the Mercurial developers were very responsive to the OpenSolaris developers, 3. OpenSolaris developers felt like they could hack Python code, and 4. the repository format works well with ZFS & NetApp filesystem snapshotting. Their evaluation of Git is here, and it looks like the listed downsides are now out-of-date or superficial. The Mozilla project had a "version control shootout", and although they haven't yet made a decision, Mercurial and Bazaar-NG sounded the best to them.

The following has diagrams to illustrate distributed merging:
http://www.selenic.com/mercurial/wiki/index.cgi/UnderstandingMercurial

Mercurial is more mature than Bazaar-NG, and Mercurial is faster:
http://sayspy.blogspot.com/2006/11/bazaar-vs-mercurial-unscientific.html

"Technologically, centralized systems are a single point of failure-- any problems with the central server are problems for all people using it." -- http://bazaar-vcs.org/WhyUseBzr

Mercurial supports access control, email notify, line-ending conversion, etc.:
http://www.selenic.com/mercurial/wiki/index.cgi/UsingExtensions

SVK

SVK is built on top of Subversion, so it should, in theory, integrate well with an existing Subversion repository, allowing developers to use a distributed tool even if the master server remains a Subversion server. Community adoption is high enough to have some confidence in the future of the project, although adoption isn't nearly as high as with Git, Mercurial or Bazaar-NG.

It used to be difficult to install, but you can now get a prebuilt installer for Windows and probably for Linux as well. Working copies (sandboxes) have no extra meta data (no .svn directory which interfere with find, etc.) The repository format is significantly smaller than with Subversion. I've found that SVK is much faster than Subversion, although I haven't used it much. There is not yet a graphical user interface -- a must for many organizations/communities.

The good, the bad and the ugly about SVK (Sept 2006): http://kitenet.net/~joey/blog/entry/svk.html

Darcs

Users of darcs, including myself, appreciate its simplicity and ease-of-use (note: Cogito, Mercurial and Bazaar-NG are also easy to use). Downsides of darcs are that 1. it is implemented in Haskell, which limits the contributing developer community (perhaps it will inspire people to learn Haskell), 2. it depends on having Haskell libraries installed and 3. there's no graphical user interface, unless you consider darcsweb. Still, I like darcs, and I use it on my home Linux box. Like Perforce and SVK, darcs doesn't clutter up every directory with metadata (it keeps a single _darcs directory at the top of the working copy). It used to be that Darcs wasn't very scalable, but I hear that it's become much more scalable as of mid-2006. I've read that Mercurial and Darcs feel somewhat similar in their command-line user interface.

Mirroring Subversion with Darcs and Tailor (Sept 2006): http://fiatdev.com/articles/2006/09/10/mirroring-subversion-with-darcs-and-tailor

Subversion

Subversion has a bright future, I think, and we may yet see some of the advantages of distributed systems appear. For those who need merge history tracking, which makes future merges from the same branch easier, there's svnmerge.py. In a future release, Subversion will have this feature built-in.

The Subversion 1.4 release brought impressive speedups for working copy operations.

Control/Power

Changing information flow by switching from a centralized system to a distributed system will empower or disempower different sets of people. I wouldn't be surprised if one encounters resistance in switching.

In the centralized model, developers are empowered to make any change they want, which may affect everyone, without consulting others. Of course, if they abuse that power, they may lose commit access. With a distributed system, an integrator pulls in people's changes based on what and whom they trust. If you're aiming for quality code that doesn't destabilize a system, it sounds like a good approach, and it works well for Linux kernel development. Most distributed systems can also be used much like a centralized system, so that no integrator is required -- individuals can push their changes to the master repository.

HOWTO Make Windows XP unusable

A friend of mine was cleaning out what he thought was cruft from his c:\Windows\System32 directory when he deleted oembios.dat. His computer failed to boot after that, and a system restore disk didn't help. Although he could boot into a command prompt, he couldn't boot up in safe mode. He fixed the problem by copying an oembios.dat file from another computer. Read more about this here. The oembios.dat file may be related to Windows Product Activation.

Phishing Fraud in 2007

Netcraft: Phishing Attacks Continue to Grow in Sophistication
http://tinyurl.com/vwmvw

"The Year in Phishing: Phishing attacks are continually evolving, as fraudsters develop new strategies and quickly refine them in an effort to stay a step ahead of banking customers and the security community. Here are some of the phishing trends and innovations we noted in 2006"

  • Plug and Play Phishing Networks
  • Phlashing (Flash-based phishing sites)
  • Two-factor Authentication: A July attack on Citibank demonstrated a technique that was able to defeat two-factor authentication tactics using a man-in-the-middle attack.
  • Hacked Bank Sites
  • Continued XSS (cross-site-scripting) Vulnerabilities
  • MySpace Phishing

Read the article for more details. Is it safe to do online banking? I know people who say "no". If someone hacks into your bank account and commits fraud, who bears the burden of proof? You or the bank? Probably you. Who limits your liability? Not the bank. Credit card companies limit customer liability to a reasonable minimum, but with online banking, there is no such protection. If you physically visit a bank office and fraud happens, at least there are records of who did what (video camera recordings, records of which bank teller was helping with the transaction, etc.). With online banking, most of those audit records don't exist.

No-hassle online backup software

No-hassle online backup software for Windows XP: http://mozy.com and http://carbonite.com. Five dollars per month. Not bad.

I heard about these while listening to this podcast on the usability of software:

Why Software Sucks by David Platt
http://cdn.itconversations.com/ITC.TM-DavidPlatt-2007.01.02.mp3

What is the most important thing to the average computer user? They want their machine to "just work". Why does Google know how to correctly translate a United Parcel Service tracking number, while the actual UPS website requires multiple entries just to get to the point where the tracking number can be entered? Programmer David Platt is the author of "Why Software Sucks...and What You Can Do About It".

While average users are expected to use the computer as an everyday tool, programmers too often produce software that has poor functionality, especially compared to other devices used to perform other routine tasks.

One of the other major problems is that software is too often marketed to enterprises rather than individuals, and that constant updates are meant to convince companies to regularly upgrade, with little or no thought given to the end user.

The discussion is both enlightening and entertaining. While Platt believes the problem can be solved, he thinks it won't happen unless software designers change their point of view to better consider the needs of the end user.

Sony Clie Fixed

A few months ago, my Sony Clie PEG-T615C stopped hot-syncing and stopped charging. I would have backed up to a memory stick, but the slot was destroyed a couple of years ago when my then-two-year-old son tried to jam the stylus into the wrong spot. I lost some data when the battery finally gave out. I used my multimeter to check that the power supply cable was functioning. It was okay. A connection inside the Clie was probably broken.

Since then, I've been borrowing a friend's PEG-NZ90. It mostly works and runs faster, but is an ugly beast of a machine. I liked the slim, sleek form-factor of my T615C.

Tonight, I decided to open up the broken Clie and see if I could spot anything obviously wrong, but I couldn't. Still, seeing the innards was fascinating.

I was impressed at how tiny the parts were -- the ICs, the resistors, the diodes and who-knows-what-else. The miniaturization is amazing, and seeing it with my own eyes leads me to appreciate the raw power we hold in our hands. This thing is more powerful than the first Macintosh computers were just twenty years ago... or would be, if it worked.

Past experience with computer hardware has taught me that a simple cause of problems can be bad connections between computer cards and their slots, or with cables that have come loose. After twenty minutes of tinkering, I figured out how to disengage a few of the ribbon connectors, and I reengaged them. I disconnected and reconnected the battery. I tried plugging in the power connector, and the charge light came on! I was in business!

Innards of my Clie

My backup plan was to purchase a used Clie on eBay. Looks like I won't need to do that unless I want a faster device.

VMware and Upgrading to Fedora Core 6

I upgraded my desktop machine at work from Fedora Core 5 to Fedora Core 6, and since I run the free VMware Player (the free VMware Server is also a fine product), I knew I'd have to get it working after the upgrade. It could have been as simple as running 'vmware-config.pl', but it wasn't.

A known issue with Fedora Core 6 is that on many single-processor systems, the installer loads an i586 kernel instead of an i686 kernel. The workaround for this, at install boot-time, is to type "linux i686" -- except that it only works for fresh installs, not for upgrades. An i586 kernel was installed even though I wanted an i686 kernel, and it created problems when I went to configure VMware. vmware-config.pl compiles a kernel module against kernel headers. I had installed the kernel-devel package to get the kernel headers. It turns out that I had an i686 kernel-devel package, and it didn't match the i586 kernel that I didn't know I had.

Run the following command:
rpm -q --queryformat '%{ARCH} %{NAME}-%{VERSION}-%{RELEASE}\n' kernel kernel-devel

This is how I figured out that I had a mismatch. Here's what I had:
i586 kernel-2.6.18-1.2869.fc6
i686 kernel-devel-2.6.18-1.2869.fc6

Both of those should read 'i686'. Here are the commands to run (as the 'root' user) to resolve the issue:

  1. yum -y upgrade # to get the latest kernel, etc.
  2. Follow the instructions at http://fedoraproject.org/wiki/Bugs/FC6Common to switch to an i686 kernel.
    • yum -y install yum-utils
    • yumdownloader kernel.i686
    • rpm -ivh --replacefiles --replacepkgs kernel-2*.i686.rpm
  3. reboot
  4. yum -y install kernel-devel
  5. rpm -q --queryformat '%{ARCH} %{NAME}-%{VERSION}-%{RELEASE}\n' kernel kernel-devel # The architecture should be i686
  6. touch /usr/src/kernels/2.6.18-1.2869.fc6-i686/include/linux/config.h
  7. vmware-config.pl
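The architecture check in step 5 can be scripted so that a mismatch is hard to miss. This is only a sketch: check_kernel_arch is a hypothetical helper name, and it assumes its input is in the "ARCH NAME-VERSION-RELEASE" format produced by the rpm query above.

```shell
#!/bin/sh
# Read "ARCH NAME-VERSION-RELEASE" lines on stdin and report any package
# whose architecture differs from the expected one (default: i686).
# check_kernel_arch is a hypothetical helper name used for this sketch.
check_kernel_arch() {
    expected=${1:-i686}
    awk -v want="$expected" '
        NF >= 2 && $1 != want {
            printf "mismatch: %s is %s, expected %s\n", $2, $1, want
            bad = 1
        }
        END { if (!bad) print "ok"; exit (bad + 0) }
    '
}

# Sample input: the mismatch I actually had
printf 'i586 kernel-2.6.18-1.2869.fc6\ni686 kernel-devel-2.6.18-1.2869.fc6\n' |
    check_kernel_arch i686 || echo "fix the kernel architecture before running vmware-config.pl"

# On a real system, feed it the rpm query instead:
# rpm -q --queryformat '%{ARCH} %{NAME}-%{VERSION}-%{RELEASE}\n' kernel kernel-devel | check_kernel_arch i686
```

The function exits non-zero on a mismatch, so it can gate the vmware-config.pl step in a script.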

Update

I can't recommend upgrading to Fedora Core 6 from version 5. My screensaver (gnome-screensaver) wouldn't unlock -- it never even gave me the chance to enter a password. I tried switching to xscreensaver, but it wouldn't accept my password. After several fruitless Google searches for a resolution to either problem, I gave up and decided to install from scratch. Now my screensaver behaves correctly.

When I did a fresh install, it installed the xen kernel. VMware and Xen didn't play well together for me -- I got nearly 100% CPU utilization when I tried to load a guest. I installed the non-xen kernel, booted that kernel, and reconfigured vmware. Now VMware runs great. If I remember correctly, here are the commands I ran as root:

  1. yum -y install kernel
  2. reboot into a non-xen kernel
  3. touch /usr/src/kernels/2.6.18-1.2869.fc6-i686/include/linux/config.h
  4. vmware-config.pl
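The path in the touch step is tied to one specific kernel version. A small sketch for building it from the running kernel instead, assuming the kernel-devel layout seen above (/usr/src/kernels/RELEASE-ARCH); kernel_header_dir is a hypothetical helper name, and other releases or the xen kernel may lay things out differently.

```shell
#!/bin/sh
# Print the kernel-devel directory for a given kernel release and
# architecture, defaulting to the running kernel (uname -r, uname -m).
# Assumes the /usr/src/kernels/RELEASE-ARCH layout shown above;
# kernel_header_dir is a hypothetical helper name.
kernel_header_dir() {
    release=${1:-$(uname -r)}
    arch=${2:-$(uname -m)}
    echo "/usr/src/kernels/${release}-${arch}"
}

kernel_header_dir 2.6.18-1.2869.fc6 i686
# prints /usr/src/kernels/2.6.18-1.2869.fc6-i686

# As root, after booting the kernel you intend to use:
# touch "$(kernel_header_dir)/include/linux/config.h"
```

Check that the printed directory actually exists before running touch; if it doesn't, the kernel-devel package for the running kernel is probably missing.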

KVM is the future of virtualization on Linux, from what I gather, so I'm not going to try Xen.