Monday, July 20, 2009

Migration to Mercurial

Liquid Metal, VeniceImage by ms sdb via Flickr

I've taken the plunge and decided to integrate all of the Bunjalloo code onto the Google Code site. This meant I migrated from the original Github-hosted Git repository to make use of the new Mercurial support that GC added not so long ago.

The main reason I migrated was to get all the issue tracking, wiki and code changes on to one site. I really had 2 choices: migrate the main site to Github and use the issues and wiki on there, or migrate the code to Google Code and use Mercurial. I quite like Github and I think it is amazing that there are so many brilliant free hosting services, but I really prefer the Google Code interface from a user's point of view. It is generally less cluttered.

So what does this change get us? First off, we lose github's famous "social coding" features. I had one fork in 2 years, with 0 additional commits. No great loss there then. You can still email patches to the discussion list of course, and of course with either DVCS the author information is retained. I gain the fact that I can now reference changes a bit easier in bug reports and close tickets straight from commit messages. Assuming I even write any more code or fix any bugs ;-)

I had a couple of pet peeves with Mercurial, mostly that I really missed gitk's features. Luckily I discovered what I assume everyone else must use: Tortoise HG. This provides "hgtk log" on Linux that works a lot better than the "hg view" history viewer, which is a port of a really ancient version of gitk. The most important feature that hgtk has is a refresh button so I don't have to keep killing and restarting the application each time I make some changes.

Some other areas that I saw as weaknesses were to do with Mercurial's history fiddling, or lack thereof. However I decided that I'm probably better off not messing with history too much anyway. Lately I've tended to avoid doing that in Git and I get more done, even if the revision log is not as clean as it could be. I finally understood the way to do this using Mercurial Queues anyway, even if it is a bit more fiddly.

My final missing feature was the "commit -v" feature, which shows you the patch of the commit that you are making without having to open up a separate console. This hasn't been fixed, but I've worked around it by writing a Vim script to do something similar. Pressing "K" shows the diff of the current tree in a new buffer. This actually works out pretty well as I can see a patch and write a comment at the same time, rather than having to jump back to the top as I did with the "commit -v" thing.

To do the actual Git to Mercurial conversion I used the "hg convert" extension that is included with core Mercurial. That worked flawlessly and made switching really easy. The conversion guide on the Google Code support site has detailed steps on what to do when converting from Subversion, but I'll describe a few gotchas that I found with the Git to Hg transition.

The recently released Mercurial 1.3 included a tiny patch that I wrote to generate a slightly nicer log for git conversions, so I used that version. You see Git tracks both the author of a patch and the person who made the actual commit, but Mercurial only tracks the "user". The user is equivalent to Git's "committer" by default, while author information is assumed to be the same dude. When you ran hg convert, it added a line like "committer: A.Committer " to the Mercurial log message for every commit, even if the author and committer were one and the same in the original git repository. It looked a bit silly when 99.9% of the commits were my own. So my patch made sure that the "committer" line only got added if the author of the original change was not the same as the commiter of the change.

Another interesting niggle: Git has 2 different types of tag; lightweight and annotated. Annotated tags can also be gpg signed, but that wasn't the case in my repository. The difference between lightweight and annotated tags is pretty subtle. As far as I understand it, a lightweight tag is simply a reference to a commit ID, while an annotated tag also has its own blob in the Git database.

By default "git tag mytag HEAD" will create a tag of the lightweight variety. This is apparently the Wrong Thing To Do and one of the few places that Git's default behaviour is not the best option. You really should pass the "-a" option to create an annotated tag. Suffice to say that I used the default lightweight type of tags for quite a while until I discovered my mistake. The "hg convert" extension doesn't convert lightweight tags at all, it only converts the chunky annotated kind. This is possibly by design (maybe by misunderstanding?) as it would be easy to fix the convert extension to convert either type of tag.

The easiest workaround for me was to just convert my git lightweight tags to annotated tags in the source git repository using the "--force" option to overwrite the old ones. The convert process picked these up and converted them over correctly. Interestingly enough Justin Williams had posted about a similar problem and his timing was perfect to ask it over on

Now that I've used DVCS a bit more and the novelty of branching has worn off, I decided that I wanted the minimal number of heads in my new Mercurial repository. I also wanted to maintain as much of the released history as possible. Luckily the history was mostly linear. I did create a couple of branches for maintenance releases of Bunjalloo early on, but after about version 0.4 I just made releases from the trunk.

Originally the repository was in Subversion and I pulled in the tag branches with git-svn too. This lead to a few branch stubs with a single commit ("creating tag blah") with a corresponding git tag that I must have created at some point later on. I used the Mercurial Queues extension to trim these out of the history where applicable so that the final repo has just 2 heads - the main trunk and an old, closed maintenance branch from the 0.3 days.

Oh, when you install Mercurial from source on Ubuntu (possibly on any Debian derivative?) it rather inexplicably creates an /etc/mercurial/hgrc file that enables all of the extensions. This lead me to (re)discover a bug with the inotify extension when used in conjunction with Mercurial Queues. My solution was to simply disable the inotify extension (in fact just removing the /etc/mercurial directory and enabling what you need in $HOME/.hgrc is a better idea overall).

Anyway, feel free to check out the code and send me your patches to fix all of those open issues! :-)

1 comment:

  1. Anonymous5:09 pm

    Congratulations for migrating to mercurial! Great choice.

    For viewing th history of a repo, you can also try out hgview as an nice alternative to “hg view”, quicker, nicer UI, a few extra functionnalities :


Note: only a member of this blog may post a comment.