Sunday, April 26, 2009

Mercurial

Google Code is going to add Mercurial support soon; in fact it's already available for some pilot projects. I'm more familiar with Git though, so I needed a good translation guide and I found the Git HG Rosetta Stone. That isn't bad, but there are still some gaps, so I thought I'd share my experience fiddling around with Mercurial to try and pad out some of those gaps. This is a big long dull technical entry. Sorry.

The first thing I tried was to convert a small Git repository to Mercurial. The way to do this is to use the convert extension.

Now a bit of a rant about extensions here. This goes for Mercurial, but I think it applies to Bazaar too. Anyway, the issue I have is that in order to activate extensions in Mercurial, you have to add a line to a file in your home directory called .hgrc. This seems a bit counter-intuitive to me. If these extensions are available in the core, and they are installed and everything, then why do I have to activate them by changing a preferences file? It's probably to keep the core set of commands simple or something, but it adds needless complexity to the whole process.

The documentation on the conversion page mentions converting from CVS, Subversion and Darcs, but there is no mention of converting from Git. I wonder if this is because nobody would really want to convert from Git to something else? ;-) Anyway, the conversion was easy for a trivial repository, just run "hg convert /path/to/repo".

This creates a new directory repo-hg that contains your shiny new Mercurial-converted repository. The directory is initially empty, except for the .hg directory, which is the equivalent of the .git directory in a Git repository. The odd thing here is that running "hg status" on this empty directory reports no changes. That seemed a little strange compared to "git status" in the same situation, which would say "hey, you've deleted all of your files!". You have to run "hg checkout" to get all your files back in the directory, which is pretty much the same as the "git reset --hard" idiom that you'd use in the same situation.
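
For reference, here is a rough sketch of the whole conversion. It assumes the convert extension is switched on with an "hgext.convert =" line, the same form as the color line used elsewhere in this post. To stay harmless it writes the config to a scratch file, convert-demo/hgrc (a made-up path), instead of your real ~/.hgrc, and the hg commands are left as comments because they need a real repository at /path/to/repo.

```shell
# Scratch copy of the config (hypothetical path); in real use these two
# lines would be appended to ~/.hgrc.
mkdir -p convert-demo
cat > convert-demo/hgrc <<'EOF'
[extensions]
hgext.convert =
EOF

# Show what was written.
cat convert-demo/hgrc

# With the extension active, the steps from the text are:
#   hg convert /path/to/repo    # creates repo-hg in the current directory
#   cd repo-hg
#   hg checkout                 # fills the working directory with files
```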

Converting my Bunjalloo repository wasn't as easy. I've used a few branches, some of the history originally came from Subversion, and there are strange tags/branches in there. Mercurial doesn't seem to like this at all and just creates a load of branch-less "things". There's probably a good way to do this, but it would need some investigation.

After compiling and running "hg status" I had a load of unknown files, marked with things like "? build/foo.o". I tried the simplest solution: "mv .gitignore .hgignore". This almost worked, but it gave me the following strange error:

could not start inotify server: /tmp/repo-hg/.hgignore: invalid pattern (relre): *.[oa]
abort: /tmp/repo-hg/.hgignore: invalid pattern (relre): *.[oa]

By default Mercurial uses Python's regular expression syntax, not globs, and "*.[oa]" is not a valid regular expression because it starts with a "*" that has nothing to repeat. Using regexes probably makes some sense; I bet it was less effort to code. The error message about inotify is a strange red herring though. The fix was to add a "syntax: glob" line to the top of the file.
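
A minimal sketch of that fix, using a scratch directory (hgignore-demo, a made-up name) in place of the real repository. It copies the patterns instead of moving the file, and it only covers simple glob patterns like the one in the error; Git-specific features such as "!" negation are not translated.

```shell
# Scratch directory standing in for the converted repository (hypothetical name).
mkdir -p hgignore-demo

# A one-line .gitignore using the glob pattern from the error message above.
printf '%s\n' '*.[oa]' > hgignore-demo/.gitignore

# Build .hgignore: the "syntax: glob" header first, then the git patterns.
{
  echo 'syntax: glob'
  cat hgignore-demo/.gitignore
} > hgignore-demo/.hgignore

# Show the result.
cat hgignore-demo/.hgignore
```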

After this I made a change in my code to see how Mercurial handles the normal workflow. Git's diff command shows differences in colour. Well, that's not quite true - by default it doesn't, you have to set a preference by running something like "git config color.ui auto". Mercurial does colour too, but you have to edit the .hgrc file, adding "hgext.color=" to the [extensions] section.

All these properties and preferences can either go in your global $HOME/.hgrc file or in the repository-local .hg/hgrc file. There's no core "hg config" command, but there is an extension to do it. It is a second class citizen that isn't shipped with the core Mercurial and requires an extra not-core-Python library (I suspect this is why it isn't in the default Mercurial). In git you can run the core command "git config --global" to add a config option to your global file, or by default it changes the repository local configuration in the .git directory. It makes things a bit more newbie friendly than editing strange dot files. I can't imagine an ex-Subversion user being comfortable editing a file below .hg, for example. In subversion you just don't go futzing about in the .svn directories.

Once all that was done, I decided to commit the change I had made. Here hg doesn't use git's staging area. That's fine. It is something that works well, but a lot of people don't get it. Mercurial uses the classic approach to this - pass what you want to commit on the command line - which works too. What I do miss here from git is the ability to run "commit -v". With the -v option, you get shown the patch that you are committing as well as the file names. This saves running "$VCS diff" in a separate shell, which is what I always ended up doing with Subversion. A definite regression from git here.

After committing a change or two, I sometimes run gitk to see what is going on overall. Mercurial ships "hgk" which is a really ancient version of gitk tuned for Mercurial commands instead of Git ones. Unfortunately it isn't turned on by default and, you guessed it, you have to edit the .hgrc file. Not only that, but the examples given in the Mercurial wiki are not entirely correct, at least when you install from source. They all state silly paths for hgk.py - the real location is $PREFIX/lib/python$VERSION/site-packages/hgext/hgk.py, where PREFIX is usually /usr/local and VERSION is the Python version, probably 2.5. But even if you set that in your .hgrc, "hg view" doesn't work. You need to copy the file "hgk" from the source contrib directory into your path. This is quite awkward. If something ships with the core, it should be installed and Just Work (TM).
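
Roughly, the set up looks like the sketch below. PREFIX and VERSION are the assumed values mentioned above (a source install under /usr/local with Python 2.5), so adjust them for your machine. The config goes to a scratch file, hgk-demo/hgrc (a made-up path), and not to your real ~/.hgrc.

```shell
# Assumed values from the text: a source install under /usr/local with Python 2.5.
PREFIX=/usr/local
VERSION=2.5
hgk_ext="$PREFIX/lib/python$VERSION/site-packages/hgext/hgk.py"

# Scratch copy of the config (hypothetical path); in real use this goes in ~/.hgrc.
mkdir -p hgk-demo
cat > hgk-demo/hgrc <<EOF
[extensions]
hgk = $hgk_ext
EOF

# Show what was written.
cat hgk-demo/hgrc

# The hgk script itself still has to be on your PATH, e.g. (assuming ~/bin is on it):
#   cp /path/to/mercurial-source/contrib/hgk ~/bin/
```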

Sadly hgk is not nearly as good as the current gitk. It's missing a lot of the search options, the "you are here" yellow ball on the current checkout, status bars for slower updates, and probably some other features that I've forgotten for the moment. Running help says "About gitk" and shows that it is version 1.2 (C) 2005 - the current gitk must be about version 1.80 now and is in active development. The hgk equivalent is about 4 years behind and appears to be more or less abandoned.

The other graphical interface that git has is the "git gui" command. This is really needed for managing the index well, adding chunks of patches and so on. The index is handled automatically in Mercurial, so that's a big chunk of usage that you wouldn't need a gui for. There is an extension - hg record - that emulates the "git add -i" behaviour for adding partial patches. A hg gui would be handy for using that recording extension, if nothing else, but there's nothing included by default. The only likely candidate is hgct, which has not been updated for a couple of years.

Another command I use a lot is "git rebase -i". This lets you squash up commits, change the order of unrelated commits, drop patches, break up monolithic changesets into smaller ones, and so on. It is really good and easy to use. You run "git rebase -i someid" and get back a list of commits to do stuff with, something like this:

pick 84d3267 Add the key release waits back in
squash 63656f4 This seems to work on desmume at least
pick d34b4f3 WIP, damn sprites do not show up...
edit c9b6718 Mostly works

# Rebase 7c1db7e..c9b6718 onto 7c1db7e
#
# Commands:
# p, pick = use commit
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

Mercurial doesn't have an interactive rebase (there is a bundled rebase extension, but nothing like "git rebase -i"). On the other hand it does have the incomprehensible queues system. I've read the documentation, but I just don't get it at all. You need to remember a whole bunch of new Q commands, and you need to initialise the repository to tell it that you want to use queues. Then you can add, push and pop patches. I dunno. It just feels too heavyweight and requires too much planning up front.

The rebase command was added to the git core late in the game, but it integrates seamlessly with everything else (except submodules, but git submodules have other problems anyway). You don't need to plan to use git-rebase, you just do it. The visualisation is done in your editor - you don't have to push and pop patches to get them in the right order. You just move them about as lines of text. If it all goes wrong, you can run "rebase --abort" and you don't get screwed over by having all these strange extra patch commits, which is what MQ tends to do.

Now to push things to a new Bitbucket account, since Google Code doesn't let everyone use Mercurial yet. Creating the empty repository on Bitbucket was dead simple. Pushing was equally simple - hg push http://bitbucket.org/quirky/repo. You enter a username and password, which is far easier than the git equivalent, where you have to generate an SSH key, perhaps having to fiddle with ~/.ssh/config to tell it that SSH should use the user "git" when it connects to the GitHub server, and so on. I can see how this would be simpler for someone used to Subversion over HTTP. The Mercurial documentation on the push command seems easier to follow than the git equivalent, too.

Running that line will just push changes to the remote repository - it doesn't set up the remote branch, the equivalent of a "git remote add origin git@github.com:user/repo.git". To do that you have to edit .hg/hgrc and add a "[paths]" section pointing to the remote URL. The convention here is to set the URL to a value "default". i.e.

[paths]
default = http://example.com/repo

To see which changes have not been pushed to the remote repo with git, you can use gitk or git log --decorate. Both of these show where the remote head is and where the local one is. By inspection you can tell which commits are missing from the remote branch. The equivalent in Mercurial is to use "hg outgoing", which tells you which changes have yet to be pushed. Ah, another difference for the gitk/hgk list here: hgk doesn't show the remote branches at all. Anyway, Mercurial's outgoing command is handy. You could mostly emulate it in git using "git log origin/master..HEAD", but you need to know the name of the remote branch, which is surprisingly non-trivial in some cases.

Another feature of Git that I tend to use is creating patches to send to other people. I mostly use this together with git-svn. That way I can clone a Subversion repository, make my local commits and then send patches of those commits off in bug reports or to mailing lists or whatever. In Git this is done with "git-format-patch" and "git-send-email". Mercurial has a single command for that, "hg email". It's pretty much the same, but also sends email by default (like git-send-email). Sometimes you need to attach patches to HTML forms rather than emailing them, and so generating files is my preferred approach. By default git generates one file per commit - in fact I think that is the only way it works. Mercurial on the other hand only ever generates one file, with multiple patches inside. A nice option in Mercurial is "-o", which automatically generates the patches that are not upstream on the remote branch. Anyway, the command lines are subtly different, and the output is similar-but-not-the-same. Git has its single-patch approach, where you get out what you put in, plus it is less irritating in the number of questions it asks you (none). Mercurial creates a patch-bomb mbox and asks you for a To: email address, even if you don't want to email the file to anyone. It is probably a personal preference thing, but I like the way Git does it better here. They are both usable and scale better than "svn diff > ~/whatever.patch".

Another gotcha that caught me out was that "hg add" without arguments adds all untracked files for commit. This is like "git add .", but normally commands without arguments don't do naughty things. I didn't like the default behaviour here, it was too surprising, especially if your .hgignore file is not set up right. To fix it, you have to run "hg revert -a", which forgets about the added files.

Summary

Here's a big ol' list with the subtle, and not-so-subtle differences that bugged me. I really wanted to use a table, but blogger doesn't do tables without adding a bunch of blank lines...

Git: .gitignore
Hg: .hgignore, with "syntax: glob" to get the same behaviour as git

Git: .git/config, ~/.gitconfig, use git-config to modify the values
Hg: .hg/hgrc, ~/.hgrc, hg config is a non-core extension

Git: git commit -v
Hg: hg diff | less; hg commit

Git: gitk
Hg: hg view - there is some set up involving .hgrc and placing hgk in the path.

Git: git rebase
Hg: Mercurial Queues, kind of.

Git: git push URL ; git remote add origin URL
Hg: hg push URL ; $EDITOR .hg/hgrc ; [paths] default = URL

Git: gitk, git log origin/master..HEAD
Hg: hg outgoing

Git: git format-patch RANGE
Hg: hg email -m filename -o

Git: git gui
Hg: Nothing equivalent. There's "hg record" for "git add -i", but it is less user friendly.

Git: git add . ; Note the dot
Hg: hg add ; No dot needed. Take care with that! You'll have to run "hg revert -a" to fix the mess

The main arguments for using Mercurial over Git are better Windows support and ease of use. I can't comment on how well either works on Windows, but on Linux Mercurial worked more or less as expected. The second argument about hg being much easier than git... I don't know. Maybe. I'm tainted now that I more or less grok DVCS. But given the difficulty in setting up most of the core plugins and the lack of good core GUIs (gitk/git-gui), I disagree. The basic work flow is more or less the same, but Git has this batteries-included feel while Mercurial is more limited. That is mostly down to the defaults, though: once you enable the optional extensions there isn't that much real difference. Besides, I think that for someone used to Subversion who has never used DVCS, either tool would be overwhelming at first.

What I do know is that the GUI tools that come with Git are a huge help for getting up to speed with what is going on "under the hood". Seeing all the commits and their hierarchy in gitk is far easier than reading the log or viewing the gitweb page. Especially if you start doing rebases and history changing stuff. Maybe Mercurial's one-branch approach and lack of history-munging commands makes all these extras unnecessary. Who knows. Time will probably tell.

Either way it's time to get stuck in to learning Mercurial. I'll definitely start using Mercurial when Google Code offers it to everyone, because the current GitHub/GC integration works but it's a bit ad-hoc. It's missing automatic links from issues to commits, the source tab looks a bit wrong, the Updates feed doesn't update, little things like that. I like the GitHub UI for code - you get the last change right there on the front page - but for non-coders, I think it could be a bit daunting. Google Code's interface is cleaner and easier to navigate. Also, Subversion is missing features like real tags (no, the /tags directory is not the same), and a couple of projects I've got on GC still use the default Subversion just because it's easier to set up and everything. Mercurial will be a nice bonus feature.

3 comments:

  1. Anonymous, 11:11 am

    Two comments:
    1) mercurial does have an included rebase extension.
    2) there is (at least working on POSIX systems) a decent GUI for record called crecord.

  2. Yeah, it does. But it is nothing like the git-rebase--interactive command. And the crecord extension is not a default one - you have to download it, hope that it is compatible with the current version... and it is still less good than git-gui. I dunno.

    Mercurial is a bit disappointing. It feels like a hobbled, manual version of git to me. I think I finally grokked MQs. It's like having to manually create and maintain git's reflog.

    I really want to like Mercurial - it seems easier to hack on the code side - but I honestly can't see what end-user advantages it has over git. Easier to set up and faster http transfer? Less fanboyism?

    Meh. Tarballs and `cp -r mydir mydir2` FTW.

  3. Anonymous, 10:28 am

    I came from the other side (Mercurial to Git) and one slight advantage when you are a noob is that Mercurial is a bit 'pay as you go': as you get familiar and comfortable with the basics, you enable extensions, try more sophisticated options, etc. Git is more powerful, but a bit more confusing at the beginning (multiple binaries, command names a bit different from CVS/SVN (ex: checkout), staging area, n-way merges, etc.). I see Mercurial as a nice introduction to DVCS and Git.
