Archive for the ‘Development’ Category

Why CITS got git

Tuesday, June 23rd, 2009

Subversion is good, don’t get me wrong; git is just better! In a nutshell, git is a fast, distributed version control system that supports a much more flexible style of development than Subversion. The two features that prompted me to switch were its distributed nature and its support for branching.

The distributed part is interesting; instead of checking out “the latest version” then checking in your changes, you make a complete clone of the repository (including history), do whatever you want with it, then merge your changes to wherever you want, however you want. The main reason this is interesting to me (and the main thing that prompted me to use git) is that it means I can work completely ‘offline’, in that I can view full history, make commits, branch and so on without having access to the server. My server is actually an ordinary desktop PC that dual boots into Windows, so sometimes it’s not available, and being able to still work properly when it’s not on is brilliant. Using Subversion, I would end up building up large changelists whilst waiting to be able to commit, which is a bad thing. Additionally, I wouldn’t be able to view history or revert anything to a previous version, which are two things that are needed more often than you’d think.
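To make the ‘offline’ point concrete, here’s a minimal sketch you can run in a scratch directory. It assumes git is installed; the directory names (server.git, laptop) and the file are made up, and a bare repository on the local disk stands in for the real server.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# A bare repository on the local disk plays the part of the server.
git init -q --bare server.git
git clone -q server.git laptop 2>/dev/null   # hides the "empty repository" warning
cd laptop
git symbolic-ref HEAD refs/heads/master      # same branch name whatever git's default is
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git push -q origin master

# The "server" goes away (say, it has been rebooted into Windows).
mv ../server.git ../server-offline.git

# Committing and browsing history still work, because the clone holds everything.
echo "second" >> notes.txt
git commit -q -a -m "Second commit, made offline"
git log --oneline

# Only operations that need to talk to the server fail.
if ! git push -q origin master 2>/dev/null; then
    echo "push will have to wait until the server is back"
fi
```

Both commits are in the local history even though the second was made while the ‘server’ was missing; the push simply waits for later.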

Git’s branching is also very useful to me. It supports fast and local branching and merging; branching is a lightweight operation, you can have branches that are local to only one repository, and merging is pretty brilliant.

I tend to flit between tasks depending on how I feel when I sit down to do some work. Sometimes I’ll want to do some refactoring, sometimes I want to work on a new feature, sometimes I feel like making some new art. Under Subversion, one way to do this would be to work on all of these in the same working directory and selectively commit parts of that when they are ready. That’s dangerous because it’s then impossible to test changes in isolation – I may end up checking in something that depends on something I’m not checking in. It also becomes a pain when changes overlap, as you then can’t selectively commit anyway.

You could also save changelists and apply them when you want to do some more work, or commit, but then you’re losing many of the benefits of source control, in that each of these features is effectively now one version, and they’re not backed up on a server.

Another way to do it would be to create branches and work within those, but Subversion doesn’t do branches very nicely. Branches are basically just a dumb copy of the whole working directory to another directory, and as such are checked in to the main repository visible to everyone. With git, you can quite easily create and switch to a new branch, do some fiddling, switch to another branch, fiddle some more, merge the two branches and commit the result. This is perfect for me, and is exactly what I do, except with several days between each step! You can commit several times to each branch, you can store work in progress (via git-stash) so you don’t have ugly half-finished commits, and the ‘main’ repository (the central server visible to everyone) needn’t know about any of this. You just push to it once a series of commits that make up some number of features is deemed ‘done’.
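As a rough illustration of that flow, the following sketch runs the whole thing in a throwaway repository. The branch names (new-feature, refactoring) and file names are invented for the example, and it assumes nothing beyond git being installed.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # same branch name whatever git's default is
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > game.txt
git add game.txt
git commit -q -m "Initial commit"

# Start a feature on its own local branch and commit to it.
git checkout -q -b new-feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# A half-finished edit: park it with stash rather than committing it.
echo "work in progress" >> feature.txt
git stash > /dev/null

# Flit to a different task on another branch.
git checkout -q master
git checkout -q -b refactoring
echo "refactored" > refactor.txt
git add refactor.txt
git commit -q -m "Refactor"

# Merge both branches back into master.
git checkout -q master
git merge -q new-feature                          # fast-forward
git merge -q -m "Merge refactoring" refactoring   # a real merge commit

# Later, pick the parked work up again where it was left.
git checkout -q new-feature
git stash pop > /dev/null
```

None of these branches or the stash ever has to leave the local repository; only what you eventually push becomes visible on the server.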

Of course there are plenty of other niceties that come with using git, but these were the two most important things to me.

I’ll go into exactly how I use git in my next git post. Stay tuned :)

Migrating from Subversion to git

Sunday, May 17th, 2009

One of the things I’ve been busy with is migrating Cities in the Sky’s source control from Subversion to git. I’ll go over the reasons in full in a later post, but in a nutshell, git supports my workflow better than Subversion since it is distributed and supports fast branching and merging.

Now, there are some good guides dotted around the net which deal with migrating from Subversion to git, and with setting up a git server in a secure and maintainable manner, but I found them lacking a little in friendliness and details. There are also a lot of blog posts that more or less copy those guides and add in nothing of their own, sometimes to cheaply draw visitors to their sites, sometimes “in case I forget where the original is”, and sometimes just for the hell of it. Rather than add that little bit more redundancy to the Internet, I thought I’d write up a friendly explanation about just what these guides are doing, along with some extra advice I discovered to be useful.

Installing git and git-svn

So let’s get started! The first step is creating a git repository from your Subversion repository, which is made pretty easy by git-svn, a tool for using git locally with a remote Subversion repository. I’m going to go ahead and assume you’re using some flavour of Linux – I’m using Ubuntu. I believe git-svn is supposed to be installed as part of git, but for whatever reason it isn’t on Ubuntu, so to install git and git-svn I had to do this:

sudo apt-get install git-core git-svn

Converting your repository from Subversion to git

Once you have installed git and git-svn, follow this guide on Simplistic Complexity and you’ll have a local git repository containing the full history of your Subversion repository with all the Subversion users mapped to git users. This bit is pretty easy, so the only piece of advice I have for this step is that in the users file you can add entries even for those ‘no author’ commits – just use the exact string Subversion reports as the author of the commit. For example, my users file looked like this:

Ben = Ben Hymers <my.email@host.com>
ben = Ben Hymers <my.email@host.com>
(no author) = Ben Hymers <my.email@host.com>
root = Ben Hymers <my.email@host.com>

There will often be odd users like these near the beginning of a project’s history, from the initial import or just before proper authentication was set up. Including these lines ‘fixes’ the history for the git repository.
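If you’re not sure which author strings your history contains, one way to find out is to pull them from the output of svn log -q. The sketch below does that against a made-up excerpt so it runs without a Subversion repository; the helper name make_users_file and the placeholder name and email are my own, and you’d still edit the right-hand side of each line by hand.

```shell
# Turn "svn log -q" output into a skeleton users file, one line per
# distinct author string (including oddities like "(no author)").
make_users_file() {
    awk -F' [|] ' '/^r[0-9]+ / { print $2 " = Full Name <email@example.com>" }' | sort -u
}

users_file=$(mktemp)

# Made-up sample; in real use: svn log -q <your-svn-url> | make_users_file
make_users_file > "$users_file" <<'EOF'
------------------------------------------------------------------------
r4 | Ben | 2009-03-02 19:12:44 +0000 (Mon, 02 Mar 2009)
------------------------------------------------------------------------
r3 | ben | 2009-02-14 11:03:10 +0000 (Sat, 14 Feb 2009)
------------------------------------------------------------------------
r2 | (no author) | 2009-02-01 09:30:00 +0000 (Sun, 01 Feb 2009)
------------------------------------------------------------------------
r1 | root | 2009-02-01 09:00:00 +0000 (Sun, 01 Feb 2009)
------------------------------------------------------------------------
EOF

cat "$users_file"
```

The resulting file has the same shape as the one above, and git-svn accepts such a file through its --authors-file option (I’m assuming that’s how the linked guide uses it).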

Securely serving your git repository

The next step is to set up your server to serve your new git repository securely, since at the moment it is just on the local filesystem and will only be available to users that have permission to access the directory. Read through the excellent directions in this article on scie.nti.st. If you understood them perfectly then go ahead and use them; if not, read on!

Some background knowledge

It’s probably helpful to mention that a git repository is basically just a directory, which contains metadata and a big load of compressed bits and bobs which comprise the data of your project over its history, and optionally a working copy. If it has a working copy, which is typically the case when you’re a developer doing some work on the repository, the git data will be stored in ‘.git/’ in the root of your project, whilst the working copy will be present as normal. If it doesn’t have a working copy, it is what’s called a ‘bare’ repository, which is typically what a server will use as it has no need to modify anything, and the git data will just be stored straight in the directory. A ‘git server’ then is some process that makes this directory available to others via something other than just the filesystem.
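The difference is easy to see for yourself. This small sketch creates one repository of each kind in a temporary directory; the names with_working_copy and bare_repo.git are just for illustration.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"

# An ordinary repository: git's data lives in .git/ next to your files.
git init -q with_working_copy
ls -A with_working_copy          # prints: .git

# A bare repository: git's data sits directly in the directory.
git init -q --bare bare_repo.git
ls bare_repo.git                 # lists git's own files, such as HEAD, objects and refs

# git can tell you which kind it is looking at.
(cd bare_repo.git && git rev-parse --is-bare-repository)   # prints: true
```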

The way we’re going to serve up the git repository is via SSH. Once it’s set up, commands you issue to git that require the address of the repository (clone, push and so on) will cause git to SSH into the server machine, to the directory that is the git repository, to perform their monkey business.

I hadn’t really used SSH before this so it was all quite alien to me, but there are plenty of articles on SSH and public key authentication on the web; I suggest you read one of those to understand why we’re creating keys, how they’re used, and what they mean.

We set up a user called ‘git’ who will own all the repositories. We do this so we can keep the permissions as specific as possible (if a user is compromised, damage is limited) and to avoid having to set up a user on the server and grant them permissions to the appropriate git repositories every time a new external user wants access. Instead, all access to the git repositories will be done via the git user – external users will SSH into the server as user ‘git’.

It’s also handy to know that git repositories are commonly named with the suffix ‘.git’, to show that the directory is a git repository. It’s like a file extension for directories.

Finally, the git documentation says that addresses of the form “[user@]host.xz:path/to/repo.git” are equivalent to addresses of the form “ssh://[user@]host.xz/~/path/to/repo.git”.

So, with those nuggets of knowledge in hand it becomes a bit clearer what lines like “git clone git@my_server:my_repository.git” mean. ‘git clone’ is the command to clone a repository given as the argument. The strange-looking address can actually be reinterpreted as “ssh://git@my_server/~/my_repository.git”. Breaking that apart, we see that we are connecting using the SSH protocol as user ‘git’ to ‘my_server’, in the directory ‘~/my_repository.git’. ‘~’ is unix shorthand for the home directory, and since we are connecting as user ‘git’ this will be git’s home, which is where the repository resides. Isn’t that clever?
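Purely as an illustration, that reinterpretation can be done mechanically. The shell function below (scp_to_ssh_url is my own name, not something git provides) splits the short address at the first colon and rebuilds it in the long form, following the equivalence quoted from the git documentation above.

```shell
# Rewrite "[user@]host:path" as "ssh://[user@]host/~/path".
scp_to_ssh_url() {
    address=$1
    host_part=${address%%:*}     # everything before the first colon
    path_part=${address#*:}      # everything after the first colon
    printf 'ssh://%s/~/%s\n' "$host_part" "$path_part"
}

scp_to_ssh_url git@my_server:my_repository.git
# prints: ssh://git@my_server/~/my_repository.git
```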

Setting up gitosis

Now, what’s this gitosis stuff all about, and how does it help you create and maintain git repositories? Well, you could create the git user, and log in as git to add users’ keys and to create repositories in your home directory yourself, but that would get a bit tedious, and quite tricky when you have multiple repositories that each have different permissions. Gitosis automates all that, and hosts the configuration via git in a clever recursive manner.

Follow the commands in the first section of the article linked above and you’ll have checked out (via git) the gitosis config, ready to create repositories and authenticate users. Note that initially there will only be one user able to access the configuration – the user whose public key you initially gave to gitosis – which is nice and safe. It doesn’t have to be someone local to the server. Configuration is pretty simple – there is a directory for public keys, into which you place keys you collect from users’ machines. Then there is a configuration file, in which you list groups, which consist of users that are members, and permissions to repositories. Repositories are created implicitly – list one as writeable by some group, and after you commit the config changes it will exist and you can start pushing to it.

So, to add a user, get them to generate a public key and give it to you. You then name the key appropriately (e.g. “ben@windowsmachine.pub”) and place it in keydir/. Find or create a group that has the permissions you want the user to have, and add the name of the public key file (without the .pub extension) to the ‘members’ line of that group.

If you want to allow more people access to the configuration, do the same and add them to the default ‘gitosis-admin’ group, which has write access to the gitosis-admin repository, the one you are editing. You won’t want to go doing this too much for security purposes, but I did it to grant myself access from my other machines. Commit your changes and gitosis will update its configuration automagically.
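To give a feel for what the result looks like, here’s a sketch that builds a toy version of the configuration in a temporary directory rather than in a real gitosis-admin checkout. The group name ‘developers’, the key name ‘ben@laptop’ and the key contents are all made up, and the ‘members’ and ‘writable’ keys reflect my understanding of gitosis’s config format, so check them against the linked article.

```shell
set -e
admin=$(mktemp -d)       # stand-in for your gitosis-admin checkout
cd "$admin"
mkdir -p keydir

# Placeholder only: a real file would contain the user's actual public key.
echo "ssh-rsa PLACEHOLDER-KEY ben@windowsmachine" > "keydir/ben@windowsmachine.pub"

# Key file names, minus ".pub", are what go on the members lines.
cat > gitosis.conf <<'EOF'
[gitosis]

[group gitosis-admin]
writable = gitosis-admin
members = ben@laptop ben@windowsmachine

[group developers]
writable = my_repository
members = ben@windowsmachine
EOF

cat gitosis.conf
```

In the real checkout you would then add the new files, commit, and push, at which point gitosis picks up the changes.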

The only advice I need to give on setting up gitosis is to follow the directions exactly, to the letter. A little knowledge is a dangerous thing, and since I have a little knowledge I ended up skipping out some commands or doing them my way since I thought I knew what I was doing, and as a result ended up undoing and redoing most of those steps several times until I realised I’d broken the whole thing and had to start from the very beginning.

Tying it all together

To tie these two steps together – converting your repository from Subversion to git, and hosting it via gitosis – you just need to push your new git repository to one you have listed in your gitosis config. You can either do this by giving the address of the repository to the push command, like so:

git push git@my_server:my_repository.git

Or you can add a ‘remote’ to your git config so push has a ‘default’ place to push to, like so:

git remote rm origin
git remote add origin git@my_server:my_repository.git
git push

We remove the remote ‘origin’ first since it’ll likely already be set to the location of the git-svn repository you cloned from, and there’s little point pushing back to that one! ‘origin’ is as it sounds – where the repository was cloned from – and is the default remote to push to. There are plenty of other rules that determine what ‘default’ means, but none will apply unless you’ve done something other than what’s listed here! Consult the git-remote and git-push documentation for more details.
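If you’d like to rehearse this before touching your real server, the sketch below does the same dance entirely on the local disk. Two bare repositories stand in for the old origin and for git@my_server:my_repository.git, and all the paths are invented. It names the branch explicitly in the push (with -u to remember it for next time), since what a bare ‘git push’ does by default has varied between git versions.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q --bare old_origin.git       # where 'origin' used to point
git init -q --bare my_repository.git    # plays the part of the gitosis-hosted repository

# A small repository standing in for the one converted from Subversion.
git init -q converted
cd converted
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Imported history"
git remote add origin "$sandbox/old_origin.git"

# Re-point 'origin' at the new home and push master to it.
git remote rm origin
git remote add origin "$sandbox/my_repository.git"
git push -q -u origin master

git remote -v
```

After this, the new repository’s master matches your local one, and later pushes from master can be a plain ‘git push’.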

From then on it’s business as usual, and you can use whatever git workflow you fancy. Since you’ve just converted from Subversion you’re probably used to the checkout -> edit -> update/merge -> commit workflow, in which case you’ll probably want to mimic that until you get used to git and want to try out anything crazy. I found the Git – SVN Crash Course useful for this.

There you have it then; hopefully with the combination of the articles I have linked and the extra explanation of what they are doing I have given, you will be able to migrate from Subversion to git in a secure and maintainable way with ease. Good luck!