Building Eclipse from Git

16. February, 2011

Andrew Niefer blogs about Building Eclipse from Git. Unfortunately, he doesn’t explain how to do that if you’re not a committer (i.e. have a user on eclipse.org).

I’m still hoping that one day, it will be possible for people outside the Eclipse team, to be able to build Eclipse projects.


Confused by DVCS? Joel can help

22. March, 2010

In his last article, Joel talks how DVCS confused him and how he solved the problem. One sentence in particular should be noted:

these systems think in terms of changes, not in terms of versions.

Still confused? Read his HgInit tutorial to get you up to speed with Mercurial (or any other DVCS because they are all pretty similar at that level).

PS: I prefer Mercurial to Git for twothree reasons:

1. I need a working DCVS, not a toolbox to build one. I prefer it when a smart guy has given all the hidden issues some thought, so I don’t have to.

2. There is a simple, working Windows installer.

3. It’s written in Python.


Distributed Software Development With Git

5. February, 2009

Real Men do it Themselves

There is stuff that changes the way you work. Then, there is stuff that changes the way you think.

When Donald E. Knuth wanted to write a series of books about The Art of Computer Programming, he found himself missing a program to convert his words into a beautiful book. To solve that problem, he invented TeX. When there were no nice fonts around, he added METAFONT. In a similar way, when Linus Torvalds found himself lacking a good version control system (VCS) after Bitkeeper decided to close access for OSS developers, he chose the only solution he had: He wrote his own.

And thus, Git was born and a lot of people living in abuse-protected web forums were in deep trouble. Even before them, the critics soared: What, another VCS?

Subversion vs. Git

Especially the people around Subversion were not so pleased and many people wondered why Linus chose to do his own thing instead of building on existing code. One of the reasons is that Subversion can be thought as a very elaborate bug fix for CVS. It didn’t try to reinvent the wheel.

It also inherits some legacy: You have to setup a central server if you want to do distributed development outside of your LAN. Certain operations are slow, like checkout and update. Agreed, they are faster than CVS but try these with Git. And it’s monolithic software unless you’re willing to use your C compiler. There are only very few ways to interact with the repository from a shell script, only a few hooks to do custom stuff (like sending email). If you just wanted to add a small feature, it would mean real programming work instead of whipping together a quick shell script.

I’m by no means a critic of Subversion; I’m using it every day and I’m happy with it. My point is that it’s confining me in a pretty small box, just a little bit larger than CVS and with less problems. That doesn’t make it larger, though. An example.

Oh, the Pain

You have some files which you want to take home to work on. So you copy them on an USB drive, take them home, edit them, bring them back. When you return to work, a co-worker has changed one of the files. He tells you after you copied all the files from your USB drive back onto your work PC (“Who has time to read all those warnings? Yes to All!”)

The next day, you’re smarter and check in the files into Subversion (SVN). There is no need for a central server and when you ignore the warnings from SVN, you can create the repository on a network drive. When the drive fails in an inopportune moment, your repository will be data trash, but there are certain risks one has to take.

You checkout a copy on your USB drive and take that home. Since working on the file from your USB stick is too slow, you copy everything on your home PC and edit it there. When copying the files back on the USB stick, you notice a lot of write-protected files in .svn directories. Oh well, time for “Yes to All!” again.

After returning to work, you synchronize your checkout with the SVN repository. Life is great. Unless you have Linux at home and were not so careful about Carriage Return/Line Feed conversion and you find the copy of your data on the USB drive is now currupt. But who is using Linux anyway?

The real trouble starts when you feel the need to carry the repository with you. Imagine you have a great idea, you have the USB drive with you, but you’re neither at work nor at home. If you have a computer closeby, you could work on the copy on the USB drive but at the cost of either getting out of sync with your home or work copy.

Subversion, like CVS, only supports a single, central repository unless you use tools like SVK. SVK depends on Perl, though, and it adds nice little … err … rather big cryptic code strings to the commit in messages.

Git

Git, on the other hand, has been built on the “greenfield”. Torvalds could add all the features he wanted and avoid all the common mistakes inherited by the CVS legacy. From a 1000 feet, it’s a set of loosely coupled commands which work on an object database which allows to version objects. Git doesn’t care what an object is, it just versions it. This is pretty similar to SVN, maybe except that Git handles large files better. And that Git is faster for most operations.

The main difference between Git and SVN is that Git is decentralized. This means you can create as many repositories as you want and synchronize them. So in the example above, you can have one repository at work, one on your USB drive and one at home. You can work on all three of them independently and then use Git to figure out how to merge everything.

Remember the dreaded branches from CVS? SVN eased the pain considerably but with Git, everything is a branch to begin with.

To become happy with Git, there are two major steps you need to take. First, you must understand that there is no server. Forget about the idea of server. Git allows to synchronize different copies of a couple of files in different places without a server. To do this effectively, Git keeps some information in the .git directory. If you want do to this remotely, you can use Git as a server, too, but that is basically the same thing as using it locally. Except that the name of a different computer is involved.

The second step is that you don’t put more than one project into one Git repository. With CVS, we are used to use modules. With Subversion, you create subtrees with trunk and branch. With Git, you have one repository per project. Setting up a repository is so cheap, it really doesn’t make sense to have more than one project in it.


Try Git

31. December, 2007

I’m not cutting edge. I was a long time ago but today, I prefer mature stuff. That’s why I cultivate the luxury to ignore brand new, cool software until it’s not only cool but most of the wrinkles have been ironed out by those guys who like to bleed (it’s not called “cutting edge” for nothing).

So when Subversion came out, I stayed with CVS until version 1.2. I installed it and liked it immediately. Now, some years later, even the Eclipse plugins (Subclipse and Subversive) work so well that using them becomes uninterruptive.

The same is true for GIT. When Linus (of Linux fame) announced it, I knew that this guy was probably smart enough to know when to start a new version control system and when to stay with the flock. Alas, I never had the pressure to try it.

Actually, I had the pressure but I didn’t notice because I didn’t know enough about GIT, specifically what it does different than, say, Subversion (a.k.a “the worlds largest patch for CVS”). In nutshell, GIT is a set of commands to manage one or more repositories with the same or different versioned objects.

Confused? An example: You work at a paranoid company that won’t allow you to connect to some OSS CVS/SVN server to update the files of a project which you’re using in your daily work. Sad but quite common. Let’s ignore for a minute the relative dangers of using CVS (which you aren’t allowed to use) and using HTTP (which you can use). What options do you have?

You can download the files at home, put them on a USB drive and take them to work. That works pretty well. Now, one day, you make a change to the sources at home and at work. Only, you forget to commit one of them. So you keep updating the sources and eventually, you will run into problems merging the files because of that one uncommitted change.

This happens because neither CVS nor SVN allow you to “clone” the source repository. GIT works differently: When you start, you either create an empty repository or you copy an existing one. When you do that, you get everything. If that other repository gets corrupted, it could be restored completely from your copy. (Remember? “Backups are for wimps; real man put their software on public FTP servers and let the world mirror them!”)

Also, GIT will track the work in each clone repository in an individual branch (unlike SVN where everyone usually works on the same branch). When you checkout the latest version in your working directory, you don’t work on the same branch as any other developer. If you commit something, that’s in your local branch only. If you break something, no one but you will notice.

Only when you’re satisfied with your work, you “push” your changes over to other people. You needn’t even push it to everyone at once. You can send your changes to a few colleagues first, before pushing them to the main project repository for everyone to see. Or you can keep it. GIT will happily download all the branches from the main repository without disturbing yours unless you say “Hey, I want to merge to main development into my code.”

At that time, you will notice something: GIT will ask you to commit all your work first if you have uncommitted files in your working directory. In SVN this isn’t possible or at least not that simple; you’d have to setup a SVN branch for each merge you do. For GIT, it’s simple because from its point of view, the changes are not related in the first place. True, the do have a common ancestor somewhere but after that, they are as independent as they can be.

So you have the great luxury to be able to commit all changes you made, saving all your work before you do the actual merge. You can even save stuff away that you don’t want to merge, so every modified file after that is actually part of the merge. This also means if something goes wrong, you can simply roll back all changes by checking out the head of your local development branch. Even better, you can create better diffs. It’s simple to create a diff for a) everything you changed (diff common-ancestor local-branch-head), b) everything everyone else changed (common-ancestor global-branch-head) and c) everything you changed to do the merge (diff local-branch-head working-copy). In SVN, you usually see a mix of a) and c), at least.

So, if you usually work from several places and can’t keep a network connection to the central server all the time, give GIT a try when you start your next project. It’s worth it.