What’s Your Mission?

2. November, 2009

There is another nice article from Joel Spolsky: Figuring out what your company is all about. It’s all about

“We help $TYPE_OF_PERSON be awesome at $THING”

So what do you work on and how does it help your customers to be awesome at something? If you can’t answer this simple question, then you should sit down and ponder why not. It will help you to achieve your goals.

There is one point about the article, though. Joel says: “We help the world’s best developers make better software.” Uh … only the best? How about the vast majority, the good ones?


If You Ever Need To Design a Standard

2. November, 2009

… then keep it simple.

If you need a reason for this (other than plain old common sense), see this blog post.


Subtext: Visual Programming, New Angle

2. March, 2009

If you have no idea what subtext is, lean back and watch this presentation.

It’s nice to finally find another person concerned about the state of programming languages. I started with C, toyed a bit with some other languages, moved to Java and today, I’m working mostly with Java, Groovy and Python. I’m doing all my spare-time code in Python. Why Python? Because I get more bang for the key press. And my spare time is most valuable.

So while I thoroughly agree that the idea of subtext is convincing, it’s too limiting at the same time: There are simple problems which you can’t express well in subtext, for example: A switch with 10 cases and some complex code in each case. It would just become too wide. The same applies when formulas have more than ten parameters. Your flow tree looks nice but it takes more screen real estate than the “traditional” version.

So my argument is that we need a way to choose. Software projects need to give up the holy grail of “one language to rule them all.” The IDE should allow us to mix and match various languages and more complex “objects” like tables, rich text, animations. Why do I have to waste my time formatting tabular content in a Java file (think array of values) when I can have Excel? Why can’t Java read the data directly from Excel? Why can’t I embed Excel tables in Java source code and access them like a 2D array? Why can’t I use a rich word processor to write my comments? Why is TAB 1-8 spaces instead of “one level of indent”? Why do I have to use braces when I already indent my code?

Because our computers are not powerful enough today. With every key we press, we have to worry about RAM and performance. Because companies still believe in lock-in. Sun would probably add a cross-platform COM API into Java but will Microsoft port Excel to any platform where a Java compiler is available? Oh, we could use OpenOffice. Let’s see. People working for a software development company that has more than two employees: Comment here if your company policy allows you to use OO instead of Office. Now let’s see how long it takes to get 10 comments.

In the end, what we have today is the most simple thing that actually works and doesn’t take too much RAM. I hope the time is ripe for the next step. I’m sick of fixed-width fonts, curly braces and source code which is 1% functionality and 99% “make the damn compiler happy”.


Distributed Software Development With Git

5. February, 2009

Real Men do it Themselves

There is stuff that changes the way you work. Then, there is stuff that changes the way you think.

When Donald E. Knuth wanted to write a series of books about The Art of Computer Programming, he found himself missing a program to convert his words into a beautiful book. To solve that problem, he invented TeX. When there were no nice fonts around, he added METAFONT. In a similar way, when Linus Torvalds found himself lacking a good version control system (VCS) after Bitkeeper decided to close access for OSS developers, he chose the only solution he had: He wrote his own.

And thus, Git was born and a lot of people living in abuse-protected web forums were in deep trouble. Even before them, the critics soared: What, another VCS?

Subversion vs. Git

Especially the people around Subversion were not so pleased and many people wondered why Linus chose to do his own thing instead of building on existing code. One of the reasons is that Subversion can be thought of as a very elaborate bug fix for CVS. It didn’t try to reinvent the wheel.

It also inherits some legacy: You have to set up a central server if you want to do distributed development outside of your LAN. Certain operations are slow, like checkout and update. Agreed, they are faster than CVS but try these with Git. And it’s monolithic software unless you’re willing to use your C compiler. There are only very few ways to interact with the repository from a shell script, only a few hooks to do custom stuff (like sending email). If you just wanted to add a small feature, it would mean real programming work instead of whipping together a quick shell script.

I’m by no means a critic of Subversion; I’m using it every day and I’m happy with it. My point is that it’s confining me in a pretty small box, just a little bit larger than CVS and with fewer problems. That doesn’t make it larger, though. An example.

Oh, the Pain

You have some files which you want to take home to work on. So you copy them onto a USB drive, take them home, edit them, bring them back. When you return to work, a co-worker has changed one of the files. He tells you after you copied all the files from your USB drive back onto your work PC (“Who has time to read all those warnings? Yes to All!”)

The next day, you’re smarter and check the files into Subversion (SVN). There is no need for a central server and when you ignore the warnings from SVN, you can create the repository on a network drive. When the drive fails at an inopportune moment, your repository will be data trash, but there are certain risks one has to take.

You check out a copy onto your USB drive and take that home. Since working on the files from your USB stick is too slow, you copy everything onto your home PC and edit it there. When copying the files back onto the USB stick, you notice a lot of write-protected files in .svn directories. Oh well, time for “Yes to All!” again.

After returning to work, you synchronize your checkout with the SVN repository. Life is great. Unless you have Linux at home and were not so careful about Carriage Return/Line Feed conversion and you find the copy of your data on the USB drive is now corrupt. But who is using Linux anyway?

The real trouble starts when you feel the need to carry the repository with you. Imagine you have a great idea, you have the USB drive with you, but you’re neither at work nor at home. If you have a computer close by, you could work on the copy on the USB drive, but at the cost of getting out of sync with your home copy, your work copy, or both.

Subversion, like CVS, only supports a single, central repository unless you use tools like SVK. SVK depends on Perl, though, and it adds nice little … err … rather big cryptic code strings to the commit messages.

Git

Git, on the other hand, has been built on the “greenfield”. Torvalds could add all the features he wanted and avoid all the common mistakes inherited from the CVS legacy. From 1,000 feet, it’s a set of loosely coupled commands which work on an object database that lets you version objects. Git doesn’t care what an object is, it just versions it. This is pretty similar to SVN, maybe except that Git handles large files better. And that Git is faster for most operations.

The main difference between Git and SVN is that Git is decentralized. This means you can create as many repositories as you want and synchronize them. So in the example above, you can have one repository at work, one on your USB drive and one at home. You can work on all three of them independently and then use Git to figure out how to merge everything.

Remember the dreaded branches from CVS? SVN eased the pain considerably but with Git, everything is a branch to begin with.

To become happy with Git, there are two major steps you need to take. First, you must understand that there is no server. Forget about the idea of a server. Git lets you synchronize different copies of a couple of files in different places without a server. To do this effectively, Git keeps some information in the .git directory. If you want to do this remotely, you can use Git as a server, too, but that is basically the same thing as using it locally. Except that the name of a different computer is involved.

The second step is that you don’t put more than one project into one Git repository. With CVS, we are used to using modules. With Subversion, you create subtrees with trunk and branch. With Git, you have one repository per project. Setting up a repository is so cheap, it really doesn’t make sense to have more than one project in it.


What RAD Tools Are Out There?

4. February, 2009

I’ve just opened a new question on StackOverflow.com: What RAD Tools Are Out There?

If you know a great RAD (Rapid Application Development) tool or platform, leave a comment and earn reputation!


Pair Programming Preferences Dilemma

4. February, 2009

The basic idea behind pair programming is that you have one computer, one keyboard, two heads (one being mostly occupied with typing). There is just one problem, though: If you’re a developer like me, you’re using keyboard shortcuts. Lots of shortcuts. And you will have your own ways to achieve things. Those ways will be different from everyone else in the team. You’re not a clone, are you? Which leads to special shortcuts.

IDEA lets you switch prefs quickly with Ctrl-BackQuote (try that on a German keyboard …). Eclipse has no way to quickly switch the UI prefs (shortcuts, active toolbars/menus).

I’ve filed a new bug to track this.


What’s Wrong With Java: Angst APIs

3. February, 2009

If you’ve been using Java for a while, you will have encountered the need to start an external process. So you use Runtime.getRuntime().exec(...) … and you’re stuck. You can’t set the current directory for the new process, change the environment, etc. And you’ll also have to handle deadlocks.

Bad Java. Down. So Martin Buchholz came up with ProcessBuilder. I’ll use this API to explain my concept of “Angst API”. An Angst API is an API which keeps you afraid. You use it and there is the constant feeling that something might break. The default case (which should work out of the box) is in fact the hardest to make work right.

If you know a bit about processes, you know how easy it is to get a deadlock when one is reading from the other: Process 1 is trying to write some more data to process 2, which is waiting for process 1 to read the data it sent back a few moments ago. Deadlock.

To avoid this, you need to read the output streams of the external process (the streams your own process is reading from) in a separate thread. This is the default case. In the special case, when you know that the processes won’t exchange any data, you don’t need this but in the common case, you do. This is what is broken with the ProcessBuilder API: It makes the special case (no data exchange) simple and the common case hard. It even tries its best to make a fix hard: All classes involved are final, private, the “no trespassing” style.

Which is the other side of the Angst API: We don’t want to bloat the Java runtime, we are afraid that some user might reuse our code, we are afraid that the performance could suffer, we are afraid that error handling is more complex if we start a background thread (one thread would be enough to process the outputs of all external processes).

When you design an API, don’t be afraid. Make it a nice API, one which you love to write, which users love to use and which welcomes them with open arms. No one likes the scary neighbor who waits for trespassers with a gun in his arms.
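
The thread-per-stream workaround described above can be sketched roughly like this. It is only a minimal illustration under a few assumptions: the names ProcessOutputDrainer, StreamCollector and run are made up for this example, stderr is merged into stdout so that a single background thread is enough, and the external process is assumed to need no input.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class ProcessOutputDrainer {

    /** Hypothetical helper: reads a stream to its end on a background thread. */
    static final class StreamCollector extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private IOException failure;

        StreamCollector(InputStream in) {
            this.in = in;
            setDaemon(true); // never keep the JVM alive just for this reader
        }

        @Override
        public void run() {
            byte[] chunk = new byte[4096];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
            } catch (IOException e) {
                failure = e;
            }
        }

        /** Waits until the stream is exhausted and returns everything read as UTF-8 text. */
        String await() {
            try {
                join(); // join() also makes the reader's writes visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while draining output", e);
            }
            if (failure != null) {
                throw new UncheckedIOException(failure);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Runs a command and returns its combined stdout/stderr (assumption: it needs no input). */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);   // one stream to drain instead of two
        Process process = builder.start();
        process.getOutputStream().close();   // we send nothing to the child

        // Start draining BEFORE waiting, so the child can never block on a full pipe.
        StreamCollector collector = new StreamCollector(process.getInputStream());
        collector.start();

        int exitCode = process.waitFor();
        String output = collector.await();
        if (exitCode != 0) {
            throw new IOException("Command " + command + " exited with " + exitCode + ": " + output);
        }
        return output;
    }
}
```

With something like this, the common case shrinks to one call such as ProcessOutputDrainer.run(Arrays.asList("git", "status")), which is roughly what one would wish the API offered out of the box.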


Testing: Pay in Advance or Afterwards?

2. February, 2009

In a recent post, I talked about people ignoring the cost of some decision. On the blog “Joel on Software”, they talk about the same thing: How easy it is to fall into the “we must have strict rules” trap to protect ourselves against some vague fear of failure. Only, humans are really bad at sticking to rules. Or are they? Maybe it’s just that reality doesn’t care so much about rules because things change. If you built your castle on the belief that strong walls will protect you, the swamp around the basement is not going to care. You’re going down, chummer.

So we end up with a lot of rules which make exactly one thing simple: To assign blame. I’ve been working for a big company where we had a strict process for how projects were to be set up. There were lots of documents, forms and committees for starting a project and a lot of documents describing how to end it (put it into production, what documents to file, who to inform, you name it). It was a great process (in the sense of “big”, mind). The actual writing of the code was explained in a document which contained a single page. On that single page, they talked about how they would strive to write excellent, error-free code and that they would use a proven strategy, the waterfall model.

They built a huge, shiny castle on nothing.

If you go to a bank and tell them you have lots of $$$ and you need to pay some big bill somewhere in the future, their first question will be: How do you want to make that money work for you in the meantime? Just letting it rot under your desk is not very smart, right? You should invest it somewhere, so you will have $$$$$ or even $$$$$$$ when it comes time to pay the bill. Which makes sense. Contrary to that, when we write software, we tend to spend our money first instead of parking it in a safe place where it can return some revenue, being ever vigilant to be able to pay as the bills show up. Which is harder than just sitting back and relying on some mythical process someone else has written on a piece of paper a long time ago.

So when you ask: “Should I write tests for all my classes? For every line of code? How should I spend my money?” Then my answer will be: I don’t know. How can I? I know nothing about your project. But I can give you some ideas how to figure it out yourself.

“Should I write tests for all my classes?” That depends on what these classes are meant for. The more low-level the code, the more tests you should have. Rule of thumb: Tests yield more results in the basement. Make sure the ground you’re building on is sound. And behaves as you expect. The upper levels are mostly built from Lego bricks. They are easy to take apart and reshape. They are exchangeable, so you can get away with fewer tests. But every bug in the foundation will cripple anything above it.

“For every line of code?” No. Never. 1. It’s not possible. 2. Maintaining the tests will cost more than the real code. 3. Tests are simpler than the real code but you still make a constant number of mistakes per line of code. So this will only drive the number of bugs through the roof. 4. Strict, fixed rules never work (note the paradox).

“How should I spend my money?” One word: Wisely. Wisely means to think about your specific problem and find the unique solution. Do you know in advance how much each piece will cost? No. So the best you can do is a staggered approach: Invest a bit of money, check how it plays out. If it works well, spend more. If it doesn’t, scratch it, learn, try something else. Which you will be able to do since you didn’t put all your money on a single horse.

So what if your three month venture into agile development didn’t really work out? All you lost is three months. Other projects are deemed a “success” after going over budget by 100%, using twice the time that was estimated (and none of them were shorter than a year). But you will still have learned something. You paid for it, that wisdom is yours.

Use it wisely.


A Different View on Exceptions

30. January, 2009

The discussion about checked/unchecked exceptions is almost as old as Java. While we all have a point in our stance towards this, maybe we are looking at the problem from the wrong angle. Manuel Woelker wrote an article which concentrates on the receiver of the exception, the user, and how exceptions should behave to help the user: Exceptions From a User’s Perspective.

In a nutshell:

[…]the error message displayed to the user should also explain what can be done to correct the situation

Take this code:

Foo foo = cache.get( key );
Preconditions.checkNotNull( foo, "foo is null" );

This error message is wasting someone’s time. Use this instead:

Foo foo = cache.get( key );
Preconditions.checkNotNull( foo, "No Foo for %s", key );

See? That actually tells you why foo is null.

Can we do better than that? Yes, we can:

Foo foo = cache.get( key );
Preconditions.checkNotNull( foo,
    "No Foo for %s. Valid keys: %s",
    key, cache.keySet() );

As you can see, by adding just a little bit of extra information, you can prevent someone (maybe even yourself) from starting the debugger, trying to reproduce the bug and then trying to pry some useful hints of the cause from the runtime.
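
The snippets above rely on Guava’s Preconditions class. For readers without Guava, the same idea can be sketched in plain Java. This is only an illustration: HelpfulLookup and require are made-up names, and the sketch throws IllegalArgumentException where Guava’s checkNotNull would throw NullPointerException.

```java
import java.util.Map;

final class HelpfulLookup {

    /**
     * Returns the value stored under key, or fails with a message that names
     * the missing key and lists the keys which would have worked.
     */
    static <K, V> V require(Map<K, V> cache, K key) {
        V value = cache.get(key);
        if (value == null) {
            throw new IllegalArgumentException(
                String.format("No value for %s. Valid keys: %s", key, cache.keySet()));
        }
        return value;
    }
}
```

Listing every valid key is fine for a small cache; for a large one, you would probably want to truncate the list or name the nearest matches instead.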


How to Hide a Virus in Source Code

28. January, 2009

I’ve been looking for quite some time for this article: How can you hide a virus in the source code? Basically, you create a binary of a compiler which contains the virus and which is patched to infect other programs as it compiles them. This is a feature of bootstrapping a compiler.

Reflections on Trusting Trust by Ken Thompson.