The Cost of Safety

6. February, 2009

Worried about your safety? The safety of your wife/daughter/son/house/car/whatever? If you worried about something like that in the past and weighed your options to make it safer, did you consider the cost?

Paul Graham wrote a nice essay “Artists Ship” (after the remark by Steve Jobs). Please ignore his “only programmers love to work hard”. The rest of the argument is very convincing. When people talk about “improving” some situation (crime rate, child abuse, revenue streams), they often propose solutions but there is little to no discussion about the cost of said “solution”.

So we want to protect our children against molesters. Fair enough. Only in the discussion, you can’t argue with reason because it’s so emotional. People don’t know anything about the reasons why someone becomes a pedophile or how (and if at all) this can be treated. They want a “solution”, completely ignorant of the cost. It’s a fact that “better” solutions (which will catch more violators) will always harm more innocent people.

Let’s look at a related case. Make up your mind about this case: “Julie Amero, a 40-year-old substitute teacher from Connecticut is facing up to 40 years in prison for exposing her seventh grade class to a cascade of pornographic imagery.” (more). Guilty? Innocent? What’s “exposing” supposed to mean here? Did she show them intentionally? Such a simple case and so many questions …

Say I want to write a program that automatically searches the Internet for child pr0n and sends alerts to the authorities. I can’t. It’s not possible anymore in any western country because I could neither test my program nor use it: Even the download of child pr0n is illegal. It’s illegal before a human can see it. I wonder how all those web filters work … Maybe they build them in a country where child abuse is not illegal.

So you like to watch pr0n but don’t want to pay? The Internet is full of “free” ware. But downloading “good.jpg” might get you into jail, depending on what you might find in the image afterwards. Guilty? Innocent?

Most computers on the Internet are vulnerable to all kinds of attacks. It’s ridiculously simple to spread viruses and worms which effectively take over your computer. Who is guilty when a cracker puts illegal pictures on your PC? You, because you didn’t understand the technology? You, because it is too hard to catch the cracker? You, because the prosecution doesn’t understand the technology, either? You, because the jury can’t follow the explanations of the experts anymore?

On the other hand, a clever pervert might infect his computer deliberately, so he can always say “it was the virus!”. With today’s paint software, how hard is it to replace the head of an adult with one of a child and reduce the cup size? How hard is it to prove that the picture is real? How about pencil drawings? You do know that most paint programs come with “artistic filters”.

Such topics tend to become witch hunts where anyone can potentially be as guilty as we want them to be. Justice isn’t blind to protect the successful criminal, she’s blind in order to protect the innocent against prejudice.

So next time you ask for a new rule, think about the cost first.

Btw: During the research for this article, I googled for “teacher england hacker child porn”. Condemn me.



Distributed Software Development With Git

5. February, 2009

Real Men do it Themselves

There is stuff that changes the way you work. Then, there is stuff that changes the way you think.

When Donald E. Knuth wanted to write a series of books about The Art of Computer Programming, he found himself missing a program to convert his words into a beautiful book. To solve that problem, he invented TeX. When there were no nice fonts around, he added METAFONT. In a similar way, when Linus Torvalds found himself lacking a good version control system (VCS) after BitKeeper decided to close access for OSS developers, he chose the only solution he had: He wrote his own.

And thus, Git was born and a lot of people living in abuse-protected web forums were in deep trouble. Even before them, the critics soared: What, another VCS?

Subversion vs. Git

The people around Subversion in particular were not so pleased, and many wondered why Linus chose to do his own thing instead of building on existing code. One of the reasons is that Subversion can be thought of as a very elaborate bug fix for CVS. It didn’t try to reinvent the wheel.

It also inherits some legacy: You have to set up a central server if you want to do distributed development outside of your LAN. Certain operations are slow, like checkout and update. Agreed, they are faster than in CVS, but try them with Git. And it’s monolithic software unless you’re willing to use your C compiler. There are only very few ways to interact with the repository from a shell script and only a few hooks to do custom stuff (like sending email). If you just wanted to add a small feature, it would mean real programming work instead of whipping together a quick shell script.

I’m by no means a critic of Subversion; I’m using it every day and I’m happy with it. My point is that it’s confining me in a pretty small box, just a little bit larger than CVS and with fewer problems. Fewer problems don’t make the box larger, though. An example.

Oh, the Pain

You have some files which you want to take home to work on. So you copy them onto a USB drive, take them home, edit them, bring them back. When you return to work, a co-worker has changed one of the files. He tells you after you copied all the files from your USB drive back onto your work PC (“Who has time to read all those warnings? Yes to All!”)

The next day, you’re smarter and check the files into Subversion (SVN). There is no need for a central server, and if you ignore the warnings from SVN, you can create the repository on a network drive. When the drive fails at an inopportune moment, your repository will be data trash, but there are certain risks one has to take.

You check out a copy onto your USB drive and take that home. Since working on the files from your USB stick is too slow, you copy everything onto your home PC and edit it there. When copying the files back onto the USB stick, you notice a lot of write-protected files in .svn directories. Oh well, time for “Yes to All!” again.

After returning to work, you synchronize your checkout with the SVN repository. Life is great. Unless you have Linux at home and were not so careful about Carriage Return/Line Feed conversion, and you find the copy of your data on the USB drive is now corrupt. But who is using Linux anyway?

The real trouble starts when you feel the need to carry the repository with you. Imagine you have a great idea, you have the USB drive with you, but you’re neither at work nor at home. If you have a computer close by, you could work on the copy on the USB drive, but at the cost of getting out of sync with your home copy, your work copy, or both.

Subversion, like CVS, only supports a single, central repository unless you use tools like SVK. SVK depends on Perl, though, and it adds nice little … err … rather big cryptic code strings to the commit messages.

Git

Git, on the other hand, has been built on the “greenfield”. Torvalds could add all the features he wanted and avoid all the common mistakes inherited from the CVS legacy. From 1,000 feet, it’s a set of loosely coupled commands which work on an object database that lets you version objects. Git doesn’t care what an object is, it just versions it. This is pretty similar to SVN, except maybe that Git handles large files better. And that Git is faster for most operations.

The main difference between Git and SVN is that Git is decentralized. This means you can create as many repositories as you want and synchronize them. So in the example above, you can have one repository at work, one on your USB drive and one at home. You can work on all three of them independently and then use Git to figure out how to merge everything.

Remember the dreaded branches from CVS? SVN eased the pain considerably but with Git, everything is a branch to begin with.

To become happy with Git, there are two major steps you need to take. First, you must understand that there is no server. Forget about the idea of a server. Git lets you synchronize different copies of a couple of files in different places without one. To do this effectively, Git keeps some information in the .git directory. If you want to do this remotely, you can use Git as a server, too, but that is basically the same thing as using it locally. Except that the name of a different computer is involved.

The second step is that you don’t put more than one project into one Git repository. With CVS, we are used to using modules. With Subversion, you create subtrees with trunk and branch. With Git, you have one repository per project. Setting up a repository is so cheap, it really doesn’t make sense to have more than one project in it.


Testing the Impossible: Hardware

5. February, 2009

Victor Lin asked on StackOverflow.com: Should I write unit test for everything?

In his case, he wanted to know how to test an application which processes audio: Reads the sound from a microphone, does something with the audio stream, plays the result on the speaker. How can you possibly test a class which reads audio from a microphone? How can you test playing sound on a speaker?

As Steve Rowe pointed out: Use a loopback cable. Play a well defined sound on the speaker and check what the microphone receives.

I suggest moving this test case into a separate test suite so you can run it individually. Print a message asking the user to plug in the loopback cable before the test and wait for them to click “OK”. This is a unit test but not an automated one. It covers the setup steps of the hardware.

The next thing you will want to do is to check the code between the mic and speaker parts. These are now no longer dependent on the hardware and therefore simple to test.
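A minimal sketch of what “no longer dependent on the hardware” could look like. All names here (AudioSource, AudioSink, GainFilter) are made up for illustration and are not from Victor’s application; the gain filter merely stands in for whatever “does something with the audio stream”. The microphone and the speaker hide behind two tiny interfaces, so a test can feed plain arrays in and collect plain arrays out.

```java
/** Hypothetical abstraction of the microphone: returns null at the end of the stream. */
interface AudioSource {
    short[] read();
}

/** Hypothetical abstraction of the speaker. */
interface AudioSink {
    void write(short[] samples);
}

/** Example processing step: scales 16-bit samples by a gain factor and clips them. */
final class GainFilter {
    private final double gain;

    GainFilter(double gain) {
        this.gain = gain;
    }

    /** Pure function on sample data, so it can be tested without any hardware. */
    short[] process(short[] samples) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            long scaled = Math.round(samples[i] * gain);
            // Clip to the 16-bit range instead of letting the value wrap around
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
        }
        return out;
    }

    /** Moves blocks from source to sink until the source is exhausted. */
    void pump(AudioSource source, AudioSink sink) {
        for (short[] block = source.read(); block != null; block = source.read()) {
            sink.write(process(block));
        }
    }
}
```

In production, the source and sink would wrap the real audio API; in the automated tests, they are a list of arrays and a collector, and only the loopback suite ever touches the cable.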


3,337

4. February, 2009

Just a small post to celebrate my StackOverflow reputation of 3,337 🙂


What RAD Tools Are Out There?

4. February, 2009

I’ve just opened a new question on StackOverflow.com: What RAD Tools Are Out There?

If you know a great RAD (Rapid Application Development) tool or platform, leave a comment and earn reputation!


Pair Programming Preferences Dilemma

4. February, 2009

The basic idea behind pair programming is that you have one computer, one keyboard, two heads (one being mostly occupied with typing). There is just one problem, though: If you’re a developer like me, you’re using keyboard shortcuts. Lots of shortcuts. And you will have your own ways to achieve things. Those ways will be different from those of everyone else on the team. You’re not a clone, are you? Which leads to special shortcuts.

IDEA lets you switch prefs quickly with Ctrl-BackQuote (try that on a German keyboard …). Eclipse has no way to quickly switch the UI prefs (shortcuts, active toolbars/menus).

I’ve filed a new bug to track this.


What’s Wrong With Java: Angst APIs

3. February, 2009

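If you’ve been using Java for a while, you will have encountered the need to start an external process. So you use Runtime.getRuntime().exec(...) … and you’re stuck. Setting the current directory for the new process or changing its environment is clumsy: only the overloads that take an envp array and a working directory let you do it, and then you have to supply the complete environment yourself. And you’ll also have to handle deadlocks.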

Bad Java. Down. So Martin Buchholz came up with ProcessBuilder. I’ll use this API to explain my concept of “Angst API”. An Angst API is an API which keeps you afraid. You use it and there is the constant feeling that something might break. The default case (which should work out of the box) is in fact the hardest to get right.

If you know a bit about processes, you know how easy it is to get a deadlock when one is reading from the other: Process 1 is trying to write some more data to process 2, which is waiting for process 1 to read the data it sent back a few moments ago. Deadlock.

To avoid this, you need to read the output streams of the external process in a separate thread. This is the default case. In the special case, when you know that the processes won’t exchange any data, you don’t need this, but in the common case, you do. This is what is broken with the ProcessBuilder API: It makes the special case (no data exchange) simple and the common case hard. It even tries its best to make a fix hard: All classes involved are final or private, in the “no trespassing” style.
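For illustration, here is a rough sketch of the thread-per-process workaround described above. DrainedProcess and Drain are names I made up, not JDK classes, and the sketch makes simplifying assumptions: stderr is merged into stdout so that one drain thread is enough, the child gets no input, the output is decoded as UTF-8, and checked exceptions are wrapped so the call site stays short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper: runs a command and drains its output in a background thread. */
final class DrainedProcess {
    private DrainedProcess() {
    }

    /** Copies everything the child writes into memory so its pipe never fills up. */
    private static final class Drain extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private volatile IOException failure;

        Drain(InputStream in) {
            this.in = in;
            setDaemon(true);
        }

        @Override
        public void run() {
            byte[] chunk = new byte[8192];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
            } catch (IOException e) {
                failure = e;
            }
        }

        String text() {
            if (failure != null) {
                throw new UncheckedIOException(failure);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Runs the command, waits for it to finish and returns its combined output. */
    static String run(List<String> command) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true) // merge stderr into stdout: one drain thread suffices
                    .start();
            process.getOutputStream().close(); // assumption: the child needs no input
            Drain drain = new Drain(process.getInputStream());
            drain.start();
            int exitCode = process.waitFor();
            drain.join(); // make sure the last bytes have been collected
            if (exitCode != 0) {
                throw new IllegalStateException("exit code " + exitCode + ": " + drain.text());
            }
            return drain.text();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for " + command, e);
        }
    }
}
```

That is a lot of ceremony for “run this and give me the output”, which is pretty much the complaint.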

Which is the other side of the Angst API: We don’t want to bloat the Java runtime, we are afraid that some user might reuse our code, we are afraid that the performance could suffer, we are afraid that error handling is more complex if we start a background thread (one thread would be enough to process the outputs of all external processes).

When you design an API, don’t be afraid. Make it a nice API, one which you love to write, which users love to use and which welcomes them with open arms. No one likes the scary neighbor who waits for trespassers with a gun in his arms.


Stop Auto-Scrolling in SWT When The User Grabs The Scrollbar

3. February, 2009

Imagine you have a console widget in an SWT app. The app is doing some background work and dumps lots of text in the console. Something catches the eye of the user and she desperately tries to keep it in view while the console jumps to the bottom all of the time. Here is a simple way to notice that the user has grabbed the scrollbar.

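    // Needs java.util.concurrent.atomic.AtomicBoolean, org.eclipse.swt.SWT and
    // SelectionEvent/SelectionListener from org.eclipse.swt.events.

    // Field: true while the user is dragging the scrollbar thumb.
    private final AtomicBoolean userHoldsScrollbar
                = new AtomicBoolean ();

    // Setup code, run once after the console control has been created.
    control.getVerticalBar ().addSelectionListener (new SelectionListener () {
        public void widgetDefaultSelected (SelectionEvent e)
        {
            // NOP
        }

        public void widgetSelected (SelectionEvent e)
        {
            if (e.detail == SWT.DRAG)
                userHoldsScrollbar.set (true);   // thumb grabbed
            else if (e.detail == SWT.NONE)
                userHoldsScrollbar.set (false);  // drag ended
        }
    });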

In your auto-scroll code, check whether userHoldsScrollbar.get() is true and skip the jump to the bottom if it is.


Improving Performance of The Data Layer

3. February, 2009

Anyone here who is working on an application which doesn’t need to persist its model in a database? You can tune out here.

The rest of you know the problem: You have a change in your model, you need to persist it. The problem: The persistence layer is synchronous. Your application “hangs” until the change is written to the database and the database has confirmed the commit. The reason for this design is error handling: We are afraid of what might happen if there is an error in a commit for a transaction which we sent to the persistence layer five minutes ago. What’s the user supposed to do? How can the storage thread decide what to do with the following commits? Throw them away? Suppress them until the problem has been fixed five days later? Write them to a file? To name but a few.

If there is no error, none of this is an issue, and handing the writes to a background thread works astonishingly well. I implemented such a scheme with roughly 50 lines in my program upcscan. For Java with its unhealthy type system, that would be a bit more code and a whole lot harder to get right. Fortunately, the guys at Terracotta offer a support module for that.
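To give an idea of the shape of such a scheme in Java, here is a minimal sketch. It is neither the upcscan code nor the Terracotta module; WriteBehind is a made-up name and the store callback stands in for the real persistence call. A single writer thread keeps the commits in order, and the hard question from above (what to do when a commit fails) is dodged by merely recording the failure.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical write-behind wrapper around a synchronous persistence call. */
final class WriteBehind<T> implements AutoCloseable {
    // One thread only, so changes reach the store in the order they were saved
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final List<Throwable> failures = new CopyOnWriteArrayList<>();
    private final Consumer<T> store;

    WriteBehind(Consumer<T> store) {
        this.store = store;
    }

    /** Returns immediately; the change is written later by the writer thread. */
    void save(T change) {
        writer.execute(() -> {
            try {
                store.accept(change);
            } catch (RuntimeException e) {
                // Policy decision: this sketch only records the error and carries on
                failures.add(e);
            }
        });
    }

    /** Errors from writes that have been attempted so far. */
    List<Throwable> failures() {
        return failures;
    }

    /** Stops accepting changes and waits until all queued writes were attempted. */
    @Override
    public void close() {
        writer.shutdown();
        try {
            if (!writer.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("writes still pending after one minute");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while flushing writes", e);
        }
    }
}
```

The application calls save() and moves on. Whether later commits should still be applied after one has failed is exactly the decision this sketch leaves open.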

Read more in this article: Asynchronous Write Behinds and the Repository Pattern


VEX is Back

2. February, 2009

Do you remember vex (visual editor for XML)? I mentioned the project in my very first blog post two years ago. Development has started again as an Eclipse project. More at VisualEditorForXML.