Safer Java: Constants First

22. January, 2008

Here is a simple change in your Java development style that will save you a lot of time: When comparing something against a constant, always put the constant first. Examples:

    if (0 == x)...

    public final static String RUN = "run";
    if (RUN.equals (mode))...

That will look strange at first because we’re used to having the constants on the right-hand side (from assignments). So what’s the advantage of this? There are three:

  1. It will save you a lot of NullPointerExceptions when using equals().
  2. It’s more readable in big if/else “switch-alikes”, when you compare a variable against a lot of values.
  3. It avoids the accidental assignment as in if (x = 0) (which should have been if (x == 0) … and if you can’t see the difference between the two, you really should always do it this way!) because if (0 = x) generates a compile time error.
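A compact sketch of the first and third point (class and method names are mine):

```java
public class ConstantsFirst {
    public static final String RUN = "run";

    // Constant first: mode.equals(RUN) would throw a NullPointerException
    // when mode is null; RUN.equals(mode) simply returns false.
    public static boolean isRunMode(String mode) {
        return RUN.equals(mode);
    }

    public static void main(String[] args) {
        System.out.println(isRunMode(null));  // prints false, no NPE
        System.out.println(isRunMode("run")); // prints true

        // For booleans, if (flag = true) compiles and is a classic bug;
        // if (true = flag) is always a compile-time error.
    }
}
```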

Mixins in Java

14. November, 2007

Mixins are a powerful programming concept in dynamic languages because they allow you to implement aspects of classes in different places and then “plug” them together. For example, the “tree” aspect of a data structure (something having parents and children) is well understood. A lot of data can be arranged in hierarchical trees. Yet, in many languages, you cannot say:

   class FileTreeNode extends File mixin TreeNode

to get a class which gives you access to all file operations and allows you to arrange the items in a tree at the same time. This means you can’t directly attach it to a tree viewer. In some languages, like Python, this is trivial since you can add methods to a class any time you want. Other languages like C++ have multiple inheritance, which allows something like this. Alas, not at runtime.

For Java, the Eclipse guys came up with a solution: adapters. It looks like this:

    public <T> T getAdapter (Class<T> desiredType)
    {
        ... create an adapter which makes "this" behave like "desiredType" ...
    }

where “desiredType” is usually an interface of some kind (note: The Eclipse API itself is still Java 1.4, this is the generics version to avoid the otherwise necessary cast).

How can you use this?

In the simplest case, you can just make the class implement the interface and “return this” in the adapter. Not very impressive.

The next step is to create a factory which gets two bits of information: the object you want to wrap and the desired API. On top of that, you can use org.eclipse.core.internal.runtime.AdapterManager, which lets you register any number of adapter factories. Now we’re getting somewhere, and the getAdapter() method could look like this:

    @SuppressWarnings("unchecked")
    public <T> T getAdapter (Class<T> desiredType)
    {
        return (T)AdapterManager.getDefault ().getAdapter (this, desiredType);
    }

This allows me to modify the behavior of my class at runtime more cheaply and safely than using the Reflection API. Best of all, the compiler will complain if you try to call a method that doesn’t exist:

    ITreeNode node = file.getAdapter(ITreeNode.class);
    ITreeNode parent = node.getParent(); // okay
    node.lastModification(); // Sorry, this is a file method

Try this with reflection: Lots of strings and no help. To implement the above example with an object that has no idea about trees but which you want to manage in a tree-like structure, you need this:

  • A factory which creates a tree adapter for the object in question.
  • The tree adapter is the actual tree data structure. The objects still have no idea they are in a tree. So adding/removing objects will happen in the tree adapter. Things get complicated quickly if you have some of the information you need for the tree in the objects themselves. Think files: You can use listFiles() to get the children. This is nice until you want to notify either side that a file has been created or deleted (and it gets horrible when you must spy on the actual filesystem for changes).
  • The factory must interact with the tree adapter in such a way that it can return existing nodes if you ask twice for an adapter for object X. This usually means that you need to have a map to lookup existing nodes.
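The factory/map idea from the last point can be sketched like this (all names here are invented for illustration; a real Eclipse factory would implement IAdapterFactory):

```java
import java.util.IdentityHashMap;
import java.util.Map;

interface ITreeNode {
    ITreeNode getParent();
}

class TreeAdapterFactory {
    // Identity map: each wrapped object gets exactly one tree node,
    // so asking twice for object X yields the same adapter.
    private final Map<Object, ITreeNode> nodes =
        new IdentityHashMap<Object, ITreeNode>();

    public ITreeNode adapt(final Object o) {
        ITreeNode node = nodes.get(o);
        if (node == null) {
            node = new ITreeNode() {
                public ITreeNode getParent() {
                    return null; // the real factory would look up o's parent here
                }
            };
            nodes.put(o, node);
        }
        return node;
    }
}
```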

A very simple example of how to use this is to allow overriding equals() at runtime. You need an interface:

interface IEquals {
    public boolean equals (Object other);
    public int hashCode ();
}

Now, you can define one or more adapter classes which implement this interface for your objects. If you register a default implementation for your object class, then you can use this code in your object to compare itself in the way you need at runtime:

    public boolean equals (Object obj)
    {
        return getAdapter (IEquals.class).equals (obj);
    }

Note: I suggest caching the adapter if you don’t plan to change it at runtime. This allows you to switch it once at startup and equals() will still be fast. And you should not try to change this adapter while the object is stored as a key in a hash map … you will get really strange problems like set.add(obj); ... set.contains(obj) -> false etc.
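Here is a self-contained sketch of that caching idea; the adapter lookup is faked with a static field where real code would call getAdapter(IEquals.class):

```java
import java.util.Objects;

interface IEquals {
    boolean isEqual(Object a, Object b);
}

class Item {
    // Stands in for the registered default adapter; swap it once at startup.
    static IEquals defaultAdapter = new IEquals() {
        public boolean isEqual(Object a, Object b) {
            return a instanceof Item && b instanceof Item
                && Objects.equals(((Item) a).name, ((Item) b).name);
        }
    };

    final String name;
    private IEquals cached; // cached once so equals() stays fast

    Item(String name) { this.name = name; }

    private IEquals adapter() {
        if (cached == null) cached = defaultAdapter; // getAdapter(IEquals.class)
        return cached;
    }

    @Override public boolean equals(Object other) {
        return adapter().isEqual(this, other);
    }
    @Override public int hashCode() { return Objects.hashCode(name); }
}
```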

Or you can define an adapter which looks up a nice icon depending on the class or file type. The best part is that you don’t have API cross-pollution. If you have a file, then getParent() will return the parent directory, while if you look at the object from the tree API, it will be a tree node. Neither API can “see” the other, so you will never have to rename methods because of name collisions. ITreeNode node = file.getAdapter(ITreeNode.class) also clearly expresses how you look at the object from now on: as a tree. This makes it much simpler to write reliable, reusable code.


Downloading Sources for Maven2 Projects

25. September, 2007

If you ever need to download the source artifacts for a Maven2 project, there are several options: if you use Eclipse (either the Eclipse plugin for Maven2 or the Maven Plugin for Eclipse, also called “Maven Integration for Eclipse”), you can use the option -DdownloadSources=true from the command line or enable the “Download sources” preferences option.

Both have drawbacks. When you use the Maven Plugin in the Eclipse IDE, it will try to download the sources all the time, which blocks the workflow. If you just want to do it once, you have to enable and disable the option every time, plus you have to force the plugin to start the download (with 0.12, you can add and delete a space in the POM and save it). And it will only download the sources for a single project.

If you use the command line version to download the sources, it will overwrite the .project file and modify your preferences, etc. Also not something you will want.

There are two solutions. One would be the “-Declipse.skip=true” option of the command line plugin. Unfortunately, this option doesn’t work. It will prevent the plugin from doing anything, not only from writing the project files.

So if you have a master POM which includes all other projects as modules and all your projects are checked into CVS, you can run:

mvn eclipse:eclipse -DdownloadSources=true

to download all sources and then restore the modified files from CVS. I’ve opened a bug which contains a patch for the skip option. After applying it, -Declipse.skip=true will just skip writing the new project files but still download the source artifacts.


Unit Testing jsp:include

19. August, 2007

If you’re stuck with some JSPs and need to test them with MockRunner, you’ll eventually run into the problem of testing jsp:include. MockRunner doesn’t come with built-in support for this, but this itch can be scratched with a few lines of code:

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServlet;

import com.mockrunner.mock.web.MockRequestDispatcher;
import com.mockrunner.servlet.ServletTestModule;

/**
 * Allow to use jsp:include in tests.
 */
public class NestedMockRequestDispatcher extends MockRequestDispatcher
{
    public ServletTestModule servletTestModule;
    public HttpServlet servlet;

    public NestedMockRequestDispatcher (
            ServletTestModule servletTestModule,
            Class servletClass)
    {
        this.servletTestModule = servletTestModule;
        servlet = servletTestModule.createServlet(servletClass);
    }

    @Override
    public void include (ServletRequest request, ServletResponse response)
            throws ServletException, IOException
    {
        servlet.service(request, response);
    }
}

In your test case, add this method:

    public void prepareInclude(Class servletClass, String path)
    {
        NestedMockRequestDispatcher rd = new NestedMockRequestDispatcher (createServletTestModule(), servletClass);

        getMockRequest().setRequestDispatcher(path, rd);
    }

The path is absolute but without the servlet context. So if the included JSP is named “foo.jsp” and the context is “/webapp”, then path is "/foo.jsp". If that doesn’t work, print the result of getMockRequest().getRequestDispatcherMap() after the test and you’ll see the paths which are expected.

All that’s left is to call this method in setUp() for all JSPs that you need to test. If you forget one, the jsp:include just won’t do anything (i.e. you won’t get an error). To make sure you don’t miss any includes (especially ones which your lazy co-workers added after you wrote the test), I suggest that you check the map after the test run for entries which aren’t instances of NestedMockRequestDispatcher.
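That final check can be sketched as follows; the MockRunner types are replaced by plain stand-ins so the snippet stays self-contained:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class DispatcherAudit {
    // Returns the paths whose dispatcher is not one of ours, i.e. includes
    // that were triggered but never registered via prepareInclude().
    static List<String> unmockedPaths(Map<String, Object> dispatcherMap,
                                      Class<?> ourDispatcherType) {
        List<String> missed = new ArrayList<String>();
        for (Map.Entry<String, Object> e : dispatcherMap.entrySet()) {
            if (!ourDispatcherType.isInstance(e.getValue())) {
                missed.add(e.getKey());
            }
        }
        return missed;
    }
}
```

In a real test you would pass getMockRequest().getRequestDispatcherMap() and NestedMockRequestDispatcher.class, and fail if the returned list is non-empty.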


Testing BIRT

23. July, 2007

I’m a huge fan of TDD. Recently, I had to write tests for BIRT, specifically for a bug we’ve stumbled upon in BIRT 2.1 that has been fixed in 2.2: Page breaks in tables.

The first step was to setup BIRT so I can run it from my tests.

public IReportEngine getEngine () throws BirtException
{
    EngineConfig config = new EngineConfig();
    config.setLogConfig("/tmp/birt-log", Level.FINEST);
    
    // Path to the directory which contains "platform"
    config.setEngineHome(".../src/main/webapp");
    PlatformConfig pc = new PlatformConfig ();
    pc.setBIRTHome(basepath);
    PlatformFileContext context = new PlatformFileContext(pc);
    config.setPlatformContext(context);
    
    Platform.startup(config);
    
    IReportEngineFactory factory = (IReportEngineFactory) Platform
        .createFactoryObject(
            IReportEngineFactory.EXTENSION_REPORT_ENGINE_FACTORY);
    if (factory == null)
        throw new RuntimeException ("Couldn't create factory");
    
    return factory.createReportEngine(config);
}

My main problems here: finding all the parts necessary to install BIRT, copying them to the right places and finding out how to set up EngineConfig (especially the platform part).

public void renderPDF (OutputStream out, File reportDir,
        String reportFile, Map reportParam) throws EngineException
{
    File f = new File (reportDir, reportFile);
    final IReportRunnable design = birtReportEngine
        .openReportDesign(f.getAbsolutePath());
    //create task to run and render report
    final IRunAndRenderTask task = birtReportEngine
        .createRunAndRenderTask(design);
    
    // Set parameters for report
    task.setParameterValues(reportParam);
    
    //set output options
    final HTMLRenderOption options = new HTMLRenderOption();
    options.setOutputFormat(HTMLRenderOption.OUTPUT_FORMAT_PDF);
    options.setOutputStream(out);
    task.setRenderOption(options);
        
    //run report
    task.run();
    task.close();
}

I’m using HTMLRenderOption here so I could use the same code to generate HTML and PDF.

In my test case, I just write the output to a file:

public void testPageBreak () throws Exception
{
    Map params = new HashMap (20);
    ...
    
    File dir = new File ("tmp");
    if (!dir.exists()) dir.mkdirs();
    File f = new File (dir, "pagebreak.pdf");
    if (f.exists())
    {
        if (!f.delete())
            fail ("Can't delete " + f.getAbsolutePath()
                + "\nMaybe it's locked by AcrobatReader?");
    }
    
    FileOutputStream out = new FileOutputStream (f);
    ReportGenerator gen = new ReportGenerator();
    File basePath = new File ("../webapp/src/main/webapp/reports");
    gen.generateToStream(out, basePath, "sewingAtelier.rptdesign"
        , params);
    if (!f.exists())
        fail ("File wasn't written. Please check the BIRT logfile!");
}

Now, this is no test. It’s only a test when it can verify that the output is correct. To do this, I use PDFBox:

    PDDocument doc = PDDocument.load(new File ("tmp", "pagebreak.pdf"));
    // Check number of pages
    assertEquals (6, doc.getPageCount());
    assertEquals ("Error on page 1",
            "...\n" +
            "...\n" +
            ...
            "..."
            , getText (doc, 1));

The meat is in getText():

private String getText (PDDocument doc, int page) throws IOException
{
    PDFTextStripper textStripper = new PDFTextStripper ();
    textStripper.setStartPage(page);
    textStripper.setEndPage(page);
    String s = textStripper.getText(doc).trim();
    
    Pattern DATE_TIME_PATTERN = Pattern.compile("^\\d\\d\\.\\d\\d\\.\\d\\d\\d\\d \\d\\d:\\d\\d Page (\\d+) of (\\d+)$", Pattern.MULTILINE);
    Matcher m = DATE_TIME_PATTERN.matcher(s);
    s = m.replaceAll("23.07.2007 14:02 Page $1 of $2");
    
    return fixCRLF (s);
}

I’m using several tricks here: I replace the date/time string with a constant, I stabilize line ends (fixCRLF() boils down to s.replaceAll("\r\n", "\n")) and I do this page by page to check the whole document.
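Both tricks in isolation, assuming the page-header format shown above (the replacement text is arbitrary; it just has to be constant):

```java
import java.util.regex.Pattern;

class PageTextNormalizer {
    // Matches lines like "01.01.2008 09:15 Page 2 of 6"
    static final Pattern DATE_TIME = Pattern.compile(
        "^\\d\\d\\.\\d\\d\\.\\d\\d\\d\\d \\d\\d:\\d\\d Page (\\d+) of (\\d+)$",
        Pattern.MULTILINE);

    static String normalize(String s) {
        // Trick 1: pin the volatile date/time to a constant value.
        s = DATE_TIME.matcher(s).replaceAll("23.07.2007 14:02 Page $1 of $2");
        // Trick 2: stabilize line ends (this is what fixCRLF() does).
        return s.replaceAll("\r\n", "\n");
    }
}
```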

Of course, since getText() just returns the text of a page as a String, you can use all the other operations to check that everything is where or as it should be.

Note that I’m using MockEJB and JNDI to hand a datasource to BIRT. The DB itself is Derby running in embedded mode. This allows me to connect directly to a Derby 10.2 database even though BIRT comes with Derby 10.1 (and saves me the hassle of fixing the classpath which OSGi builds for BIRT).

@Override
protected void setUp () throws Exception
{
    super.setUp();
    MockContextFactory.setAsInitial();
    
    Context ctx = new InitialContext();
    MockContextFactory.setDelegateContext(ctx);
    
    EmbeddedDataSource ds = new EmbeddedDataSource ();
    ds.setDatabaseName("tmp/test_db/TestDB");
    ds.setUser("");
    ds.setPassword("");

    ctx.bind("java:comp/env/jdbc/DB", ds);
}

@Override
protected void tearDown () throws Exception
{
    super.tearDown();
    MockContextFactory.revertSetAsInitial();
}



What’s Wrong With Java Part 2b

23. July, 2007

To give an idea why I needed 5KLoC for such a simple model, here is a detailed analysis of Keyword.java:

LoC  Used by
 43  Getters and setters
 40  XML import/export
 27  Model
 27  equals()/hashCode()
 21  Hibernate mapping with annotations
 14  Imports
  2  Logging
174  Total

As you can see, boilerplate code like getters/setters and equals() needs 70 LoC or 40% (48% if you add imports). Mapping the model to XML is more expensive than mapping it to a database. In the next installment, we’ll see that this can be reduced considerably.

Note: This is not a series of articles about flaws in Hibernate or the Java VM, this is about the Java language (ie. what you type into your IDE and then compile with javac).


What’s Wrong With Java Part 2

21. July, 2007

OR Mapping With Hibernate

After the model, let’s look at the implementation. The first candidate is the most successful OR mapper combination in the Java world: Hibernate.

Hibernate brings all the features we need: it can lazy-load ordered and unordered data sets from the DB, map all kinds of weird relations, and it lets us use Java for the model in a very comfortable way: we write plain Java (POJOs, actually) and Hibernate does some magic behind the scenes that connects the objects to the database. What could be simpler?

Well, an OO language which is more dynamic, for example. Let’s start with a simple task: Create a standalone keyword and put that into the DB. This is simple enough:

// Saving Keyword in the database
Keyword kw = new Keyword();
kw.setType (Keyword.KEYWORD);
kw.setName ("test");

session.save (kw);

(Please ignore the session object for now.)

That was easy, wasn’t it? If you look at the log, you’ll see that Hibernate sent an INSERT statement to the DB. Cool. So … how do we use this new object? The first, most natural idea, would be to use the object we just saved:

// Saving Knowledge with a keyword in the database
Knowledge k = new Knowledge ();
k.addKeyword (kw);

session.save (k);

Unfortunately, this doesn’t work. It does work in your test but in the final application, the Keyword is created in the first transaction and the Knowledge in the second one. So Hibernate will (rightfully) complain that you can’t use that keyword anymore because someone else might have changed it.

Now, what? You have to ask Hibernate for a copy of every object after you closed the transaction in which you created it before you can use it anywhere else:

Keyword kw = new Keyword();
kw.setType (Keyword.KEYWORD);
kw.setName ("test");

session.save (kw);
kw = dao.loadById (kw.getId ());

Knowledge k = new Knowledge ();
k.addKeyword (kw);

session.save (k);

How to save Knowledge with a keyword in the database with transactions

Why do we have to load an object after just saving it? Well … because of Java. Java has very strict rules about what you can do with (or to) an object instance after it has been created. One of them is that you can’t replace methods. So what, you’d think. In our case, things aren’t that simple. In our model, the name of a Knowledge instance is a Keyword. When you look at the code, you’ll see the standard setter. But when you run it, you’ll see that someone loads the item from the KEYWORD table. What is going on?

public void setName (Keyword name) {
    this.name = name;
}

setName() method

Behind the scenes, Hibernate replaces this method using a proxy object, so it can notice when you change the model (setting a new name). The simplest solution would be to replace the method setName() in session.save() with one that calls the original setter and notifies Hibernate about the modification. In Python, that’s three lines of code. Unfortunately, this is impossible in Java.

So to get these proxy objects, you must show an object to Hibernate, let it make a copy (by calling save()) and then ask for the new copy, which is in fact a wrapper object that behaves just like your original object but also knows when to send commands to the database. Simple, eh?
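For the curious: the closest you can get in plain Java is java.lang.reflect.Proxy, which only works against interfaces — one reason Hibernate has to resort to bytecode generation for plain classes. A sketch of the interception idea (all names invented):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

interface INamed {
    void setName(String name);
    String getName();
}

class Named implements INamed {
    private String name;
    public void setName(String name) { this.name = name; }
    public String getName() { return name; }
}

class DirtyTracker implements InvocationHandler {
    private final Object target;
    boolean dirty; // "Hibernate" would queue an UPDATE when this flips

    DirtyTracker(Object target) { this.target = target; }

    public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
        if (m.getName().startsWith("set")) dirty = true; // notice the change
        return m.invoke(target, args);   // then delegate to the real object
    }

    static INamed wrap(DirtyTracker handler) {
        return (INamed) Proxy.newProxyInstance(
            INamed.class.getClassLoader(), new Class<?>[] { INamed.class }, handler);
    }
}
```

The caller only ever sees the INamed proxy; every setter call is observed before it reaches the wrapped object.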

Makes me wonder why session.save() doesn’t simply return the new object, since it is safer to use from then on … especially when you have a model which is modified over several transactions. In that case, you can easily end up with a mix of native and proxy objects, which will cause no end of headaches.

Anyway. This approach has a few drawbacks:

  • If someone else creates the object, calls your code and then continues to do something with the original object (because people usually don’t expect methods to replace objects with copies when they call them), you’re in deep trouble. Usually, you can’t change that other code. You lose. Go away.
  • The proxy object is very similar to but not the same as the original object. The biggest difference is that it has a different class. This means that in equals(), you can’t use this.getClass() == other.getClass(). Instead, you have to use instanceof (the copy is derived from the original class). This breaks the contract of equals(), which says that it must be symmetric.
  • If you have large, complex objects, copying them is expensive.
  • After a while, you will start to write factory methods that create the objects for you. The code is always the same: Create a simple object, save it, load it again and then return the copy. Apart from cut&paste, this means that you must not call new for some of your objects. Again, this breaks habits which leads to bugs.
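The class-identity problem from the second point above, boiled down to a minimal example (EntityProxy stands in for a generated proxy class):

```java
class Entity {
    final long id;
    Entity(long id) { this.id = id; }

    // Strict version: rejects the proxy even though it wraps the same row.
    boolean equalsByClass(Object o) {
        return o != null && getClass() == o.getClass() && id == ((Entity) o).id;
    }

    // instanceof version: accepts subclasses, so proxies compare as expected.
    boolean equalsByInstanceof(Object o) {
        return o instanceof Entity && id == ((Entity) o).id;
    }
}

class EntityProxy extends Entity {
    EntityProxy(long id) { super(id); }
}
```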

All in all, the whole approach is clumsy. It’s not really Hibernate’s fault, but the code is still ugly and hard to maintain (because it breaks the implicit rules we have become so used to). In Python, you just create the object and use it. The dynamic nature of Python allows the OR mapper to replace or wrap all the methods as needed and you never notice it. The code is clean, easy to understand and compact.

Another problem is the XML config files. Besides all the issues with Java XML parsers, it is always problematic to store the same information in two places. If you ever change your Java model, you had better not forget to update the XML or you will get strange errors. You can’t refactor the model classes anymore because there is code outside the scope of your refactoring tool. And let’s not forget code completion, which works pretty well for Java but not for XML files. If you’re lucky, someone has written code completion for your type of XML config. Still, there will be problems: if there is a new version, your code completion will lag behind.

It’s like regexp: Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. — Jamie Zawinski

Fortunately, Sun solved this problem with JPA (or at least eased the pain). JPA allows you to store the mapping configuration as annotations in the class file itself. Apart from a few small problems (like setting up everything), this works pretty well. Code completion works perfectly because any IDE which has it will pick up the latest and greatest version of your helper JARs without any customization. Just drop the new JAR in your classpath and you’re ready to go. Swell.

But there are more problems:

  • You must create a session object “somewhere” and hand it around. If you’re writing a webapp, this better be thread-safe. Not to mention you must be able to override this for tests.
  • The session object must track if you have already started a transaction and nest them properly or you will have to duplicate code because you can’t call existing methods if they use transactions.
  • Spring and AOP will help a lot but they also add another layer of complexity, you’ll have to learn another API, another set of rules how to organize your code, etc.
  • JAR file-size. My code is 246KB. The JARs it depends on take … 6’096KB, almost 25 times the size of my code. And I’m not even using Spring.
  • Even with JPA, Hibernate is not simple to use because Java itself is not simple to use.

In the end, the model was 5’400 LoC. I added a small UI to it using SWT/JFace, which added another 2’400 LoC.

If you look at the model in the previous installment, then the question is: Why do I need 5’000 LoC to write a program which implements an OR mapper for a model which has only three classes and 26 lines of code?

Granted, test cases and helper code take their toll. I could accept that this code needs four or five times the size of the model itself. Still, we have a gap.

The answer is that the defaults are either missing or bad. For our simple case, Hibernate could guess everything. Java could generate all the setters and getters, equals() and hashCode(). It’s no black magic to figure out that Relation has a reference to Knowledge, so there needs to be a database table which stores this information. Sadly, defaults in Java are always “safe” rather than “clever”. This is the main difference to newer languages: they try to guess most of the stuff and then let you fix those few exceptions that you always have. With Java, all the exceptions are handled, but you have to do the everyday stuff yourself.

The whole experience was frustrating, especially since I’m a seasoned Java developer. It took me almost two weeks to write the code for this small model, mostly because of a bug in Hibernate 3.1 and because I couldn’t get my mind around the existing documentation. Also, parent-child relations were poorly documented in the first Hibernate book. The second book explains this much better.

Conclusion: Use it if you must. Today, there are better ways.

Next stop: TurboGears, a Python web framework using SQL Objects.


What’s Wrong With Java, Part 1

11. July, 2007

As I promised at the Jazoon, I’m starting a series of posts here to present the reasons behind my thoughts on the future of Java. To do this, I’ll develop a small knowledge management application with the name Sensei, the Japanese word for “teacher”. The final goal is to have three versions in Java, Python and Groovy at the same (or at least a similar) level.

The application will sport the usual features: A database layer, a model and a user interface. Let’s start with the model which is fairly simple.

We have knowledge nodes that contain the knowledge (a short name and a longer description), keywords to mark knowledge nodes and relations to connect knowledge nodes. Additionally, we want to be able to organize knowledge nodes in a tree. Here is the UML:

To make it easier to search the model for names, I collect them in the keyword class. So a knowledge node has no String field with the name but a keyword field instead. The same applies to the relations. Here is what the Java model looks like:

class Keyword {
    public enum Type {
        KEYWORD,
        KNOWLEDGE,
        RELATION
    };

    static Set<Keyword> allKeywords;
    
    Type   type;
    String name;
}

class Knowledge {
    Keyword         name;
    String          knowledge;
    Knowledge       parent;
    List<Knowledge> children;
    Set<Keyword>    keywords;
}

class Relation {
    Keyword   name;
    Knowledge from;
    Knowledge to;
}

UML of Sensei Model

Note that I omitted all the usual cruft (private fields, getters/setters, equals/hashCode).

This model has some interesting properties:

  1. There is a recursive element. Many OR mappers show lots of examples of how to map a String to a field but, for some reason, examples with tree-like structures are scarce.
  2. It contains 1:N mappings where N can be 0 (keywords can be attached to knowledge nodes but don’t have to be), 1:N mappings where N is always >= 1 (names of knowledge nodes and relations) and N:M mappings (relations between knowledge nodes).
  3. It contains a field called “from” which is an SQL keyword.
  4. There are sorted and unsorted mappings (children and keywords of a knowledge node).
  5. Instances of Keyword must be garbage collected somehow. When I delete the last knowledge node or relation with a certain name, I want the keyword deleted with it. This is not true for pure keywords, though.

To summarize, this is probably the smallest meaningful model you can come up with to test all possible features an OR mapper must support.

Let’s have a look at the features we want to support:

  1. Searching knowledge nodes by name or keyword or relation
  2. Searches should be case insensitive and allow searching for substrings
  3. Adding child nodes to an existing knowledge node
  4. Reordering children
  5. Moving a child from one parent to another
  6. Adding and removing relations between nodes
  7. Adding and removing keywords to a knowledge node

In the next installment, we will have a look how Java and Hibernate can cope with these demands.


What’s Wrong With … Exceptions

29. June, 2007

Do you know what checked exceptions are for? They are one of the main features of Java, yet many people don’t know when to use checked or runtime exceptions. Sadly, this is not only true for beginners, who are regularly confused by this topic, but also for Java professionals. Moreover, when the topic comes up, a heated discussion flares up between the pro and the against groups. I’ll call them the vi and emacs groups, because they behave so similarly.

The vi group wants to get things done. Their tools must be simple, solid, always available. They despise checked exceptions because they feel they get in the way. The Emacs group, on the other hand, likes the comfy, all-encompassing tools, that read your lips and bring you coffee. For them, checked exceptions are a way to build more robust software.

I’m in the vi group. I’ve never bought the idea that checked exceptions make software more solid just as I don’t believe that laws reduce crime, but that’s another story. First, let us look at what checked exceptions really are. For the sake of your sanity, I omit java.lang.Error.

When Java was developed, C++ was all the hype. C++ had this thing called exceptions which could be used to report unusual (= exceptional) states. The Java designers looked at them and found that you could use them for two purposes: Error and exception handling. Errors are, unlike exceptions, expected states. When I open a file and the file doesn’t exist, that’s an error. When a user is asked for some info and that info is obviously wrong (text instead of a number), that’s an error. Errors happen outside the scope of your program. They are likely to happen and it makes sense to handle the error by, for example, showing a dialog to the user so she can solve the problem and there is a high chance that she can actually do it.

Exceptions are unexpected. They happen within the scope of the program but there is probably nothing you can do anymore after the problem happened. Null pointers are the standard example. When the program tries to follow a reference and that reference is null, what can it do? Obviously, something must be broken elsewhere but more often than not, there is nothing that could be done to fix the problem when it happens. Except maybe display a death message and crash.
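The distinction in code form (my example, not from the original text): a missing file is an expected condition the caller can recover from, so handling it locally makes sense:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class SafeRead {
    // Expected condition: the file may simply not be there.
    static String firstLineOrDefault(File f, String fallback) {
        try {
            Scanner s = new Scanner(f);
            try {
                return s.hasNextLine() ? s.nextLine() : fallback;
            } finally {
                s.close();
            }
        } catch (FileNotFoundException e) {
            return fallback; // recoverable: e.g. ask the user for another file
        }
    }
}
```

A null reference, by contrast, has no equivalent local recovery: by the time it surfaces, the real mistake happened somewhere else.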

The Java designers decided to invent checked exceptions to denote errors, illegal and unexpected states outside of the scope of your program. Let’s look at what they came up with:

  • FileNotFoundException is checked (correct).
  • IOException (the harddisk dies) is checked, too (correct but not very intuitive: what can the program possibly do when this happens?).
  • SQLException, which can mean your DB connection just died because the DB server was rebooted (error) or that you tried to insert data twice (exception), is checked. So it’s right sometimes and wrong sometimes; Oracle folds some 50’000 errors into SQLException, and according to the definition above, some of them are exceptions and some are errors.
  • IllegalArgumentException (when you pass wrong arguments to a method) is not checked (correct).
  • ClassCastException is not checked (correct).
  • ClassNotFoundException is checked; I believe this is wrong since the classpath is within the scope of the program and there is not a big chance that recovery is possible.

So as you can see, even the designers of the language were confused about when to use errors or exceptions (or they dreaded the amount of code that would have to be written if they had used an error). But it goes on: checked exceptions must be handled. If you don’t do anything, the compiler will stubbornly refuse to compile your code even though nothing is wrong with it. The code would work! It’s not like a syntax error or something; the compiler just bullies you. So especially beginners (who have to juggle all these many new things in their heads) are challenged by how to handle them. Some add the necessary throws statements to their methods until they have 20 of them in the method definition, and then they give up. Here are a few common patterns of what happens next:

void foo1() throws Exception { ... }

void foo2() {
  try { ... } catch (Exception ignore) { }
}

void foo3() {
  try { ...
  } catch (Exception e) {
    e.printStackTrace();
  }
}

void foo4() {
  try { ...
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}

void foo5() {
  try { ...
  } catch (Exception e) {
    throw new RuntimeException("Something was wrong");
  }
}

There are many problems with these approaches:

  1. foo1() hides which exceptions can be thrown, so there is no way to handle them higher up even if you could. The compiler won’t allow you to add specific catch statements (because these exceptions aren’t declared anymore), so you have to catch Exception and then use instanceof and a cast. Ugly at best.
  2. foo2() just swallows all exceptions, runtime and checked. They are simply gone! The code will start to behave erratically and there is no way to figure out why except with FindBugs or your eyeballs. Have you ever tried to find such a bug in 20’000 lines of code?
  3. foo3() seems to be an improvement but it isn’t: instead of getting a notification when a problem happens, someone (a human) has to check the output of the code manually. This doesn’t get better when the exception is sent to some logging tool, because logging tools usually don’t notify anyone on their own either (who wants to get 10’000 emails when something breaks in a loop?), so I omitted that case.
  4. foo4() is the most common, but it’s no solution either: all exceptions are again folded into one (so you can no longer tell which could happen and react accordingly), and RuntimeExceptions are wrapped (because they extend Exception) even though that is unnecessary.
  5. foo5() is my favorite. “Something” was wrong. Really? What? How do I find out after the fact? How do I test that my bugfix actually fixes the problem?
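For contrast, here is a sketch of a less lossy variant of foo4() and foo5(): keep the original exception as the cause, put context into the message, and don’t re-wrap runtime exceptions. The names (ConfigLoadException, loadConfig) are hypothetical, not from the article:

```java
public class Wrapping {
    // Hypothetical application-specific unchecked exception.
    static class ConfigLoadException extends RuntimeException {
        ConfigLoadException(String msg, Throwable cause) { super(msg, cause); }
    }

    static String loadConfig(String path) {
        try {
            return readFile(path);
        } catch (RuntimeException e) {
            throw e;  // already unchecked: pass through, don't double-wrap
        } catch (Exception e) {
            // Specific message plus cause: both the "what" and the "why"
            // survive for whoever catches this higher up.
            throw new ConfigLoadException("Cannot load config " + path, e);
        }
    }

    // Stand-in for real I/O so the sketch is self-contained.
    static String readFile(String path) throws java.io.IOException {
        throw new java.io.FileNotFoundException(path);
    }
}
```

This still converts checked to unchecked, but the stack trace now answers “what was wrong?” after the fact.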

If you think only beginners make these mistakes, search the JDK: you can find plenty of examples of each. It’s not a mistake; it’s the standard response of developers to a feature that is not well understood.

What can be done?

Many of the Java frameworks gave up on checked exceptions and converted all their code to use runtime exceptions instead. Spring and Hibernate are the most prominent examples, especially since they had such a huge codebase to fix. They probably put a lot of thought into that before investing countless hours into rewriting all the exception handling code and imposing the same work on all their clients.

Maybe this is the best solution, since Sun will probably not modify their compiler (and it’s only the compiler that needs to be fixed! The Java runtime, the VM, does not care about missing catch and throws statements). Unfortunately, this leaves a large gap. Personally, I don’t hate checked exceptions; I hate that the compiler insists on knowing where I can handle them. I also hate that the compiler hides half (or more) of the exceptions that could be thrown in a given piece of code from me. I have to compile the code in my head to get that information, which is wrong on every level. In both cases, the compiler (or the person who wrote it) makes assumptions about what I can do where in my code, or imposes limitations on me that only make sense for them.

What I would like to see is this: the compiler should collect all the exception information it can get and add it to the throws clause in the .class file (you can add runtime exceptions to the throws clause; you knew that, didn’t you?). There are very strict rules about what Java code can throw where, so this is hard but not impossible. True, it will never be perfect, but I’m only asking for best effort here. Next, the compiler should not complain when an exception is not handled anywhere. Sounds like a sure recipe for a painful death, but bear with me.
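The parenthetical claim above is easy to verify: a throws clause may legally list unchecked exceptions, and that declaration survives into the .class file, where tools can read it back via reflection. A small sketch:

```java
import java.lang.reflect.Method;

public class ThrowsClause {
    // Declaring a RuntimeException subtype here compiles fine and is
    // recorded in the class file's exception table for this method.
    static void lookup(String key) throws IllegalArgumentException {
        if (key == null) throw new IllegalArgumentException("null key");
    }

    public static void main(String[] args) throws Exception {
        Method m = ThrowsClause.class.getDeclaredMethod("lookup", String.class);
        // Reads the declared exceptions back out of the compiled class.
        for (Class<?> ex : m.getExceptionTypes())
            System.out.println(ex.getName());
    }
}
```

So the storage side of the proposal already exists; what is missing is a compiler that fills that clause in automatically and tooling that queries it.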

In the IDE, I would like to have a tool which allows me to see and filter all these exceptions. I would like to see all SQLExceptions that my DB code can throw. Ideally, I would like to see all the possible error codes as well (I’m dreaming here but you get the idea).

The goal is to have tool support for handling any number (from 0 to 100%) of exceptions that could occur based on my needs. There is a big difference whether I whip up some run-once tool for myself that scratches an urgent itch or whether I write a framework or code for a nuclear power plant.

Right now, the compiler is doing the wrong thing for everyone. That is bad.

Moreover, when you’re doing Test Driven Development (like I do and you should), having the compiler enforce error handling becomes a real nuisance, since your tests already check both the correct and the error cases. In that regard, the error handling in your code will always be as good as it can be. And if you look at this from the economic side, there is no point in handling an error that never happens (even if it could): the more rarely an error occurs, the more it becomes an exception.

More info: “Java’s checked exceptions were a mistake” by Rod Waldhoff and “Removing Language Features?” by Neil Gafter. These are just two examples of many you can find on the net. If you want to see why nobody gets it right, have a look at Chapter 11: Exceptions of the Java Language Specification or “Unchecked Exceptions – The Controversy”.


Back from JaZOOn, Fourth and Last Day

28. June, 2007

In the morning, Neal Gafter gave some insight into “Adding Closures to the Java Programming Language“. You remember anonymous classes? Well, closures are similar but solve all the problems associated with them: you can access and modify variables from the surrounding method, for example. You can use closures to replace all the listeners in Swing. Look at this code:

 InputStream is = createStream();
 try {
    doSomething(is);
 } finally {
    try { is.close(); }
    catch (IOException e) {
       log.warn("Error closing", e);
    }
 }

What do we have here? The actual information is the “doSomething()”. That’s where the interesting stuff happens. Everything else is boilerplate code. Now imagine you must change the logging. You have copied this code a thousand times. The stream often has a different name, or it’s a Reader. You can bet that you’ll forget to make the change in at least one place. Okay, it’s just logging, but how often have you seen this pattern repeated? A piece of code where the meat is buried deep inside other Java code, and the whole thing is duplicated with cut&paste all over the place. That’s what closures are for:

 with (InputStream is : createStream()) {
    doSomething(is);
 }

with() is a method which takes two arguments: An InputStream and a Closure. The code for with() looks somewhat like this:

 void with(InputStream is, {InputStream => void} block) {
     try {
         block.invoke(is);
     } finally {
         try { is.close(); }
         catch (IOException e) { log.warn("Error closing", e); }
     }
 }

As you can see, the code wrapping the closure is now in one place. The same pattern can be used when reading objects from a database (in this case, with() could handle the opening and closing of connections, statements and ResultSets). Neal showed examples of how to use this in listeners, or how to have a loop inside the with() method; in that case, “break” in the closure can exit the loop (if you have read enough objects from the database or enough files from the file system).
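To see exactly what the proposal saves, here is the same with() idea written in today’s Java with an anonymous class implementing a hypothetical one-method Block interface (my own sketch, not Neal’s code). The caller-side ceremony below is precisely what the closure syntax removes:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class WithStream {
    // One-method callback interface standing in for the closure type
    // {InputStream => void} from the proposal.
    interface Block { void invoke(InputStream is) throws IOException; }

    static void with(InputStream is, Block block) throws IOException {
        try {
            block.invoke(is);
        } finally {
            // The close-and-log boilerplate lives in exactly one place.
            try { is.close(); }
            catch (IOException e) { System.err.println("Error closing: " + e); }
        }
    }

    public static void main(String[] args) throws IOException {
        with(new ByteArrayInputStream("hi".getBytes()), new Block() {
            public void invoke(InputStream is) throws IOException {
                System.out.println(is.read());  // the "meat"; the rest is plumbing
            }
        });
    }
}
```

Compare the six lines of anonymous-class scaffolding in main() with the two-line closure version above: same semantics, but the control structure no longer drowns out the interesting code.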

Great stuff. Late but great.

After him, Danny Coward showed how Java will evolve into Java SE 7 and Java EE 7. Not many surprises here. Sun aims for an 18 to 24 month release cycle. That should give the expert groups enough time to come up with great and stable features. We’ll see about that. I would prefer if they came up with useful features that made developing software easier. For example, annotations in Java 5 could have been a great feature, but they can’t modify the code. That makes sense from a compiler point of view but castrates the feature. And don’t get me started on generics. I can’t even name a feature of Java 6 that I have seen and that helps me in my daily work.

Enough of that. Where is the big world spinning to? Henry Story showed Web 3.0 (and Web 2.7, a.k.a. Freebase): the “Semantic Web”. Every piece of information is annotated with a URI that tells what it is and how it relates to other information. The whole world seen in 3-tuples. Unfortunately, the tools aren’t there yet. If you don’t want to use VI and cwm, there is not much to choose from. You can try Protégé, Swoop or a commercial product, TopBraid Composer. All this looks very promising, especially in relation to JCR/Apache Jackrabbit. Jackrabbit allows you to store structured and unstructured data and manage it (search, modify, import, export). This is one step above relational databases, which offer only very limited capabilities when it comes to data that is only partly structured or not structured at all (like texts, images and videos). If you’re lucky, you can store these types of data, but searching them? Forget it.

The semantic web (SW) looks at the problem from the other side: it allows you to annotate data with types, so you can know (or rather, your programs can know) whether two “things” are the same, similar or related. Example: you are a person. You have a name. In the SW world, “you” is an object or entity of type “Person”. Your name is a value of type “Person:name”. If you know someone, you can attach the relation “knows” to the entity “you” and then refer to other Person objects. Since the code walking this graph knows that it’s traversing persons, it knows that there are names somewhere, so it can display them on the screen when you list all the people you know, etc. For more information, see Henry Story’s BabelFish blog. Maybe start here.

He explained many more details and answered questions in the long Q&A session after his talk. I wished there had been more of these. Many talks raised interests but 5 minutes were never enough to ask any complicated questions.

The talk “JCR, the Content Repository API for Java” wasn’t very interesting for me because it mainly focused on the API. Peeter Piegaze showed different node types and typical Java code. Therefore, I didn’t attend “Content Management with Apache Jackrabbit” but “Development of a 3D Multiplayer Racing Game“. I had hoped for some insight into the problems Evangelos Pournaras had run into while developing the game, or details of the physical simulation. Instead, he listed the features and showed the UML diagram of the game. I was very disappointed.

Afterwards, the conference closed with the Jazoon Jam. Over the past several days, the moderator had told us about something called “Lightning Talks”: people walk on stage, get the mike for two minutes, and talk about whatever they want. I was a bit skeptical of the concept, but it works. Speakers don’t have time to dally around (setting up the laptop counts against the two minutes!) and get their point across quickly. A nice way to close a conference.

All in all, a positive experience. If possible, I’ll attend the Jazoon’08.