Traits for Groovy/Java

25. June, 2009

I’m again toying with the idea of traits for Java (or rather Groovy). Just to give you a rough idea if you haven’t heard about this before, think of my Sensei application template:

class Knowledge {
    Set tags;
    Knowledge parent;
    List children;
    String name;
    String content;
}
class Tag { String name; }
class Relation { String name; Knowledge from, to;

A most simple model but it contains everything you can encounter in an application: Parent-child/tree structure, 1:N and N:M mappings. Now the idea is to have a way to build a UI and a DB mapping from this code. The idea of traits is to implement real properties in Java.

So instead of fields with primitive types, you have real objects to work with:

    assert "name" == Knowledge.name.getName()

These objects exist partially at the class and at the instance level. There is static information at the class level (the name of the property) and there is instance information (the current value). But it should be possible to add more information at both levels. So a DB mapper can add necessary translation information to the class level and a Hibernate mapper can build on top of that.

Oh, I hear you cry “annotations!” But annotations can suck, too. You can’t have smart defaults with annotations. For example, you can’t say “I want all fields called ‘timestamp’ to be mapped with an java.sql.Timestamp“. You have to add the annotation to each timestamp field. That violates DRY. It quickly gets really bad when you have to do this for several mappers: Database, Hibernate, JPA, the UI, Swing, SWT, GWT. Suddenly, each property would need 10+ annotations!

I think I’ve found a solution which should need relatively few lines of code with Groovy. I’ll let that stew for a couple of days in by subconscious and post another article when it’s well done 🙂


What’s Wrong With Java Part 2

21. July, 2007

OR Mapping With Hibernate

After the model, let’s look at the implementation. The first candidate is the most successful OR mapper combination in the Java world: Hibernate.

Hibernate brings all the features we need: It can lazy-load ordered and unordered data sets from the DB, map all kinds of weird relations and it lets us use Java for the model in a very comfortable way: We just plain Java (POJO‘s actually) and Hibernate does some magic behind the scenes that connects the objects to the database. What could be more simple?

Well, an OO language which is more dynamic, for example. Let’s start with a simple task: Create a standalone keyword and put that into the DB. This is simple enough:

// Saving <tt>Keyword</tt> in database
Keyword kw = new Keyword();
kw.setType (Keyword.KEYWORD);
kw.setName ("test");

session.save (kw);

(Please ignore the session object for now.)

That was easy, wasn’t it? If you look at the log, you’ll see that Hibernate sent an INSERT statement to the DB. Cool. So … how do we use this new object? The first, most natural idea, would be to use the object we just saved:

// Saving <tt>Knowledge</tt> with a keyword in the database
Knowledge k = new Knowledge ();
k.addKeyword (kw);

session.save (k);

Unfortunately, this doesn’t work. It does work in your test but in the final application, the Keyword is created in the first transaction and the Knowledge in the second one. So Hibernate will (rightfully) complain that you can’t use that keyword anymore because someone else might have changed it.

Now, what? You have to ask Hibernate for a copy of every object after you closed the transaction in which you created it before you can use it anywhere else:

   1:
   2:
   3:
   4:
   5:
   6:
   7:
   8:
   9:
  10:
  11:
Keyword kw = new Keyword();
kw.setType (Keyword.KEYWORD);
kw.setName ("test");

session.save (kw);
kw = dao.loadById (kw.getId ());

Knowledge k = new Knowledge ();
k.addKeyword (kw);

session.save (k);

How to save Knowledge with a keyword in the database with transactions

Why do we have to load an object after just saving it? Well … because of Java. Java has very strict rules what you can do with (or to) an object instance after it has been created. One of them is that you can’t replace methods. So what, you’d think. In our case, things aren’t that simple. In our model, the name of a Knowledge instance is a Keyword. When you look at the code, you’ll see the standard setter. But when you run it, you’ll see that someone loads the item from the KEYWORD table. What is going on?

   1:
   2:
   3:
public void setName (Keyword name) {
    this.name = name;
}

setName() method

Behind the scenes, Hibernate replaces this method by using a proxy object, so it can notice when you change the model (setting a new name). The most simple soltuion would be to replace the method setName() in session.save() with calls the original setter and notifies Hibernate about the modification. In Python, that’s three lines of code. Unfortunately, this is impossible in Java.

So to get this proxy objects, you must show an object to Hibernate, let it make a copy (by calling save()) and then ask for the new copy which is in fact a wrapper object that behaves just like your original object but it also knows when to send commands to the database. Simple, eh?

Makes me wonder why session.save() doesn’t simply return the new object when it is more safe to use it from now on … especially when you have a model which is modified over several transactions. In this case, you can easily end up with a mix of native and proxy objects which will cause no end of headache.

Anyway. This approach has a few drawbacks:

  • If someone else creates the object, calls your code and then continues to do something with the original object (because people usually don’t expect methods to replace objects with copies when they call them), you’re in deep trouble. Usually, you can’t change that other code. You loose. Go away.
  • The proxy object is very similar but not the same as the original object. The biggest difference is that it has a different class. This means, in equals(), you can’t use this.getClass == other.getClass(). Instead, you have to use instanceof (the copy is derived from the original class). This breaks the contract of equals() which says that it must be symmetric.
  • If you have large, complex objects, copying them is expensive.
  • After a while, you will start to write factory methods that create the objects for you. The code is always the same: Create a simple object, save it, load it again and then return the copy. Apart from cut&paste, this means that you must not call new for some of your objects. Again, this breaks habits which leads to bugs.

All in all, the whole approach is clumsy. Really, it’s not Hibernate’s fault but the code is still ugly, hard to maintain (because it breaks the implicit rules we have become so used to). In Python, you just create the object and use it. The dynamic nature of Python allows the OR mapper to replace or wrap all the methods as it needs to and you never notice it. The code is clean, easy to understand and compact.

Another problem are the XML config files. Besides all the issues with Java XML parsers, it is always problematic to store the same information in two places. If you ever change your Java model, you better not forget to update the XML or you will get strange errors. You can’t refactor the model classes anymore because there is code outside the scope of your refactoring tool. And let’s not forget code completion which works pretty good for Java. Not so for XML files. If you’re lucky, someone has written a code completion for your type of XML config. Still, there will be problems. If there is a new version, your code completion will lag behind.

It’s like regexp: Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. — Jamie Zawinski

Fortunately, Sun solved this problem with JPA (or at least eased the pain). JPA allows to use annotations to store the mapping configuration in the class file itself. Apart from a few small problems (like setting up everything), this works pretty well. Code completion works perfectly because any IDE which has code completion will be able to use the latest and greatest version of your helper JARs without any customization. Just drop the new JAR in your classpath and you’re ready to do. Swell.

But there are more problems:

  • You must create a session object “somewhere” and hand it around. If you’re writing a webapp, this better be thread-safe. Not to mention you must be able to override this for tests.
  • The session object must track if you have already started a transaction and nest them properly or you will have to duplicate code because you can’t call existing methods if they use transactions.
  • Spring and AOP will help a lot but they also add another layer of complexity, you’ll have to learn another API, another set of rules how to organize your code, etc.
  • JAR file-size. My code is 246KB. The JARs it depends on take … 6’096KB, more than 40 times of my code. And I’m not even using Spring.
  • Even with JPA, Hibernate is not simple to use because Java itself is not simple to use.

In the end, the model was 5’400 LoC. A added a small UI to it using SWT/JFace which added 2’400 LoC.

If you look at the model in the previous installment, then the question is: Why do I need 5’000 LoC to write a program which implements an OR mapper for a model which has only three classes and 26 lines of code?

Granted, test cases and helper code take their toll. I could accept that this code needs four or five times the size of the model itself. Still, we have a gap.

The answer is that there are no or bad defaults. For our simple case, Hibernate could guess everything. Java could generate all the setters and getters, equals() and hashCode(). It’s no black magic to figure out that Relation has a reference to Knowledge so there needs to be a database table which stores this information. Sadly, defaults in Java are always “safe” rather than “clever”. This is the main difference to newer languages. They try to guess most of the stuff and then, you can fix those few exceptions that you always have. With Java, all the exceptions are handled but you have to do everyday stuff yourself.

The whole experience was frustrating, especially since I’m a seasoned Java developer. It took me almost two weeks to write the code for this small model mostly because because of a bug in Hibernate 3.1 and because I couldn’t get my mind around the existing documentation. Also, parent-child relations were poorly documented in the first Hibernate book. The second book explains this much better.

Conclusion: Use it if you must. Today, there are better ways.

Next stop: TurboGears, a Python web framework using SQL Objects.


What’s Wrong With Java, Part 1

11. July, 2007

As I promised at the Jazoon, I’m starting a series of posts here to present the reasons behind my thoughts on the future on Java. To do this, I’ll develop a small knowledge management application with the name Sensei, the Japanese word for “teacher”. The final goal is to have three versions in Java, Python an Groovy at the same (or at least a similar) level.

The application will sport the usual features: A database layer, a model and a user interface. Let’s start with the model which is fairly simple.

We have knowledge nodes that contain the knowledge (a short name and a longer description), keywords to mark knowledge nodes and relations to connect knowledge nodes. Additionally, we want to be able to organize knowledge nodes in a tree. Here is the UML:

To make it easier to search the model for names, I collect them in the keyword class. So a knowledge node has no String field with the name but a keyword field instead. The same applies to the relations. Here is what the Java model looks like:

   1:
   2:
   3:
   4:
   5:
   6:
   7:
   8:
   9:
  10:
  11:
  12:
  13:
  14:
  15:
  16:
  17:
  18:
  19:
  20:
  21:
  22:
  23:
  24:
  25:
  26:
class Keyword {
    public enum Type {
        KEYWORD,
        KNOWLEDGE,
        RELATION
    };

    static Set<Keyword> allKeywords;
    
    Type   type;
    String name;
}

class Knowledge {
    Keyword         name;
    String          knowledge;
    Knowledge       parent;
    List<Knowledge> children;
    Set<Keyword>    keywords;
}

class Relation {
    Keyword   name;
    Knowledge from;
    Knowledge to;
}

UML of Sensei Model

Note that I omitted all the usual cruft (private fields, getters/setters, equals/hashCode).

This model has some interesting properties:

  1. There is a recursive element. Many OR mappers show lots of example how to map a String to a field but for some reason, examples with tree-like structures are scarce.
  2. It contains 1:N mappings where N can be 0 (keywords can be attached to knowledge nodes but don’t have to be), 1:N mappings where N is always >= 1 (names of knowledge nodes and relations) and N:M mappings (relations between knowledge nodes).
  3. It contains a field called “from” which is an SQL keyword.
  4. There are sorted and unsorted mappings (children and keywords of a knowledge node).
  5. Instances of Keyword must be garbage collected somehow. When I delete the last knowledge node or relation with a certain name, I want the keyword deleted with it. This is not true for pure keywords, though.

To summarize, this is probably the smallest meaningful model you can come up with to test all possible features an OR mapper must support.

Let’s have a look at the features we want to support:

  1. Searching knowledge nodes by name or keyword or relation
  2. Searches should by case insensitive and allowing to search for substrings
  3. Adding child nodes to an existing knowledge node
  4. Reordering children
  5. Moving a child from one parent to another
  6. Adding and removing relations between nodes
  7. Adding and removing keywords to a knowledge node

In the next installment, we will have a look how Java and Hibernate can cope with these demands.