There is no Spoon: Changing Final Fields in Java

21. January, 2013

If you’re the guy in the team who solves the impossible problems, you will eventually run into the worst of all design patterns: singletons, which are implemented in Java using final (static) fields.

Sebastian Zarnekow came up with a way to change that. Following the timeless advice from the Matrix (“there is no spoon”), he found a way to modify (some) final fields.

As I said before, this is a desperate measure, so use it wisely. But remember this tool the next time you need to mock a singleton for a test case.
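For illustration, here is a minimal sketch of the usual reflection trick (not necessarily Sebastian’s exact approach): clear the FINAL bit on the Field object before writing to it. It works on the JVMs of that era; note that static final compile-time constants (primitives and Strings) may already be inlined by the compiler, so changing the field won’t affect code compiled against the constant.

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class FinalFieldHack {

    /** Overwrites a static final field via reflection. Last resort only, e.g. for tests. */
    public static void setStaticFinal(Class<?> owner, String fieldName, Object newValue)
            throws Exception {
        Field field = owner.getDeclaredField(fieldName);
        field.setAccessible(true);

        // Remove the FINAL flag so that Field.set() no longer refuses to write.
        Field modifiersField = Field.class.getDeclaredField("modifiers");
        modifiersField.setAccessible(true);
        modifiersField.setInt(field, field.getModifiers() & ~Modifier.FINAL);

        field.set(null, newValue);   // target is null because the field is static
    }
}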


Java Toolbox

9. November, 2012

An article was posted with some tools that you should know about when developing Java code: “A Software Craftsman’s Toolbox: Lightweight Java libraries that make life easier”.

Along the same lines, Jeeeyul came up with an idea to make

System.out.println( "Hello World." );

produce this output:

(MyHelloWorld.java:10) : Hello World.

It takes just 34 lines of code: “Make System.out.println() Rocks!”
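Jeeeyul’s post has the complete code; the rough idea is a sketch like this (simplified and hypothetical, not his exact 34 lines): wrap System.out in a PrintStream that looks up the caller on the stack and prepends its file name and line number.

import java.io.PrintStream;

public class LinkingPrintStream {

    public static void install() {
        final PrintStream original = System.out;
        System.setOut(new PrintStream(original, true) {
            @Override
            public void println(String text) {
                // [0] = getStackTrace(), [1] = this println(), [2] = the caller of System.out.println()
                StackTraceElement caller = Thread.currentThread().getStackTrace()[2];
                original.println("(" + caller.getFileName() + ":" + caller.getLineNumber()
                        + ") : " + text);
            }
        });
    }
}

The “(File.java:NN)” prefix is the same format used in stack traces, which is why the Eclipse console turns it into a clickable link.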


Enums With More Than One Name

8. October, 2012

In Java, you sometimes encounter places where you need an enum with more than one name (or key). Here is the pattern that I use:

import java.util.HashMap;
import java.util.Map;

enum X {
    // "A" is the name of the enum, "a" is the second name/key.
    A("a"), B("b");

    private final static Map<String,X> MAP = new HashMap<String,X>();
    static {
        for( X elem: X.values() ) {
            if( null != MAP.put( elem.getValue(), elem ) ) {
                throw new IllegalArgumentException( "Duplicate value " + elem.getValue() );
            }
        }
    }

    private final String value;

    private X(String value) { this.value = value; }
    public String getValue() { return value; }

    // You may want to throw an error here if the map doesn't contain the key
    public static X byValue( String value ) { return MAP.get( value ); } 
}

Things to note:

  1. There are additional parameters in () after the enum name.
  2. You need a custom constructor which accepts the additional parameters. Like other Java classes, you can have as many constructors as you need.
  3. I’m filling the static map from a static block inside of the enum declaration. Looks odd but works. Java will first create all instances and then invoke the static code in my custom enum.
  4. You can look up enum values by using the static method byValue(). The name is not very good (it’s easy to confuse with the enum’s built-in valueOf()). When the field is called code, I use byCode(). So in real life, it will be less confusing. See the short example below.
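For completeness, a short usage example of the lookup (using the enum X from above):

X a = X.byValue("a");        // returns X.A
X unknown = X.byValue("z");  // returns null with the implementation above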

Xtend for Java Developers

2. October, 2012

There are a couple of common pitfalls when a Java developer starts using Xtend.

Java             Xtend              Description
String.class     typeof(String)     Get the class instance of a type
Long.MAX_VALUE   Long::MAX_VALUE    Accessing static fields
Foo.Bar          Foo$Bar            Accessing inner classes

Example: org.slf4j logging

private Logger log = LoggerFactory.getLogger(Foo.class)    // Java

        Logger log = LoggerFactory::getLogger(typeof(Foo)) // Xtend

Also, the .. or upTo operator has a severe bug. The loop for(i: 0..list.size) won’t work as expected: for a start, it iterates once too often, because both bounds are inclusive.

The obvious fix, for(i: 0..(list.size-1)), doesn’t work either: when the list is empty, it iterates twice (over 0 and -1), and when the list has a single element, it doesn’t iterate at all.

Use .. only with constant operands (i.e. 1..5 is OK, list.size..0 isn’t). If you need to iterate over a half-open range [start, end), use this gist instead.


Excellent Explanation of PermGen Issues

27. July, 2012

If you develop web apps, you have encountered java.lang.OutOfMemoryError: PermGen space.

Nikita Salnikov-Tarnovski wrote an excellent article explaining where these errors come from and how to solve them: Busting PermGen Myths


Jazoon 2012: CQRS – Trauma treatment for architects

4. July, 2012

A few years ago, concurrency and scalability were hype. Today, they are a must. But how do you write applications that scale painlessly?

Command and Query Responsibility Segregation (CQRS) is an architectural pattern to address these problems. In his talk, Allard Buijze gave a good introduction. First, some of the problems of the standard approach. Your database, everyone says, must be normalized.

That can lead to a couple of problems:

  • Historic data changes
  • The data model is neither optimized for writes nor for queries

The first problem can result in a scenario like this. Imagine you have a report that tells you the annual turnover. You run the report for 2009 in January, 2010. You run the same report again in 2011 and 2012 and each time, the annual turnover of 2009 gets bigger. What is going on?

The data model is in third normal form. This is great, no data duplication. It’s not so great when data can change over time. So if your invoices point to the products and the products point to the prices, any change of a price will also change all the existing invoices. Or when customers move, all the addresses on the invoices change. There is no way to tell where you sent something.

The solution is to add “valid time range” to each price, address, …, which makes your SQL hideous and helps to keep your bug tracker filled.

It will also make your queries slow since you will need lots and lots of joins. These joins will eventually get in conflict with your updates. Deadlocks occur.

On the architectural side, some problems will be much easier to solve if you ignore the layer boundaries. You will end up with business logic in the persistence layer.

Don’t get me wrong. All these problems can be solved but the question here is: Is this amount of pain really necessary?

CQRS to the rescue. The basic idea is to use two domain models instead of one. Sounds like more work? That depends.

With CQRS, you will have more code to maintain, but the code will be much simpler. There will be more tables, and data will be duplicated in the database, but there will never be deadlocks, and queries won’t need joins in the usual case (you could get rid of all joins if you wanted). So you trade bugs for code.

How does it work? Split your application into two main parts. One part takes user input and turns that into events which are published. Listeners will then process the events.

Some listeners will write the events into the database. If you need to, you will be able to replay these later. Imagine your customer calls you because of some bug. Instead of asking your customer to explain what happened, you go to the database, copy the events into a test system and replay them. It might take a few minutes but eventually, you will have a system which is in the exact same state as when the bug happened.
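To make that concrete, here is a minimal, framework-free sketch (all names are hypothetical, nothing from the talk): an event bus, plus one listener that appends every event to a store so that the history can be replayed into a test system later.

import java.util.ArrayList;
import java.util.List;

interface Event {}

interface EventListener {
    void onEvent(Event event);
}

class EventBus {
    private final List<EventListener> listeners = new ArrayList<EventListener>();

    void subscribe(EventListener listener) { listeners.add(listener); }

    void publish(Event event) {
        for (EventListener listener : listeners) {
            listener.onEvent(event);
        }
    }
}

class EventStore implements EventListener {
    private final List<Event> log = new ArrayList<Event>();   // stand-in for a database table

    public void onEvent(Event event) { log.add(event); }

    // Replaying pushes the stored history through any other listener,
    // e.g. to rebuild a test system in exactly the state the customer saw.
    void replay(EventListener target) {
        for (Event event : log) {
            target.onEvent(event);
        }
    }
}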

Some other listeners will process the events and generate more events (which will also be written to the database). Imagine the event “checkout”. It will contain the current content of the shopping cart. You write that into the database. You need to know what was in the shopping basket? Look for this event.

The trick here is that the event is “independent”: it doesn’t contain foreign keys, only immutable value objects. The value objects are written into a new table. That makes sure that when you come back 10 years later, you will see the exact same shopping cart as the customer saw when she ordered.

When you need to display the shopping cart, you won’t need to join 8 tables. Instead, you query one or two tables by the ID of the shopping cart: one table holds the header with the customer address, the order number, the date and the total; the second table contains the items. If you wanted, you could add foreign keys to the product definition tables, but you don’t have to. If that’s enough for you, those two tables can be completely independent of any other table in your database.

The code to fill the database gets the event as input (no database access to read anything from anywhere) and it will only write to those two tables. Minimum amount of dependencies.
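Sticking with the hypothetical Event/EventListener types from the sketch above, the “checkout” example could look like this: the event carries copies of the data as immutable value objects (no foreign keys), and the projection writes nothing but the two read-model tables.

import java.util.Collections;
import java.util.List;

final class CartItem {
    final String productName;   // a copied value, not a reference into the product table
    final int quantity;
    final long priceInCents;    // the price at checkout time, frozen forever

    CartItem(String productName, int quantity, long priceInCents) {
        this.productName = productName;
        this.quantity = quantity;
        this.priceInCents = priceInCents;
    }
}

final class CheckoutEvent implements Event {
    final String orderId;
    final String customerAddress;   // again a copy, so later moves don't rewrite history
    final List<CartItem> items;

    CheckoutEvent(String orderId, String customerAddress, List<CartItem> items) {
        this.orderId = orderId;
        this.customerAddress = customerAddress;
        this.items = Collections.unmodifiableList(items);
    }
}

class CartProjection implements EventListener {
    public void onEvent(Event event) {
        if (!(event instanceof CheckoutEvent)) {
            return;
        }
        CheckoutEvent checkout = (CheckoutEvent) event;
        // One header row, one row per item; no reads, no joins, no other tables involved.
        insertCartHeader(checkout.orderId, checkout.customerAddress);
        for (CartItem item : checkout.items) {
            insertCartItem(checkout.orderId, item.productName, item.quantity, item.priceInCents);
        }
    }

    private void insertCartHeader(String orderId, String address) { /* JDBC insert, omitted */ }

    private void insertCartItem(String orderId, String name, int quantity, long priceInCents) { /* JDBC insert, omitted */ }
}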

The code to display the cart will only need to read those two tables. No deadlocks possible.

The code will be incredibly simple.

If you make a mistake somewhere, you can always replay all the events with the fixed code.

For tests, you can replay the events. No need for a human to click buttons in a web browser (not more than once, anyway).

Since you don’t need foreign keys unless you want to, you can spread the data model over different databases, computers, data centers. Some data would be better in a NoSQL repository? No problem.

Something crashes? Fix the problem, replay the events which got lost.

Instead of developing one huge monster model where each change possibly dirties some existing feature, you can imagine CQRS as developing thousands of mini-applications that work together.

And the best feature: It allows you to retroactively add features. Imagine you want to give users credits for some action. The idea is born one year after the action was added. In a traditional application, it will be hard to assign credit to the existing users. With CQRS, you simply implement the feature, set up the listeners, disable the listeners which already ran (so the action isn’t executed again) and replay the events. Presto, all the existing users will have their credit.



Jazoon 2012: Spring Data JPA – Repositories done right

4. July, 2012

Oliver Gierke presented “Spring Data JPA – Repositories done right” at Jazoon. The motto of Spring Data could be “deleted code doesn’t contain bugs.” From the web site:

Spring Data makes it easier to build Spring-powered applications that use new data access technologies such as non-relational databases, map-reduce frameworks, and cloud based data services as well as provide improved support for relational database technologies.

Spring Data is an umbrella open source project which contains many subprojects that are specific to a given database. The projects are developed by working together with many of the companies and developers that are behind these exciting technologies.

When you use any form of JPA, you will eventually end up with DAOs which contain many boring methods: getById(), getByName(), getByWhatever(), save(), delete(). How do you like this implementation:

import java.io.Serializable;

import org.springframework.data.repository.Repository;

interface MyBaseRepository<T, ID extends Serializable> extends Repository<T, ID> {
  T findOne(ID id);
  T save(T entity);
}

// User's ID type is assumed to be Long here.
interface UserRepository extends MyBaseRepository<User, Long> {
  User findByEmailAddress(EmailAddress emailAddress);
}

“Wait a minute,” I can hear you think, “these are just interfaces. Where is the implementation?”

That is the implementation. You can now inject those interfaces as DAOs and call the methods. Behind the scenes, Spring will generate a proxy for you that actually implements the methods. 0 lines of code for you to write for 95% of the basic DAO methods.
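For example, a service (hypothetical names, using the entities from above) can simply have the repository injected and call it; the generated proxy behind the interface does the rest:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
class UserService {

    @Autowired
    private UserRepository userRepository;   // Spring injects the generated proxy

    public User findRegisteredUser(EmailAddress emailAddress) {
        return userRepository.findByEmailAddress(emailAddress);
    }
}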

The queries can even be more complex:

List<User> findByEmailAddressAndLastname(EmailAddress emailAddress, String lastname);

Spring derives a query from the method name that searches by those two columns. See the documentation for more examples of how to write queries that use joins.
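For illustration (assuming the User entity has an Address association with a city property, which is my assumption, not part of the talk), nested properties in the method name are enough to make Spring Data generate the join:

import java.util.List;

interface UserRepository extends MyBaseRepository<User, Long> {
  User findByEmailAddress(EmailAddress emailAddress);

  // Traverses the nested property user.address.city; Spring Data derives the join.
  List<User> findByAddressCity(String city);
}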

On top of that, they built a REST exporter which exposes your DAO interfaces with a REST API to a web browser plus a web front end to explore the repository, to run the queries and to create new objects. Impressive.


Jazoon 2012: Divide&Conquer: Efficient Java for Multicore World

29. June, 2012

Not much new in the talk “Divide&Conquer: Efficient Java for Multicore World” by Sunil Mundluri and Velmurugan Periasamy.

Amdahl’s law shows that you can’t get an arbitrary speed-up when running part of your code in parallel. In practice, you can expect serial code to execute 2-4 times faster if you run it with, say, the fork/join framework of Java 7. This is due to setup + join cost and the fact that the tasks themselves don’t get faster – you just execute more of them at the same time. So if a task takes 10 seconds and you can run all of them in parallel, the total execution time will be a bit over 10s.

If you want to use fork/join with Java 6, you can add the jsr166y.jar to your classpath.
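As a refresher, here is a minimal fork/join sketch (Java 7’s java.util.concurrent; with Java 6 and jsr166y.jar, the imports come from the jsr166y package instead):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sum a long[] by splitting the work until the chunks are small enough to do serially.
public class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10000;
    private final long[] data;
    private final int from;
    private final int to;

    ParallelSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {            // small enough: sum serially
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) / 2;
        ParallelSum left = new ParallelSum(data, from, mid);
        ParallelSum right = new ParallelSum(data, mid, to);
        left.fork();                              // schedule the left half asynchronously
        return right.compute() + left.join();     // compute the right half here, then wait for the left
    }

    public static void main(String[] args) {
        long[] data = new long[1000000];
        java.util.Arrays.fill(data, 1L);
        long total = new ForkJoinPool().invoke(new ParallelSum(data, 0, data.length));
        System.out.println(total);                // prints 1000000
    }
}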

Again, functional programming makes everything simpler. With Java 8 and lambda expressions, syntactic sugar will make things even more readable, but at a price.

You might want to check out one of the newer JVM languages like Xtend, Scala or Groovy to get these features today, on Java 6.


Jazoon 2012: Serialization: Tips, Traps, and Techniques

29. June, 2012

Every once in a while, you learn something new even though you thought you knew it all. That’s what happened to me during the talk “Serialization: Tips, Traps, and Techniques” by Ian Partridge.

Serialization is such a basic, old technique that it’s surprising that you can learn something new about it. In a nutshell, serialization converts between object graphs and byte streams.

Unfortunately, the API is one of the oldest in the Java runtime. And it’s not one of the best. On the other hand, it’s used in many places like RMI, EJB, SDO, JPA and distributed caching.

What did I learn? Let’s see.

Did you know that it’s possible to serialize an object (without error) that you can’t read back in? It’s actually pretty simple to do: extend a non-serializable class that doesn’t provide a default constructor (you know, the one without any arguments), because deserialization has to call exactly that constructor on the closest non-serializable superclass.
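A minimal (hypothetical) demonstration: writing a Child succeeds, but reading it back throws java.io.InvalidClassException (“no valid constructor”).

import java.io.Serializable;

// Base is not serializable and has no no-arg constructor ...
class Base {
    Base(int value) { }
}

// ... so Child can be serialized without error, but deserialization fails,
// because ObjectInputStream cannot call a no-arg constructor on Base.
class Child extends Base implements Serializable {
    private static final long serialVersionUID = 1L;

    Child() {
        super(42);
    }
}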

You also shouldn’t try to serialize non-static inner classes because they keep a hidden reference to the outer instance.

When you use serialization, you must take into account that the serialized form becomes part of your public API. This means that private and even final fields are suddenly part of the API that you need to document. Why? Because ObjectInputStream creates the instance without running your constructor (only the no-arg constructor of the closest non-serializable superclass is called) and then sets the fields, final ones included, using Unsafe.putObject().

If you check the parameters in your constructor, then you will have to repeat those checks in readObject(). On top of that, you can’t trust the instances you get from the serialization API: an attacker can manipulate the byte stream to get references to internal data structures which you most certainly don’t want to expose.

The readUnshared()/writeUnshared() methods added in Java 1.4 were supposed to solve this, but they don’t work. Forget about them.

Anything else? Oh, yes: The serialVersionUID. Besides all the known problems, there is another one:

import java.io.Serializable;

public class Outer {
    private static int COUNTER = 0;

    public static class Version1 implements Serializable {
        void foo() {
            COUNTER = COUNTER + 1;
        }
    }
}

If someone fixes this code to

import java.io.Serializable;

public class Outer {
    private static int COUNTER = 0;

    public static class Version1 implements Serializable {
        void foo() {
            COUNTER += 1;
        }
    }
}

then deserialization fails on some versions of Java, because the hidden accessor methods which the compiler generates for the private field change, and with them the default serialVersionUID.
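A common safeguard (standard practice, not something specific to the talk) is to pin the UID explicitly, so that compiler-generated members can no longer change it:

public static class Version1 implements Serializable {
    // With an explicit UID, the default hash over the (synthetic) members is never used.
    private static final long serialVersionUID = 1L;

    void foo() {
        COUNTER += 1;
    }
}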

The Serializable Proxy Pattern solves many of the problems.

If you use proxies, consider using Externalizable instead of Serializable.
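For reference, here is a hedged sketch of the serialization proxy pattern (class names are hypothetical, not from the talk): only a small proxy travels over the wire, and reading it back goes through the real constructor, so all parameter checks run again.

import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;

public final class Money implements Serializable {
    private final String currency;
    private final long amountInCents;

    public Money(String currency, long amountInCents) {
        if (currency == null) {
            throw new IllegalArgumentException("currency must not be null");
        }
        this.currency = currency;
        this.amountInCents = amountInCents;
    }

    // Serialize the proxy instead of this instance.
    private Object writeReplace() {
        return new SerializationProxy(this);
    }

    // Defend against an attacker who sends a hand-crafted Money byte stream.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Use the serialization proxy");
    }

    private static final class SerializationProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String currency;
        private final long amountInCents;

        SerializationProxy(Money money) {
            this.currency = money.currency;
            this.amountInCents = money.amountInCents;
        }

        // On deserialization, go through the real constructor again.
        private Object readResolve() {
            return new Money(currency, amountInCents);
        }
    }
}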


Jazoon 2012: Akka 2.0 – Scaling up and out with Actors

29. June, 2012

Concurrency is too hard but we need it. In his talk “Akka 2.0 – Scaling up and out with Actors,” Viktor Johan Klang showed new features of Akka 2.0.

The framework now uses Future to create pipes between actors and Promise to write data to, say, a stream (docs).

To make error handling simpler, there is now “parental supervision.”

Decoupling actors becomes even simpler with the Event Bus API.

There is support for ZeroMQ to create grids/meshes of actors (docs).

But every framework has its limitations. If you hit one of those, it’s usually either “Use the Source, Luke” or “You’re out of luck”. Akka 2.0 comes with a new extensions mechanism to hook into the framework.

