I’ve just released DecentXML 1.4, my very own XML parser implementation.
See the change log for changes.
It will show up on Maven Central in a couple of days.
Remember: i4i (is that “eye for an eye”?) owns patent 5,787,449: “A system and method for the separate manipulation of the architecture and content of a document, particularly for data representation and transformations.”
While the first sentence screams XML, it’s actually about a way to save additional data along with an XML document. Microsoft Word allows you to include any other file in the document, hence they violate the patent. Here is a good analysis.
This doesn’t mean anyone using XML is now prone to a lawsuit by i4i, but it’s still bad news. Why?
The patent was granted in 1998. In the very same year, the XML 1.0 standard was created (see here). This is just one example, but patents are obviously filed when the world starts to explore the very same field. We haven’t seen patents for combustion engines in 1603. And no patent office is going to accept patents for intergalactic FTL drives today.
Patents are filed to protect the investments of big companies. The pharmaceutical industry has to spend many millions of dollars to create a new medicine. Everything else has already been invented, so only the complex == expensive stuff is left. On this scale, it makes sense to generate billions in revenue since that’s only about one to ten thousand times what you invested. And you make that over many years.
IT is different. While the idea to store additional information along with a document might have been novel in 1998, it’s completely obvious today. The investment of i4i was probably on the scale of a few thousand dollars. Now, they made $290 million just by suing Microsoft.
My gut feeling is that they abuse the system. Pharmaceutical companies take great risks; i4i didn’t. i4i doesn’t sue everyone; they sue the big money. It’s perfectly legal. But is it right?
Here in Germany, we have the term “Rechtsfrieden” which means “peace of law.” People believe and follow the law because it appears to be just. Violating the peace of law means that someone uses perfectly legal ways to harass someone. Think of a lawyer who got dumped by his girlfriend and now uses all the tiny transgressions we all commit to turn her life into hell. She parks where she shouldn’t, he sends a photo to the police. She drives a bit too fast, another fine. Talking on her mobile at the wheel. Telling people that she is a serial offender but giving no details, lest he get into trouble. This behavior gives other people the impression that the law can easily be used against them. The trust that the law needs to be effective is undermined.
From my point of view, patent trolls violate the peace of law. They invest little and try to milk society. The damage is much bigger than the $290 million fine. M$ had to withdraw an entire production of Office products, they had to pay a fortune in lawyer fees, and now every software company using a similar technology is under even more stress than before: i4i just got the money to drive anyone out of business. Because today, almost every software company uses technology like that. It’s so obvious today that no one would even think that there might be a patent for it.
And that’s the fundamental problem with software patents: They don’t make sense on any level.
Other industries have to invest millions of dollars in equipment and thousands of people (in the field, lab workers, people building lab equipment, test subjects) and procedures (clinical or other tests, legal reviews, patent research) to develop new products. Actually producing those products is expensive: You need workers, factories, raw material. And then, you haven’t sold a single unit. So you need transportation, packaging, hygiene environments, storage, advertising, sales points, etc.
To bring a new medicine to market, you need one billion dollars today. That is a huge risk. While I don’t like patents, I can understand that you want all the protection you can get in this case.
Software patents are dirt cheap by comparison. Usually, it takes just one person to have the idea. You need equipment that costs a couple of thousand dollars. Even in 1998, computers usually cost less than $10,000. Developing the idea into a real patent is in the same range. You don’t need expensive equipment for that, just determination and a good patent lawyer.
Basically, there is no risk in developing a software patent. If the patent is found void, you also don’t lose much. It doesn’t mean your investment is lost. It doesn’t mean your multi-million dollar factory is ripe for an unexpected amortization. It doesn’t bankrupt you.
On the other hand, a software patent is a great tool to harm society, 100% legal. That $290 million isn’t coming out of the pockets of Microsoft, it’s ultimately coming out of the pockets of their customers. The fine doesn’t benefit society, it goes to the owners of i4i. And rich people don’t share.
The judges in the M$ vs. i4i case argued that the government should set the rules. Which sounds good. But apparently, the members of parliament also don’t understand that we have two completely different sets of problems. When biochemical companies argue pro patents, they ignore the fact that one size only fits all when everyone is the same size.
Conclusion: In my opinion, i4i legally “swiped” $290 million from society. Which is a perfect argument to treat software patents completely differently from normal patents.
In this issue of “The Next Big Thing”, I’ll talk about something that every piece of software uses and which is always developed again from scratch for every application: Persistence.
Every application needs to load data from “somewhere” (user preferences, config settings, data to process) and after processing the data, it needs to save the results. Persistence is the most important feature of any software. Without it, the code would be useless.
Oddly, the most important area of the software isn’t a shiny skyscraper but a swamp: Muddy, boggy, suffocating.
Therefore, the next big thing in software development must make loading and saving data a bliss. Some features it needs to have:
When developing software, one pain is always the same: Getting valuable input from the customer. One solution is to use a tool that your customer probably understands: Excel.
“Oh, great, CSV files,” you think. No. I mean Excel work sheets, with colors and formatting and formulas. Let me explain two approaches to save you a lot of time and effort, pain and tears.
Apache POI is the Swiss Army Knife for Java developers who have to deal with Excel. It basically allows you to access the content of an Excel document in a humane way: By book, sheet and then indexed by row and column. It doesn’t matter if the file is in the old binary format (*.xls) or the new XML one (*.xlsx).
Especially the XML format is suited perfectly to allow your customers to send you master data. Just write a small tool that reads the Excel file directly and dumps it into your database. No CSV conversion! You can use colors to mark rows/columns to ignore. You can define styles to ignore (like “Section” or “Heading”). That allows your customer to keep the table readable.
But the data in the spreadsheets is only half of the value. The guys at Abacus Research AG created a new project: Abacus Formula Compiler (AFC). This gem gives you access to the formulas in the cells. POI does that, too, but you will just get a string. AFC will convert the formulas into Java code. That means your customer can not only send you master data but also how to process it.
They can send you a spreadsheet with an example of what they are talking about. They can use the Excel sheets they have been using to drive their business for the last five years. If something changes, they can send you an updated version, you run AFC on that and get a new set of classes which replace the old ones and which do exactly what your customer wants.
The W3C has rolled all their validators (HTML, XML, CSS) into one: Unicorn.
If you ever tried to enable logging for OSGi (Equinox) because starting the BIRT engine fails for mysterious reasons, you will have noticed that BIRT removes all osgi.* options from the System.properties before it launches (see ). Instead, it expects these options in config.ini (which must be in the current folder):
    # Specify the file with the debug options. See the .options file in the org.eclipse.osgi*.jar for examples
    osgi.debug=/path/to/file/with/debug.options
    # Change the classloader. Possible values are: "app", "fwk", "boot" (default)
    # app: Use the current SystemClassLoader
    # boot: Use the boot classloader
    # fwk: Use the classloader which was used to load OSGi.
    #osgi.parentClassloader=fwk
Use fwk if you see errors because of missing XML parser classes. The Java runtime has a private static field which contains the XML parser factory. If you touch any XML code before you start OSGi, that field will be set and OSGi will be forced to use this XML parser, but the default boot classloader can’t see the parser. Bummer.
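If you would rather create config.ini from code right before launching the engine, something like the following might do. It is only a sketch: it assumes config.ini is parsed as a regular Java properties file, and OsgiConfigWriter, render and write are names I made up for this example. They are not part of BIRT or Equinox.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of BIRT or Equinox.
final class OsgiConfigWriter {

    // Builds the content of config.ini in Java properties format.
    // Pass null as parentClassloader to keep the default ("boot").
    static byte[] render(String debugOptionsFile, String parentClassloader) {
        Properties config = new Properties();
        config.setProperty("osgi.debug", debugOptionsFile);
        if (parentClassloader != null) {
            config.setProperty("osgi.parentClassloader", parentClassloader);
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            // store() writes ISO-8859-1 and escapes everything else.
            config.store(buffer, "OSGi options for the BIRT engine");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Writes config.ini into the given folder, which should be the
    // current folder of the process that starts the engine.
    static Path write(Path folder, String debugOptionsFile, String parentClassloader) {
        try {
            return Files.write(folder.resolve("config.ini"),
                    render(debugOptionsFile, parentClassloader));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling OsgiConfigWriter.write(Paths.get("."), "/path/to/file/with/debug.options", "fwk") before the engine starts should leave a config.ini with both options in the current folder.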
I’ve updated my XML parser. The tests now cover 97.7% of the code (well, actually 100% of the code which can be executed; there are a couple of exceptions which will never be thrown but I still have to handle them) and there are classes to read XML from InputStream and Reader sources (including encoding detection).
The XMLInputStreamReader class can be used standalone, if you ever want to read an XML file with the correct encoding.
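To give an idea of what “encoding detection” means here, this is a minimal sketch of the general technique: check for a byte order mark, then look at the encoding declared in the XML declaration, then fall back to UTF-8. It is not the code of XMLInputStreamReader, which may well work differently. XmlEncodingSniffer is a name I made up, and the sketch ignores UTF-32 and UTF-16 without a byte order mark.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the idea; not part of DecentXML.
final class XmlEncodingSniffer {

    // Matches the encoding pseudo-attribute of an XML declaration
    // at the very start of the document.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // head: the first bytes of the document (a few hundred are plenty).
    static Charset detect(byte[] head) {
        // 1. A byte order mark wins.
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(head, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(head, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;

        // 2. Without a BOM, assume an ASCII-compatible encoding, so the
        //    declaration itself can be read byte by byte.
        int length = Math.min(head.length, 200);
        String ascii = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = ENCODING.matcher(ascii);
        if (matcher.find()) {
            // Throws an unchecked exception for names the JVM doesn't know.
            return Charset.forName(matcher.group(1));
        }

        // 3. No BOM and no declaration: XML defaults to UTF-8.
        return StandardCharsets.UTF_8;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

To use it, you would read the first bytes from a stream that supports mark/reset (a BufferedInputStream, for example), reset the stream and wrap it in an InputStreamReader with the detected Charset.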