Jazoon 2011, Day 2 – NoSQL – Schemaless Data-stores Not Only for the Cloud – Thomas Schank

26. June, 2011

NoSQL – Schemaless Data-stores Not Only for the Cloud – Thomas Schank

Thomas gave an overview of some NoSQL databases and the theoretical background of it.

The main points are: SQL databases get inefficient as the data grows and if you need to split the data between instances (how do you join tables between two DB servers? Even if you can, performance is going to suffer).

But there are new problems: Data can be inconsistent for a while (keyword: MVCC).

OTOH, these databases don’t need locks and, as Amazon demonstrated, any kind of lock will eventually become a bottleneck:

Each node in a system should be able to make decisions purely based on local state. If you need to do something under high load with failures occurring and you need to reach agreement, you’re lost. If you’re concerned about scalability, any algorithm that forces you to run agreement will eventually become your bottleneck. Take that as a given.

Werner Vogels, Amazon CTO and Vice President

And since the servers can heal inconsistencies, broken connections don’t mean the end of the world. The acronym of the hour is BASE: Basically Available, Soft-state, Eventually consistent.

Interesting stuff, especially since every company owns a super-computer today. My own team has one with 64 cores, 64GB of RAM and 16TB of disk space sitting in 8 boxes spread under the desks of the developers. Not much but it only costs $8 000. And if we need more, we can simply buy another node. Unfortunately, it’s hard to leverage this power. I’ll come back to that in a later post.

One thing to take with you: If you can’t stand JavaScript but you need to write it, have a look at CoffeeScript.