Apache Kafka is a publish-subscribe messaging system similar to JMS, but simpler. Messages are kept in log files on disk and are not deleted when they are consumed; they stay available for a configurable retention period, which can be unlimited. Because subscribers also have to tell the system which message they want next, you can recover from bugs that corrupted your data even if you notice them only after some time: just process the affected messages again.
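To make the replay idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Kafka client API: the names `MessageLog`, `Subscriber`, `poll` and `rewind` are made up for illustration. The only assumptions it encodes are the two points above: the log keeps messages after they are read, and the subscriber keeps track of its own position.

```python
# Toy in-memory model of an append-only log. Illustrative only:
# this is NOT the Kafka client API, and all names are hypothetical.
class MessageLog:
    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message and return its offset (its index in the log)."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_count=10):
        """Return up to max_count messages starting at offset. Nothing is removed."""
        return self._messages[offset:offset + max_count]


class Subscriber:
    def __init__(self, log, handler):
        self.log = log
        self.handler = handler
        self.offset = 0    # the subscriber, not the log, remembers its position
        self.results = []  # one result per processed message

    def poll(self):
        """Process the next batch and advance the offset past it."""
        batch = self.log.read(self.offset)
        for message in batch:
            self.results.append(self.handler(message))
        self.offset += len(batch)

    def rewind(self, offset):
        """Drop results from offset onward and reposition for reprocessing."""
        del self.results[offset:]
        self.offset = offset


log = MessageLog()
for amount in ["10", "20", "30"]:
    log.append(amount)

buggy = lambda m: int(m) * 100  # bug: wrong scale factor
fixed = lambda m: int(m) * 10   # corrected handler

sub = Subscriber(log, buggy)
sub.poll()
corrupted = list(sub.results)   # output produced by the buggy handler

# The bug is noticed later: swap in the fix, rewind, and process again.
sub.handler = fixed
sub.rewind(0)
sub.poll()
repaired = list(sub.results)
```

The recovery works only because reading does not consume anything: after the rewind, the same three messages are still in the log.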
Storm is a “distributed realtime computation system.” It makes it easy to define topologies (= graphs) of spouts (= data sources) and bolts (= places where a computation takes place), so your real-time data flows through a complex network that processes, filters and aggregates it. Just like Akka, it defines all kinds of operations (filters, switches, routers, …) so you can easily and quickly build the topology you need. Trident makes this set-up step even simpler.
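The shape of such a topology can be sketched with chained Python generators. This is a single-process toy, not Storm's API: there is no distribution or fault tolerance here, and the function names are hypothetical. Each function stands in for one node of the graph, with the first one playing the role of the data source.

```python
from collections import Counter

# Toy stand-in for a topology. Illustrative only: NOT the Storm API,
# and every function name below is hypothetical.

def sentence_source():
    """Data source: emits raw sentences into the topology."""
    yield from ["storm processes streams", "kafka stores streams", ""]

def drop_empty(stream):
    """Filter step: lets only non-blank sentences through."""
    for sentence in stream:
        if sentence.strip():
            yield sentence

def split_words(stream):
    """Processing step: turns each sentence into individual words."""
    for sentence in stream:
        yield from sentence.split()

def count_words(stream):
    """Aggregation step: counts how often each word was seen."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
    return counts

# Wire the steps together: source -> filter -> split -> aggregate.
word_counts = count_words(split_words(drop_empty(sentence_source())))
```

In a real topology each of these steps could run in parallel on several machines; the sketch only shows how data flows from one step to the next.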
Unlike Hadoop, which is designed for batch processing, Storm is meant for real-time processing. Some projects combine the two.
If you need a good framework for serializing data in Java, have a look at Apache Avro.