The future of data

RFid Data Table from a BBC exhibition (CC: by-nc-nd)

People love to share. They share emotions, affection, information, files and personal data. But they don’t want to share that with everyone. Imagine sharing your bank statements with the IRS. Or that you just bought a very expensive TV set with a burglar. Or that you’re not at home for the next four weeks. Or photos and films of your children with a pedophile.

While people don’t talk about their private life in a public form, they do post it in social networks. They don’t want anyone to have a look at the data on their harddisks but they backup the very same data with online backup services. The line between private and the web is blurring.

Unfortunately, data can’t protect itself, so as soon as you put something online, anyone can see it, copy it, give it to someone else or keep a copy even after you deleted it yourself. The Internet doesn’t forget.

So the obvious solution is that data must become active. It must check who has permission to access it and only reveal its details to people who you have permission. How would that work?

Let’s have a look at ssh. At work, I’m accessing a server and work with an account but I have no idea what the password for that account is. How do I login? With my own credentials. I give a public key to the system administrator and he adds me to the list of people who can login. If he doesn’t want me anymore, he deletes the key from the list and I loose access. He doesn’t know my password and I doesn’t know his.

To achieve the same with data, the data must be encrypted. To decrypt it, users must ask a server for the decrypt key and identify themselves with their public key.

Of course, there are a couple of issues with this approach:

  1. First of all, it will bloat the data and make the processing (much) slower. Well, that might be an issue today but soon, progress will solve that.
  2. Users could decrypt the data once and then keep a decrypted copy. While this is true, is it an issue? First of all, these people had once access, so it will only become an issue if we want to revoke the access. Also, if they don’t backup the data regularly, a hardware failure will solve the problem sooner or later.

    Lastly, we could attach a license to data which disallows to share the decrypted copy with anyone. If anyone did, they could be sued for the license infringement. And let us not forget that most people won’t understand how this all works, so they won’t be able to do it. Plus as long as it works and it comfortable enough to use, they won’t see a reason to do it.

    For those who do understand the technology or want to abuse it, no amount of protection will be enough to stop them. This is why we have laws and courts.

  3. People could loose their data or their password. Happens all the time. But wouldn’t this approach solve both these issues? If all data was encrypted and there were servers to distribute credentials, people would have to remember just a single password for all services. The password could be strong and it could be changed with ease. Web sites could add users based on their public keys (just like ssh or OpenID). And there would be no need to worry about losing data since you could back it up with an online backup service since the encryption would happen before it is backed up.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s