Takedown Instead of Touchdown

7. August, 2012

Scripps News Service (one company of the E. W. Scripps Group; the same group owns United Media, which publishes the famous Dilbert comic strip) sent YouTube a DMCA takedown notice claiming to own the copyright to one of the videos on NASA’s YouTube channel (again).

It’s like Einstein said: “Only two things are infinite, the universe and human stupidity, and I’m not sure about the former.”

But now we have computers to bring dumbness to whole new levels.


Symphony of Science

31. July, 2012

When I reach to the edge of the universe
I do so knowing that along some paths of cosmic discovery
There are times when, at least for now,
One must be content to love the questions themselves

— Neil deGrasse Tyson

Symphony of Science is a YouTube channel where they mix awe-inspiring images with vocalized texts. It’s a bit hard to explain but easy to understand. Watch this:


Jazoon 2012: IBM Watson since Jeopardy!

29. June, 2012

From the summary:

In February 2011, IBM demonstrated its latest Research breakthroughs in natural language processing and deep question answering. Named Watson, it made history when it was entered into the famously complex US television quiz-show ‘Jeopardy!‘ where it comfortably beat two of the greatest human players ever to appear on the show. Since then, work has focused on bringing these breakthroughs to real-world problems.

If you haven’t seen the video, now is a good time: Episode 1, Episode 2, Episode 3

Before the show, Watson was trained with data from a variety of sources, including Wikipedia and dbpedia. The software is able to process both unstructured and structured data and learn from it. That means it converts the data into an internal representation that its various answer-finding modules can then use. These modules include classic AI inferencing algorithms as well as Lucene-based full-text search modules.

This is basically what makes Watson different: Instead of relying on a single, one-size-fits-all strategy, Watson uses many different strategies and each of them returns a “result” where a result consists of an answer and a “confidence” that this answer might be useful or correct.

Instead of mapping all the confidence values to a predefined range, each module can return any number as its confidence. Some modules return values between 0 and 1, others between -1 and 1, and yet others anything from -∞ to +∞ (both included). The trick is that Watson uses an extensive training session to learn how to weigh the outputs of the different modules. For this, it needs the correct answers to a large set of questions.
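To make the idea of learned weights more concrete, here is a minimal sketch in Python. It is not IBM's implementation: the logistic-regression combiner, the tanh squashing of unbounded confidences, the three made-up modules and all the numbers are assumptions chosen purely for illustration. It only shows how raw confidences on different scales could be weighed against a set of questions with known answers.

```python
import math

def squash(confidence):
    # Assumption: tanh is one simple way to tame unbounded values;
    # it maps any real number, including +/-inf, into [-1, 1].
    return math.tanh(confidence)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, bias, confidences):
    """Combine the per-module confidences for one candidate answer."""
    z = bias + sum(w * squash(c) for w, c in zip(weights, confidences))
    return sigmoid(z)

def train(examples, n_modules, epochs=500, rate=0.5):
    """Learn one weight per module with plain stochastic gradient descent."""
    weights = [0.0] * n_modules
    bias = 0.0
    for _ in range(epochs):
        for confidences, is_correct in examples:
            error = score(weights, bias, confidences) - is_correct
            bias -= rate * error
            for i, c in enumerate(confidences):
                weights[i] -= rate * error * squash(c)
    return weights, bias

def best_answer(weights, bias, candidates):
    """Pick the candidate whose combined score is highest."""
    return max(candidates, key=lambda a: score(weights, bias, candidates[a]))

# Hypothetical training set: the confidences of three modules for one
# candidate answer each, plus whether that candidate was correct (1 or 0).
# Module 0 reports 0..1, module 1 reports -1..1, module 2 is unbounded.
EXAMPLES = [
    ([0.9, -0.2, 3.0], 1),
    ([0.1, 0.8, -2.0], 0),
    ([0.8, 0.5, float("inf")], 1),
    ([0.2, -0.6, -1.0], 0),
    ([0.7, 0.9, 0.5], 1),
    ([0.3, 0.7, float("-inf")], 0),
]

weights, bias = train(EXAMPLES, n_modules=3)

# Two made-up candidates for a new question. Module 1 prefers "B",
# the other two modules prefer "A".
candidates = {"A": [0.85, -0.5, 2.0], "B": [0.15, 0.9, -1.5]}
print(best_answer(weights, bias, candidates))
```

In this toy data set, module 1 is only loosely related to correctness, so training gives it far less influence than the other two modules. Nobody has to tell the combiner which module to trust or which scale it uses; the known answers do that.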

Which is what makes Jeopardy! such a perfect fit: the show has accumulated the correct answers to thousands of questions asked over the years. That made it “easy” to train Watson automatically, and IBM engineers could debug the answering process whenever Watson erred.

But Watson isn’t about winning TV shows. The current goal is to turn Watson into a tool that doctors around the world can use to identify illnesses. Today, doctors work so many hours per week that they can only read a tiny fraction of all the articles that are published. Surveys show that 81% of doctors spend less than five hours a month reading. One solution would be to hire more doctors. Guess what that would mean for costs in the health sector.

Or we could make Watson read all that and present all that blabla in a compressed form when the symptoms match. Think Google where you don’t know what you’re looking for.

Sounds good? Or frightening? Some people in the audience were thinking “Skynet” but here are some facts that you should know:

  • In health care, Watson is a “medical device”. These are heavily regulated.
  • The goal is not to have “Dr. Watson.” The goal is to give doctors a smart library, not a smart ass or something that can directly make unsupervised decisions about the next therapy step.
  • IBM isn’t developing the product alone. They are working with companies from the health care sector who know how doctors (should) work. You might want to see this video: Watson Computer Comes to the University of Maryland and have a look at this channel: IBMWatsonSolutions
  • Privacy is an important concern. Watson will see millions of medical records. There are pretty strict laws governing this (HIPAA).
  • Watson isn’t a data warehouse. It won’t process all the medical records into one huge data set which it can query. Instead, doctors will enter symptoms in a standardized way and Watson will present a list of things to check plus medical conditions that match.
  • For training, Watson needs a huge list of correct answers. It doesn’t try to find patterns by itself.

So unlike Skynet, Watson is much more like a boring tool. Sorry.

One very interesting aspect is that Watson is something that you won’t buy as a product. Instead, it’s probably going to be a cloud service which charges, say, per question.

Other fields where it would be useful:

  • Justice. Who has time to read all the laws and regulations that the government ships all the time?
  • Legislation
  • Engineering
  • Research in chemistry, physics and genetics


Fail Compilations

4. March, 2011

Need a good, hysterical laugh? Watch TwisterNederland7’s monthly Fail Compilations. Ten minutes of good, politically incorrect fun.


Google Transparency Report

24. September, 2010

Google now shows usage data by country and service, along with how many requests it received, and from which countries, to remove items from its search indexes, blogs, YouTube, etc.