How did Watson get so smaht?

At yesterday afternoon’s Profiles in Innovation lecture, IBM Watson creator David Ferrucci explained (very quickly I might add — that man talks fast) how the supercomputer came to think like a human and beat Jeopardy champion Ken Jennings at the TV quiz show last year.

Okay, maybe “think” isn’t quite the right word. But this is what I found so interesting: the team didn’t set out to build a computer that thinks like a human at all. And yet in the end, to Jennings’ own surprise, Watson’s reasoning process, based entirely on mathematical algorithms, turned out to be remarkably similar to a human’s.

To get Watson to “think,” the team used machine learning, which really isn’t all that new or exciting on its own. Amazon and Netflix, for example, have been doing this for years. Each time you give the website feedback on your preferences, it learns a little more about you and gets better and better at making book or movie recommendations that you will actually enjoy.
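If you’re curious what that kind of feedback loop looks like in code, here’s a toy Python sketch of my own (emphatically not how Netflix or Amazon actually do it): each rating nudges a preference profile, so later scores reflect your accumulated taste.

```python
# A toy sketch of learning from feedback (my own illustration, not how
# Netflix or Amazon actually work). Each rating nudges a preference
# profile, so later scores reflect the accumulated taste.

movies = {  # hypothetical catalog with simple genre features
    "space_doc":   {"sci-fi": 1.0, "documentary": 1.0},
    "rom_com":     {"romance": 1.0, "comedy": 1.0},
    "alien_flick": {"sci-fi": 1.0, "action": 1.0},
}

preferences = {}  # genre -> learned weight; starts out knowing nothing

def give_feedback(title, liked, rate=0.5):
    """Nudge preferences toward (liked) or away from (disliked) an item."""
    direction = 1.0 if liked else -1.0
    for genre, weight in movies[title].items():
        preferences[genre] = preferences.get(genre, 0.0) + rate * direction * weight

def score(title):
    """Predicted enjoyment: overlap between item features and preferences."""
    return sum(preferences.get(g, 0.0) * w for g, w in movies[title].items())

give_feedback("space_doc", liked=True)
give_feedback("rom_com", liked=False)
print(score("alien_flick"), score("rom_com"))  # 0.5 -1.0: the sci-fi taste generalizes
```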

But with Watson it wasn’t that simple. Because Jeopardy questions use complex language and aren’t at all straightforward, Watson had to dig into masses and masses of data to generate answers. The data, Ferrucci explained, wasn’t simple five-star ratings, either. It came from long textual passages drawn from a variety of sources. Watson had to sift through all the passages the team gave it over four years of training and use something called “plausible inference” to figure out what the right answer could be.

Example: Lincoln once said: “Treasury secretary Chase just submitted this to me for the third time. Guess what, pal? This time I’m accepting it.” What is “this”? Someone else in the audience (not me) quickly said “his resignation,” which he was able to infer from the sentence based on his knowledge that Lincoln was a president, that treasury secretaries submit only a few kinds of things to presidents, and that, judging from the tone of voice, it probably wasn’t some great bill proposal. When the same question was posed to a classroom of sixth graders, the common answer wasn’t “his resignation” but “a Facebook friend request.”
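Just for fun, here’s a rough sketch of how a machine with no life experience might fake that inference: gather passages, pull out candidate answers, and score each by how much clue-related evidence surrounds it. The passages, candidates, and scoring rule below are my own toy stand-ins, not Watson’s actual pipeline.

```python
# A rough toy (my own, not Watson's actual pipeline) of "plausible
# inference" over text: take candidate answers, then score each by how
# much clue-related evidence appears alongside it in the passages.

passages = [  # hypothetical evidence passages
    "Salmon Chase, Lincoln's treasury secretary, repeatedly offered his resignation.",
    "Lincoln finally accepted Chase's resignation in 1864.",
    "Chase proposed several fiscal bills to Congress during the war.",
]
clue_keywords = {"chase", "submitted", "accepting", "lincoln"}
candidates = ["his resignation", "a bill"]

def evidence_score(candidate):
    """Count clue keywords in passages that also mention the candidate."""
    head = candidate.split()[-1]  # crude head word: "resignation", "bill"
    total = 0
    for passage in passages:
        text = passage.lower()
        if head in text:
            total += sum(1 for k in clue_keywords if k in text)
    return total

print(max(candidates, key=evidence_score))  # -> "his resignation"
```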

It all comes down to our experience. A sixth grader has a much different experience of the world than an old dude who’s lived through many frustrating political administrations.

And so where does that leave a computer?

The more information the team feeds the computer, the more it has the chance to learn from it. But how does it then decide which information to pay attention to and which to ignore? Ferrucci said that Watson was made up of not just one machine learning algorithm, as in the case of Netflix or Amazon, but several hundred. Each contributes a tiny bit of information to the final answer, and each has a different confidence level.

Say ten of us are sitting around a table and are asked the same question. Each of our experiences will lead us to a separate, though possibly identical, answer. And each of us will have a different level of confidence in our answer. Watson combines all of those answers, weighted by the confidence each was submitted with, to generate its final answer.
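In code, that confidence-weighted vote might look something like this (with made-up numbers; Watson’s real answer merger is a trained model, not a simple sum):

```python
# A minimal sketch of a confidence-weighted vote (illustrative numbers;
# Watson's real answer-merging model is learned, not a simple sum).

from collections import defaultdict

votes = [  # (answer, confidence) from hypothetical component scorers
    ("his resignation", 0.8),
    ("his resignation", 0.6),
    ("a bill", 0.3),
    ("his resignation", 0.4),
    ("a bill", 0.2),
]

support = defaultdict(float)
for answer, confidence in votes:
    support[answer] += confidence  # each scorer contributes its weight

final_answer = max(support, key=support.get)
overall_confidence = support[final_answer] / sum(support.values())
print(final_answer, round(overall_confidence, 2))  # his resignation 0.78
```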

And then of course it gets even more interesting. Jeopardy players don’t buzz in on every question. The typical winner answers around 50% of the questions. They know what they don’t know. And when they do answer, they do so correctly a majority of the time. So Watson had to mimic this balancing act. Yet another algorithm taught it to answer only when the average confidence was above a certain threshold. And of course, that threshold would change based on how well it was doing in the game.
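Something like this, in toy form (the threshold values and the adjustment rule here are mine, not Watson’s actual game strategy):

```python
# A sketch of the buzz-or-pass logic (threshold values and adjustment
# rule are my own illustration, not Watson's actual strategy).

def should_buzz(confidence, my_score, leader_score, base_threshold=0.5):
    """Answer only if confidence clears a threshold that drops when trailing."""
    trailing_by = leader_score - my_score
    # Far behind: take more risks. Ahead or close: play it safe.
    threshold = base_threshold - 0.1 if trailing_by > 5000 else base_threshold
    return confidence >= threshold

print(should_buzz(0.45, my_score=4000, leader_score=12000))   # True: risky when behind
print(should_buzz(0.45, my_score=12000, leader_score=4000))   # False: cautious in the lead
```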

So, I have to run to the annual RISE Conference now (Research, Innovation, and Scholarship Expo) and find out which of our Northeastern students will be the next David Ferrucci. But isn’t it interesting that while the Watson team didn’t set out to mimic human thinking, the best way for the machine to answer Jeopardy questions naturally evolved to do just that?