Previously, I discussed how Watson, IBM’s Jeopardy!-playing computer prodigy, manages to match or surpass the performance of humans on the game. The computer—which is really a cluster of servers reading hundreds of millions of pages’ worth of stored information about the world—can simultaneously generate and evaluate thousands of hypothetical responses to clues on the game board. It decides which of those possible answers seems best supported by all the facts (including what Watson learns about the kinds of responses the game’s producers want) and then, if any of those answers meets its “confidence” criteria, the computer buzzes in. To be a good player, Watson has to accomplish all of these feats, from interpreting the text of the clue to choosing a best response, in less time than its human competitors—on average, about three seconds.
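For readers who like to see the moving parts, here is a minimal sketch in Python of that generate-score-buzz loop. Everything in it is a placeholder of my own, from the hard-coded candidates to the 0.7 confidence threshold; it illustrates the shape of the pipeline, not IBM's actual DeepQA code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    confidence: float  # combined evidence score, 0.0 to 1.0


def generate_candidates(clue: str) -> list[Candidate]:
    """Stand-in for the hypothesis-generation stage; the real system searches
    its stored corpus and scores thousands of candidates in parallel."""
    # Hard-coded toy output for illustration only.
    return [
        Candidate("Toronto", 0.14),
        Candidate("Springfield", 0.55),
        Candidate("Chicago", 0.83),
    ]


def decide(clue: str, buzz_threshold: float = 0.7) -> str | None:
    """Pick the best-supported candidate and buzz in only if its confidence
    clears the threshold; otherwise stay silent."""
    candidates = generate_candidates(clue)
    best = max(candidates, key=lambda c: c.confidence)
    return best.answer if best.confidence >= buzz_threshold else None


# With the toy scores above, the top candidate clears 0.7, so the system
# "buzzes in"; lower every score below the threshold and decide() returns None.
print(decide("Clue text from the game board"))
```

In the real system, of course, the scoring stage is the hard part: thousands of candidates are weighed against many independent kinds of evidence before a single confidence figure emerges.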
To any of us who have gritted our teeth through the feebleness and inadequacies of most automated response systems on phone directories and help lines, that level of performance can seem breathtaking. In fact, it so closely mimics the kind of artificial intelligence (AI) seen in science fiction that one might wonder whether researchers really are on the verge of creating robotic brains to rival our own any day now.
The short answer is no, although Watson’s success probably does augur a rapid and satisfying boost in the usefulness of a wide variety of automated systems. Bullish believers in the prospects for AI may see Watson as emphatic proof that they are on the right track, but I think a slightly more restrained reaction is probably in order. The crucial issue, I’ll argue, is scale.
To state the obvious, IBM didn’t build Watson to make money from game shows. With its ongoing DeepQA program for building Watson-like systems that can knowledgeably respond to questions asked of them in standard, casual English, the company has its eyes on far bigger prizes. Research manager Eric Brown notes that IBM is looking at specific applications in medicine and healthcare, as well as “things like help desk, tech support, and business intelligence applications … basically anyplace where you need to go beyond just document search, but you have deeper questions or scenarios that require gathering evidence and evaluating all of that evidence to come up with meaningful answers.”
Similarly, in his June 16, 2010 article on Watson for the New York Times Magazine, Clive Thompson reported (emphasis added):
I.B.M. plans to begin selling versions of Watson to companies in the next year or two. John Kelly, the head of I.B.M.’s research labs, says that Watson could help decision-makers sift through enormous piles of written material in seconds. Kelly says that its speed and quality could make it part of rapid-fire decision-making, with users talking to Watson to guide their thinking process.
“I want to create a medical version of this,” he adds. “A Watson M.D., if you will.” He imagines a hospital feeding Watson every new medical paper in existence, then having it answer questions during split-second emergency-room crises. “The problem right now is the procedures, the new procedures, the new medicines, the new capability is being generated faster than physicians can absorb on the front lines and it can be deployed.” He also envisions using Watson to produce virtual call centers, where the computer would talk directly to the customer and generally be the first line of defense, because, “as you’ve seen, this thing can answer a question faster and more accurately than most human beings.”
(I will be interested to see if M.D.s take easily to relying on such DeepQA/Watson-type automated helpers. The stakes in medicine being as high as they are, and physicians’ egos being what they are, I can picture many doctors resisting putting too much faith in these systems while they are new. On the other hand, how might juries in malpractice suits view physicians who went against the advice of such digital helpers?)
All of us who suffer through the aforementioned types of automated help-line hells may therefore also hope for a reprieve within just a few years. Machines capable of Watson-level general knowledge may remain out of many companies’ budgets for years, but that may not matter. To play Jeopardy!, Watson needs a strongly diverse base of knowledge in history, literature, science, the arts and so on. The automated systems of most businesses would not. My travel agency’s automated booking system, for example, wouldn’t need to know the poems of Emily Dickinson or the date of the Norman Conquest; it would need to know geography, vacation packages, airline schedules, my travel history, visa and vaccination requirements and the like.
It’s true that in any conversation with a member of the public, a digital helper could encounter references that it would not understand. But part of the beauty of the approach to answering queries that Watson is helping to pioneer is that the system recognizes when it does not have an answer it can trust. In a conversation, the computer can then ask for additional information to refine its deliberations, rather than blurting out its best answer even when that answer is not good enough. Thus, this DeepQA approach—and similar approaches that other companies could develop (because I don’t want to rule them out)—could scale down quite well for many applications, I think.
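As a toy illustration of that ask-when-unsure behavior, in the spirit of the travel-agency example above: the function, the candidate scores and the 0.6 confidence floor below are invented for the sketch, not taken from DeepQA.

```python
def respond(question: str,
            scored_candidates: list[tuple[str, float]],
            confidence_floor: float = 0.6) -> str:
    """Toy dialog policy: answer only when the best candidate looks
    trustworthy; otherwise ask for more detail instead of guessing."""
    answer, confidence = max(scored_candidates, key=lambda pair: pair[1])
    if confidence >= confidence_floor:
        return answer
    return ("I'm not confident I understood that. Could you give me more "
            "detail, such as your travel dates or destination?")


# A booking helper with no strong candidate asks rather than guesses:
print(respond("I want to go somewhere warm",
              [("Cancun package", 0.35), ("Phoenix flights", 0.30)]))
```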
Now come the more philosophical questions. How close does something like Watson bring us to the goal of creating true artificial intelligences? The longstanding benchmark for an AI to pass is the Turing test: a machine passes if its replies cannot be distinguished from a human’s.
Even those close to the Watson project dismiss the idea that the system represents a Turing-level intelligence. Eric Brown, for example, remarks that Watson might be indistinguishable from a human playing Jeopardy!, but it lacks any good capability for general conversation. Stephen Wolfram, the computer scientist behind Mathematica and Wolfram Alpha, argues that Watson can only answer questions with objectively knowable facts and that it cannot offer a judgment.
Nevertheless, David Ferrucci, who headed the Watson project, seems hopeful that it offers useful lessons for bringing computer scientists closer to that Turing goal, or to even more ambitious ones (via The New York Times):
At best, Ferrucci suspects that Watson might be simulating, in a stripped-down fashion, some of the ways that our human brains process language. Modern neuroscience has found that our brain is highly “parallel”: it uses many different parts simultaneously, harnessing billions of neurons whenever we talk or listen to words. “I’m no cognitive scientist, so this is just speculation,” Ferrucci says, but Watson’s approach — tackling a question in thousands of different ways — may succeed precisely because it mimics the same approach. Watson doesn’t come up with an answer to a question so much as make an educated guess, based on similarities to things it has been exposed to.
Ferrucci may indeed be right, and Watson may embody the kernel of an insight into human thinking that could steer scientists toward better high-level artificial intelligences. However, it is also surely true that the human brain does not think simply by hatching and evaluating thousands of possible responses to every situation in the way that Watson does. Machine intelligences certainly do not need to work the way our brains do. But if the goal is to create an artificial intelligence that can match a human one, science will also need to stay alert to the efficient shortcuts our nervous systems use, which could help machines scale up.
As effective a general savant as Watson is in the context of Jeopardy!, it is still a computer optimized to do one thing: play that game. A machine using exactly the same approach that could be equally versatile “in the wild” would need to be much, much more powerful. That sort of brute-force approach might work; it is, after all, a big part of how Deep Blue beat Garry Kasparov in their chess match. But it is probably a wildly inefficient way to build a machine with human-level cognition. Computing power may indeed be increasing exponentially, but expanding the capabilities of something like Watson toward that end might involve a processing problem that escalates even faster.
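A crude back-of-envelope calculation shows why that matters. Suppose, purely for illustration, that hardware capacity doubles every two years while the cost of handling an open-ended world grows combinatorially with the number of domains the system has to relate; both growth rates are assumptions of mine, not measurements.

```python
import math

DOUBLING_PERIOD_YEARS = 2.0   # assumed rate of hardware doubling (placeholder)


def task_cost(domains: int) -> int:
    """Assumed combinatorial blow-up as the system must relate more domains."""
    return math.factorial(domains)


for domains in (5, 10, 15, 20):
    cost = task_cost(domains)
    # Years of steady doubling before hardware growth alone covers that cost.
    years_to_catch_up = DOUBLING_PERIOD_YEARS * math.log2(cost)
    print(f"{domains:2d} domains -> relative cost {cost:.1e}, "
          f"about {years_to_catch_up:.0f} years of doubling to match it")
```

Under those admittedly cartoonish assumptions, a few additional domains quickly demand more decades of doubling than anyone would wait for, which is the sense in which the problem can outrun the hardware.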
Eventually, of course, if nothing constrains the growth and application of computing power, even that horrific hypothetical level of brute force needed to simulate human intelligence would become available. But that would not represent a scalable solution except in a universe of unlimited resources and boneheaded stubbornness.
This, I suspect, is why strong optimists about AI such as futurist Ray Kurzweil (who foresees a computer passing the Turing test in 2029 and has made a wager to that effect with Mitch Kapor) and others who are more reserved or pessimistic (whom I’m tempted to call “neurorealists”) may argue right past each other. Kurzweil believes in exponentially accelerating technological growth that will overrun all obstacles. To the optimists, objections about the scale of certain approaches to AI are irrelevant because the passage of time will put any needed amount of computing power within reach. To the neurorealists, it seems preposterous that anyone would bother piling on computational resources without some clear, biologically guided idea of how to deploy them. And because we currently have only the faintest glimmers of how higher cognitive abilities emerge from our brains, the day when we can translate those mechanisms into something suitable for AI seems remote, too.
Both sides have a point. That’s why I’m leery of following those skyrocketing curves of technological growth to any imminent arrival of machine sentience in the absence of real breakthroughs in understanding how minds arise from brains. At the same time, as Watson demonstrates so well, we can look forward very soon to computers that can at least seem perfectly intelligent within narrow scopes.
For more information:
- FAQs on “Watson and Jeopardy” and IBM’s DeepQA project, prepared by the company.
- “How Watson works: a conversation with Eric Brown, IBM Research Manager,” by Amara D. Angelica (KurzweilAI.net, Jan. 31, 2011).
- “How IBM Plans to Win Jeopardy!,” by David Talbot (Technology Review, May 27, 2009).
- “Smarter than You Think: What Is IBM’s Watson?” by Clive Thompson (The New York Times Magazine, June 16, 2010).
- “How IBM Built Watson, Its ‘Jeopardy’-playing Supercomputer,” by Dawn Kawamoto (Daily Finance, Feb. 8, 2011).
- “Behind-the-Scenes with IBM’s ‘Jeopardy’-playing Computer, Watson,” by John D. Sutter (CNN, Feb. 7, 2011).
- “Will Watson Win on Jeopardy?” PBS NOVA (Jan. 20, 2011).