Artificial Intelligence – Winning Ways

[The following article is taken from the January 2007 issue of The Economist.]

Computers have started to outperform humans in games they used to lose

RESEARCHERS in the field of artificial intelligence have long been intrigued by games, and not just as a way of avoiding work. Games provide an ideal setting to explore important elements of the design of cleverer machines, such as pattern recognition, learning and planning. They also hold out the tantalising possibility of fame and fortune should the program ever clobber a human champion.

Ever since the stunning victory of Deep Blue, a program running on an IBM supercomputer, over Garry Kasparov, then world chess champion, in 1997, it has been clear that computers would dominate that particular game. Today, though, they are pressing the attack on every front. They are the undisputed champions in draughts and Othello. They are generally stronger in backgammon. They are steadily gaining ground in Scrabble, poker and bridge. And they are even doing pretty well at crossword puzzles. There is one game, however, where humans still reign supreme: Go. Yet here too their grip is beginning to loosen.

Go was invented more than 2,500 years ago in China (Confucius considered it a waste of time). It is a strategic contest in which two players take turns to place stones on the intersections of a grid with 19 lines on each side. Each player tries to stake out territory and surround his opponent. The rules are simple but the play is extraordinarily complex. During a game, some stones will “die”, and some will appear to be dead but spring back to life at an inopportune moment. It is often difficult to say who is winning right up until the end.

Deep Blue and its successors beat Mr Kasparov using the “brute force” technique. Rather than search for the best move in a given position, as humans do, the computer considers all white’s moves—even bad ones—and all black’s possible replies, and all white’s replies to those replies, and so on for, say, a dozen turns. The resulting map of possible moves has millions of branches. The computer combs through the possible outcomes and plays the one move that would give its opponent the fewest chances of winning.
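The exhaustive search described above can be sketched in a few lines. This is a minimal illustration on a hand-built toy game tree, not chess and certainly not Deep Blue's actual search (which adds alpha-beta pruning and much else); Python and all the names here are my own choices.

```python
# Fixed-depth exhaustive ("brute force") search on a toy game tree.
# Each inner list holds the positions reachable in one move; each leaf
# is a static score from the maximizing player's point of view.

def minimax(node, maximizing):
    """Score a node by exploring every branch down to the leaves."""
    if isinstance(node, (int, float)):      # leaf: static evaluation
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# White to move: each top-level branch is one of white's moves, and the
# leaves below it are the outcomes after black's best reply.
tree = [[3, 12], [2, 4], [14, 1]]
best = max(range(len(tree)), key=lambda i: minimax(tree[i], False))
print(best, minimax(tree, True))            # → 0 3
```

The first branch guarantees a score of 3 even against black's best reply, so it is chosen; the tempting leaf worth 14 is ignored because black would never allow it.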

Unfortunately, brute force will not work in Go. First, the game has many more possible positions than chess does. Second, the number of possible moves from a typical position in Go is about 200, compared with about a dozen in chess. Finally, evaluating a Go position is fiendishly difficult. The fastest programs can assess just 50 positions a second, compared with 500,000 in chess. Clearly, some sort of finesse is required.

In the past two decades researchers have explored several alternative strategies, from neural networks to general rules based on advice from expert players, with indifferent results. Now, however, programmers are making impressive gains with a technique known as the Monte Carlo method. This form of statistical sampling is hardly new: it was originally developed in the Manhattan Project to build the first nuclear bombs in the 1940s. But it is proving effective. Given a position, a program using a Monte Carlo algorithm contemplates every move and plays a large number of random games to see what happens. If it wins in 80% of those games, the move is probably good. Otherwise, it keeps looking.
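Here is a sketch of that idea on a toy game rather than Go: from a pile of stones, players alternately take one or two, and whoever takes the last stone wins. The game, the playout budget and all names below are my own choices, not anything from the Go programs the article describes.

```python
import random

random.seed(0)                              # make the example repeatable

def random_playout(pile, player):
    """Play uniformly random moves until the pile is empty; return the winner."""
    last_mover = player
    while pile > 0:
        pile -= random.choice((1, 2)) if pile >= 2 else 1
        last_mover = player                 # the player who just moved
        player = 1 - player
    return last_mover                       # whoever took the last stone

def monte_carlo_move(pile, playouts=2000):
    """Estimate each first move's win rate for player 0; pick the best."""
    best_take, best_rate = None, -1.0
    for take in (1, 2):
        if take > pile:
            continue
        remaining = pile - take
        if remaining == 0:
            rate = 1.0                      # taking the last stone wins outright
        else:
            wins = sum(random_playout(remaining, 1) == 0
                       for _ in range(playouts))
            rate = wins / playouts
        if rate > best_rate:
            best_take, best_rate = take, rate
    return best_take

# With 7 stones, taking 1 leaves the opponent a theoretically lost pile
# of 6 (multiples of 3 lose), and random sampling detects the gap.
print(monte_carlo_move(7))
```

No position is ever "understood": the program simply counts how often each move leads to a win when both sides thrash about randomly, which is exactly the trade the Monte Carlo approach makes.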

This may sound like a lot of effort but generating random games is the sort of thing computers excel at. In fact, Monte Carlo techniques are much faster than brute force. Moreover, two Hungarian computer scientists have recently added an elegant twist that allows the algorithm to focus on the most promising moves without sacrificing speed.
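The two Hungarian computer scientists are generally identified as Levente Kocsis and Csaba Szepesvári, whose UCT algorithm treats each candidate move like a slot-machine arm: playouts are steered toward moves whose average win rate, plus a bonus for being under-sampled, is highest. Below is a sketch of the underlying UCB1 selection rule; the constant `c` and the statistics are illustrative, not taken from any real Go program.

```python
import math

def ucb1_select(stats, total_playouts, c=1.4):
    """stats maps move -> (wins, visits); pick the move to sample next."""
    def score(move):
        wins, visits = stats[move]
        if visits == 0:
            return float("inf")             # try every move at least once
        return wins / visits + c * math.sqrt(
            math.log(total_playouts) / visits)
    return max(stats, key=score)

# "b" has the lower observed win rate (0.5 vs 0.6) but far fewer
# visits, so its exploration bonus dominates: the algorithm keeps
# sampling it until its estimate is trustworthy, while clear losers
# stop receiving playouts. This is the "focus without sacrificing
# speed" the article mentions.
print(ucb1_select({"a": (60, 100), "b": (5, 10)}, 110))   # → b
```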

The result is a new generation of fast programs that play particularly well on small versions of the Go board. In the past few months Monte Carlo-based programs have dominated computer tournaments on nine- and 13-line grids. MoGo, a program developed by researchers from the University of Paris, has even beaten a couple of strong human players on the smaller of these boards—unthinkable a year ago. It is ranked 2,323rd in the world and in Europe’s top 300. Although MoGo is still some way from competing on the full-size Go grid, humanity may ultimately have to accept defeat on yet another front.

Lessons Learned from Kalah

One of my course projects this quarter is to write a program that can play the game Kalah intelligently. I started with minimax search with alpha-beta pruning. This algorithm has been around for some thirty years, but it’s still powerful: even Deep Blue used it. Of course, I had to add some enhancements to make my program faster. The first thing that came to mind was a transposition table (it’s basically a hash table). I tried to keep all the search results in this table so that they could be reused. However, this turned out to be a disaster: it didn’t make my program faster at all; it made my program ten times slower. So lesson 1: Learn to give up. Sometimes you just can’t keep everything, and being complete is not always best. After I decided to keep less than 1% of the search results, the ones I regarded as the most valuable, the transposition table worked: now it made my program ten times faster. Analogously, when you find yourself overscheduled and underslept, which happens to be my current situation, don’t try to cut your sleep time further. It might be a solution, but definitely not a good one in the long run. To reprioritize, I think, is better and smarter.
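The setup above can be sketched concretely. This is my own toy reconstruction, not the author's program: the game is a simple take-one-to-three-stones, last-stone-wins pile game rather than Kalah, and the keep-only-the-most-valuable-results policy is approximated by caching only positions searched to at least a minimum remaining depth, on the guess that deep results are the expensive ones to recompute. The table stores exact values and bounds in the standard way so that results from pruned searches are reused safely.

```python
EXACT, LOWER, UPPER = 0, 1, 2               # what kind of value an entry holds
TABLE = {}                                  # transposition table: position -> result
MIN_CACHE_DEPTH = 3                         # don't cache cheap, shallow results

def alphabeta(pile, depth, alpha, beta, maximizing):
    """Alpha-beta search: +1 if the maximizing player wins, -1 if not."""
    if pile == 0:                           # previous mover took the last stone
        return -1 if maximizing else 1
    if depth == 0:
        return 0                            # horizon reached: neutral guess
    key = (pile, maximizing)
    entry = TABLE.get(key)
    if entry and entry[0] >= depth:         # reuse a deep-enough stored result
        _, flag, val = entry
        if flag == EXACT:
            return val
        if flag == LOWER:
            alpha = max(alpha, val)
        elif flag == UPPER:
            beta = min(beta, val)
        if alpha >= beta:
            return val
    alpha0, beta0 = alpha, beta
    value = -2 if maximizing else 2
    for take in (1, 2, 3):
        if take > pile:
            break
        score = alphabeta(pile - take, depth - 1, alpha, beta, not maximizing)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:
            break                           # prune: this line is already refuted
    if depth >= MIN_CACHE_DEPTH:            # keep only the expensive entries
        flag = (LOWER if value >= beta0 else
                UPPER if value <= alpha0 else EXACT)
        TABLE[key] = (depth, flag, value)
    return value

# Piles that are multiples of 4 are lost for the player to move.
print(alphabeta(8, 8, -2, 2, True), alphabeta(9, 9, -2, 2, True))   # → -1 1
```

Caching everything would mean a table write at every node, most of them for trivial subtrees; the depth threshold is one plausible way to keep only the entries that actually pay for themselves.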

Timeout is something I must consider for a real game. There will be some results from incomplete searches cut off by the timeout. Shall I use them? I answered “yes” at first, and I applied some seemingly sophisticated rules to use them. Alas, my program became dull. So lesson 2: Don’t make important decisions with incomplete information. I tried some other rules. No matter how sophisticated they were, they couldn’t save my program. After I decided not to use the results from incomplete searches at all, my program became smart again. Sometimes, incomplete information can mislead you rather than help you. Then how do you make decisions with incomplete information when you have to? Sorry, I don’t have the answer yet.
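One common way to implement this discard-the-incomplete policy is iterative deepening against a hard deadline: search to depth 1, then 2, then 3, and keep a level's answer only if that entire level finished in time. The sketch below is my own illustration, not the author's Kalah engine; `search_to_depth` is a hypothetical stand-in that just burns time.

```python
import time

class Timeout(Exception):
    pass

def search_to_depth(position, depth, deadline):
    """Stand-in for a real fixed-depth search over `position`."""
    for _ in range(depth):
        if time.monotonic() > deadline:
            raise Timeout                   # abort: the result would be incomplete
        time.sleep(0.001)                   # pretend to examine part of the tree
    return f"best-move-at-depth-{depth}"    # hypothetical complete result

def best_move(position, time_budget=0.05):
    """Search deeper and deeper, keeping only fully completed depths."""
    deadline = time.monotonic() + time_budget
    best = None
    depth = 1
    while True:
        try:
            best = search_to_depth(position, depth, deadline)
        except Timeout:
            return best                     # discard the interrupted level
        depth += 1

print(best_move("start"))
```

The interrupted level may have looked at only a few of the available moves, so its "best move so far" can be arbitrarily bad; throwing it away and answering from the last completed level is exactly the author's conclusion.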