New York Times
When it comes to cheating, chess might seem all but invulnerable. After all, the board and its pieces are out in the open for all to see.

But an eruption of recent scandals has made it clear that cheating — fueled by powerful computer programs that play better than people do, as well as sophisticated communication technologies — is becoming a big problem for world championship chess.

Last year the French Chess Federation accused three players of colluding at the Chess Olympiad in Russia in 2010 by using coded text messages and a signaling system. The federation banned the players for five years, though the ruling is under appeal.

Of course, elite players are elite precisely because they win lots of games. When they come under suspicion, how can officials determine whether they are cheating? That is where Kenneth W. Regan comes in.

An associate professor of computer science at the University at Buffalo who is also an international master at chess, Dr. Regan has been researching the problem for five years and was an expert witness in the French case — though his principal focus is the holy-grail math problem P versus NP. (P versus NP asks whether every problem whose solution can be quickly verified by a computer can also be quickly solved by one.)

Dr. Regan, 52, became interested in the chess issue during the 2006 world championship match between Vladimir Kramnik of Russia and Veselin Topalov of Bulgaria.

The match came to a halt after Silvio Danailov, Mr. Topalov’s manager, accused Mr. Kramnik of consulting a computer in the bathroom. The organizers locked the bathroom, whereupon Mr. Kramnik forfeited a game and refused to continue unless the bathroom was unlocked. It eventually was; he went on to win the match, and the incident went down in chess lore as Toiletgate.

“I thought that the chess world needed someone to try to help investigate these accusations,” said Dr. Regan, who acted as an online commentator during the brouhaha. “I felt I was qualified. I felt some definite responsibility.”

The trouble with constructing a mathematical proof that someone cheated is that so many variables and outliers must be taken into account. Modeling and factoring in human behavior in competition turns out to be very difficult.

Part of the problem is that sample sizes tend to be small — maybe 150 or 200 moves per player for an entire tournament. Another problem lies in how computerized chess programs evaluate positions: their evaluations are expressed in increments of one-hundredth of the value of a pawn, the least valuable piece on the board.

“A change of a hundredth of a pawn might change the agreement with the computer,” Dr. Regan said.
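That sensitivity can be illustrated with a small sketch (not Dr. Regan's actual code; the moves and scores are invented). Engine evaluations are integers in centipawns, hundredths of a pawn, so a shift of a single centipawn can reorder the engine's candidate moves and change whether a player's choice "agrees" with the machine:

```python
def top_move(evals):
    """Return the move the engine ranks first (highest centipawn score)."""
    return max(evals, key=evals.get)

# Hypothetical position: two candidate moves scored in centipawns.
evals = {"Nf3": 34, "d4": 33}
print(top_move(evals))   # Nf3

# Nudge one evaluation by two centipawns: the ranking flips, so a
# player who chose d4 now "matches" the engine where before he did not.
evals["d4"] = 35
print(top_move(evals))   # d4
```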

The potential payoff for a proof of cheating goes well beyond chess. Jonathan Schaeffer, a professor of computer science at the University of Alberta and the inventor of Chinook, the computer that solved checkers, said that Dr. Regan’s research, and that of others who are also investigating this field, has great potential value.

“What he is doing, what these people are doing, is they are trying to model how people make decisions,” Dr. Schaeffer said.

That could also be of immense value to a big online retailer, like Amazon, that wants to customize its offerings, or for more important uses, like personalizing medical treatment.

Dr. Schaeffer said that these applications had probably occurred to Dr. Regan. “The thing I would say about Ken, although he is using this research in the context of his hobby and passion, he is a big thinker,” he said.

In analyzing the 2006 match, the first thing Dr. Regan tried to do was reproduce Mr. Danailov’s claims, but he did not know how. “Initially, I was just a newbie to computer chess,” he said. “I didn’t even know the right questions to ask.”

He tried to find articles on the subject, but turned up nothing. “It is one of those situations that it is hard to believe that this hasn’t already been covered in the literature,” he said.

So he decided to create his own solution.

He was fairly certain that anyone using a program to cheat would have it set in single-line mode — in which the program quickly selects a possible move, then runs through a sequence of moves to evaluate its soundness. That is efficient, but not thorough.

Dr. Regan decided that he also needed to have his programs running in multiline mode so that he could see where and why the programs changed their evaluations. That takes much longer.
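The difference between the two modes can be sketched in miniature (a toy model, not a real engine's search): single-line mode commits to one principal variation and deepens only that, while multi-line mode keeps full evaluations for several candidates, which is slower but reveals where and why the ranking changes.

```python
def analyse(candidates, multipv=1):
    """Return the top `multipv` (move, score) pairs, best first.

    `candidates` maps moves to centipawn scores from a shallow search;
    multipv=1 mimics single-line mode, multipv>1 mimics multi-line mode.
    """
    ranked = sorted(candidates.items(), key=lambda mv: mv[1], reverse=True)
    return ranked[:multipv]

shallow = {"e4": 25, "d4": 22, "c4": 18, "Nf3": 20}

print(analyse(shallow, multipv=1))  # single-line: only the best move survives
print(analyse(shallow, multipv=3))  # multi-line: top three, each with its score
```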

He wanted to create a model of how often the moves of players of varying ability match those of chess programs, so he began building a database by downloading and analyzing games dating to the early 19th century.

In each game, he had the computer evaluate each position in single-line mode to a depth of 13 ply, or half-moves (seven moves by one player and six by the other). To date, he has analyzed nearly 200,000 games, including all of those from the top 50 major tournaments in history.

He also has analyzed 6,000 to 7,000 games in multiline mode to create models of different player abilities.

To test someone for cheating, he runs that player’s relative skill ranking, known as an Elo rating, against the comparative model.
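In spirit, that comparison resembles a simple statistical test. The sketch below assumes the model predicts, for a given Elo rating, a probability that a player's move matches the engine's first choice, and scores the observed match rate over a tournament's worth of moves as a binomial z-score. The 56 percent expectation is an invented figure for illustration, not one of Dr. Regan's fitted values.

```python
import math

def match_z_score(matches, n_moves, p_expected):
    """Standard score of an observed engine-agreement rate against the
    rate predicted for the player's Elo rating."""
    p_observed = matches / n_moves
    std_err = math.sqrt(p_expected * (1 - p_expected) / n_moves)
    return (p_observed - p_expected) / std_err

# A player expected to match the engine ~56% of the time (invented
# number) who matches on 135 of 180 tournament moves:
z = match_z_score(matches=135, n_moves=180, p_expected=0.56)
print(round(z, 2))  # 5.14 — far above what the rating predicts
```

The small sample sizes mentioned earlier are what make this hard: with only 150 or 200 moves, the standard error is large, and strong-but-honest play can still drift a standard deviation or two above expectation.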

The research has yielded some interesting things about how chess programs work, particularly Rybka, the strongest of the commercially available products. For example, in situations in which the evaluation of the position is particularly unfavorable, the program can get stuck looking for a solution. “I think that 47 hours is my longest stall,” Dr. Regan said.

He has also discovered that the way people play has evolved. According to his analysis, the player now ranked No. 40 in the world plays as well as Anatoly Karpov did in the 1970s, when he was world champion and was described as a machine.

Dr. Regan says his models are at a stage where they can be used only as support in cases in which cheating is alleged.

In the French case, he concluded that two performances by one of the accused players, Sébastien Feller, were outliers, meaning they had an unusually high correlation with a chess program.

The models can also be used to clear someone. At the Canadian Open last year, a player whose rating was 2,100 (a candidate master) beat three international masters, whose ratings are usually at least 300 points higher.

After analyzing the games, Dr. Regan said, “I was able to prove that his intrinsic rating was in the 2,100s and the international masters had just played poorly.”
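The "intrinsic rating" idea can be caricatured as follows: score each move by how many centipawns it loses relative to the engine's preferred move, average those losses, and map the average onto the Elo scale with a fitted curve. The linear fit below (3000 at zero loss, minus 15 points per centipawn) is invented for illustration; Dr. Regan's actual model is far more elaborate.

```python
def avg_centipawn_loss(losses):
    """Mean centipawn loss per move versus the engine's first choice."""
    return sum(losses) / len(losses)

def intrinsic_rating(losses, base=3000.0, slope=15.0):
    """Map average centipawn loss to a hypothetical Elo-scale rating.

    `base` and `slope` are illustrative constants, not fitted values.
    """
    return base - slope * avg_centipawn_loss(losses)

# Per-move losses (centipawns) from an imaginary performance:
losses = [0, 12, 80, 0, 30, 5, 0, 45, 20, 8]
print(round(intrinsic_rating(losses)))  # 2700
```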