Oxford University Computing Laboratory

Sudoku: a programming competition

Results

The competition asked for a program to solve the Sudoku puzzles that are published in British daily newspapers. There were 13 entries, written in a variety of programming languages: Java (7), C++ (3), C (1), Oberon (1), Prolog (1). In what follows, I have identified the entries by the numbers 1 to 13. The following descriptions of the programs will allow their authors to identify them:

[No]	Language	Contains class

[1]	Java		DeadendException
[2]	Java		InvalidPuzzleException
[3]	C++		Puzzle
[4]	Java		Solver
[5]	Java		WrongChoiceEx
[6]	C++		function build_hash
[7]	Oberon
[8]	Java		LargeSquare
[9]	C
[10]	Java		ClosedChain
[11]	Prolog
[12]	Java		Loader
[13]	C++		FiendishStrategy

All of the entries could be made to work on my Linux machine, sometimes after rebuilding them. Of the 13 entries, three (1, 2, 9) did not conform exactly to the specified I/O format, and a further three (8, 10, 11) did not use the standard input and output channels for I/O, but insisted on using named files. I compensated for these problems by enclosing each program in a wrapper script, using standard unix tools to adjust the I/O format where necessary. Program 6 needed the name of a predefined constant to be changed before it would compile under Linux. No penalty attached to any of these problems, because many entrants would not be able to test their programs in an environment similar to my machine. Additionally, two entries (8, 12) contained a README file only in Microsoft Word format.

Despite a search space that is (at first sight) quite large, Sudoku problems are remarkably easy to solve by a simple backtracking search, and I therefore looked for entries that applied this idea in a simple, concise and efficient way. Many of the submitted entries, some of them very effective programs, contained fewer than 300 non-blank lines of code, so I marked down entries 10 and 12 (both Java programs) that contained 1500 and 1200 non-blank lines respectively. [I checked that the excessive length was not due to over-conscientious commenting, a rare fault indeed.]

All the programs were able to solve an 'easy' puzzle in a reasonable time, but good programs are robust when presented with unusual inputs. To test for this, I ran each program on five pathological puzzles: (a) a completely blank board; (b) a board where only the bottom right cell is filled in; (c) a board that already has all 81 entries consistently filled in; (d) a board that had two 9's in the same block; (e) the following puzzle, which also has no solutions, because the squares marked X are the only places where 1's can be put in their respective columns.

1.....1..
.........
.........
.1.....1.
.........
.........
..X.....X
..2.....2
..3.....3

(a) Program 7 took 15 seconds on the blank board, but did not find a solution.

(b) Programs 9 and 13 did not find a solution to the puzzle with the last square specified, and programs 1 and 11 found a solution, but took 1.4 and 5.9 seconds respectively, much longer that other programs that solved the puzzle in under 0.1 second.

(c) On the completed board, program 12 stopped with an array bound error.

(d) Most programs stopped very quickly with no solutions for the inconsistent input, but programs 1, 3, 7, 8 and 12 each ran for a very long time before I interrupted them. Arguably, this input is not permitted by the specification, so I did not eliminate entries on this test alone.

(e) On the puzzle shown, programs 1, 3, 7 and 12 again ran for an unreasonably long time and had to be interrupted.

Following these robustness tests, five programs (2, 4, 5, 6, 8) remained, all written in Java or C++. On a puzzle labelled "fiendish" in The Times, the Java programs all took between 0.12 and 0.20 seconds, whilst the C++ programs took a time that was really too small to measure. The longer time taken by the Java programs is almost certainly accounted for by the start-up time for the Java virtual machine, so these programs were not penalized on this basis.

Next I tried the corpus of puzzles submitted by entrants, continuing to test even the programs that had failed the earlier robustness tests. Most of these revealed little difference between the entries, but the following unsolvable puzzle from entrant 2 was not handled well by programs 4, 7, 8 and 11:

.1.......
.2.......
.3.......
.4.......
.5.......
.........
.X...6789
.........
.........

(All digits are blocked from square X.)

Program 6 had particular difficulty with some other problems that could not be solved, including ironically several submitted by entrant 6 himself. For these problems, all the remaining programs (2, 5 and 6) correctly detected that there were no solutions, but program 6 took over a second to do so, despite having the advantage of being written in C++ rather than Java. Program 10 failed to solve a number of quite ordinary-looking puzzles from the corpus.

That left programs 2 and 5 as the remaining candidates for the prizes. The methods used by these two programs are very similar: both use the rules of the puzzle to fill in squares whose value is determined; if this does not lead to a solution, then they pick an arbitrary unknown cell, fill it with each possible values, and recursively try to solve the resulting grids. Program 2 is generally simpler and clearer, and the entrant provided a puzzle that defeated several other entries. On the other hand, program 5 includes a number of refinements; it has a method for inventing hard puzzles, and uses a nice, uniform treatment of the rows, columns and blocks of cells that play a part in the puzzle. Interestingly, program 2 does not implement the idea, often useful in combinatorial search problems, of backtracking on the square that has the smallest number of possibilities. This technique cuts down the branching factor of the search tree, and can dramatically speed up the search. For Sudoku, such refinements appear to be unnecessary in practice.

Since both programs have their merits, I felt it would be unfair to prefer either one over the other, and so I have decided to award the prize equally to both entrants. In arbitrary order, they are

Congratulations!

Addendum

Outside the main competition, I received informal entries from a number of colleagues. They are ineligible for the prize, but deserve a mention here, if only for the unusual methods they used. Bill Roscoe expressed the problem as a CSP process that shouts "Hurray" after completing the puzzle, and used FDR in the time-honoured way to falsify the assertion that the process can never shout "Hurray". Joel Ouaknine showed how to convert a puzzle into a Boolean formula that is satisfiable only if there is a solution, and then used a satisfiability solver to find a solution if one exists. Don Knuth observed that the Sudoku problem is an instance of a class of exact cover problems that can be formulated in terms of Boolean matrices and solved by a backtracking procedure he calls "dancing links", using pointer manipulations to cover and uncover rows and columns of the matrix.

Knuth's observation is interesting because it reveals that a number of apparently important distinctions are actually artificial. A Sudoku problem can be solved by making a sequence of choices to place a digit in a square. Each choice must be consistent with the choices that have gone before it. The choices are of four kinds:

In each case, the choice satisfies a constraint that each square must contain exactly one digit, or that each row, column or block must contain each digit exactly once.

Knuth's approach reformulates the problem in terms of finding an exact cover of a 729 x 324 Boolean matrix. The 324 columns correspond to the 4 x 81 constraints enumerated above, and each of the 729 rows corresponds to a move that puts one of the 9 digits in one of the 81 cells. Each row contains four 1's, corresponding to the way the move satisfies one constraint from each group. The problem becomes one of finding a subset of the rows that contains exactly one 1 in each column.

It's best always to select a constraint that has the smallest number of remaining possibilities. If that number is zero, then we have followed a dead end, and should backtrack. If the number is one, then we can definitely fill in a square, and there is no genuine choice. If it is more than one, then we can explore all the possibilities in a depth-first pattern.

This shows that the distinction between forced moves and arbitrary choices is artificial, since a forced move is just the special case where there is only one choice. Knuth's method also treats all four kinds of constraint uniformly, so that it can not only backtrack over all the digits that can fit in a square, but also over all the places that a given digit can be put into a row or column or block. None of the submitted solutions for the main competition can do that -- it may not be necessary for Sudoku, but it does reveal a symmetry to the problem that is rather nice.

For Knuth's method of dancing links, see this paper on his page of preprints.

-- Mike Spivey, Ascension Day 2005