The SuDoku Puzzle as a Satisfiability Problem

The Satisfiability problem has long been regarded as one of the classical problems in Computer Science. Each year there is a competition among solvers. SuDoku is a popular puzzle, instances of which are published in many newspapers and magazines, and for which commercial programs are available. This tool permits the user to
  • Generate puzzles
  • Enter puzzles by hand
  • Copy/paste puzzles from most standard formats. As an unsigned applet does not have direct access to the clipboard, first paste a puzzle into the "Paste area" and then choose Edit/Paste puzzle.
  • Protect entered values from accidental overwriting
  • Solve the puzzle manually and check the solution
  • Solve the puzzle automatically
  • The puzzle size can be 4x4, 9x9 (the default) or 16x16. The solutions are found by converting the puzzle into a satisfiability problem, and this is displayed using the competition format. This is solved using the sat4j package and the solution model is also displayed. It is possible to copy the satisfiability problem and paste it into a text editor to be solved by your own satisfiability solver, and if the solution is pasted into the tool it can be interpreted as a solution to the SuDoku puzzle - see the "cnf" tab. A Java application is available here. Note that your browser my rename this sudoku.jar file to sudoku.zip. It permits larger puzzles, but be warned that the 36x36 puzzles may take some time to create and solve. To use larger puzzles increase the amount of heap space by using the command
    java -Xmx300m -jar sudoku.jar
    The Java appliction can also write cnf files and read the corresponding models and copy and paste puzzles directly.

    The source code is also available from sat4j.

    To play SuDoku

    Initially the grid is empty. You can enter your own fixed values, for example to solve a given problem, by clicking on cells and typing. Alternatively you can have a problem generated for you by clicking "Create". The number of cells which are filled automatically for you can be altered, or you can tick "Unique", in which case enough cells will be filled in to guarantee that there is only one possible solution. Clicking on "Protect" will then prevent you from overwriting these values accidentally as you try to complete the puzzle in such a way that
  • Each cell contains an integer in the range 1 to 9
  • No two cells in any row contain the same value
  • No two cells in any column contain the same value
  • No two cells in any separated 3x3 block contain the same value
  • Extra constraints (described here) are used if you tick "Complete".

    When you think that you have finished, clicking "Check" will check your solution. If you get stuck you can ask the program to complete the puzzle for you by clicking "Solve". This will not change any entered values, protected or not. It may be that it is not possible to continue with what you have entered, in which case you will be informed. The puzzles created automatically can always be solved. If you have clicked "Solve" and wish to remove entries added by the computer click "Clear computers". You can click "Clear mine too" at any time to clear all the cells which have not been protected. "Unprotect" will enable you to clear the whole grid.

    The generated satisfiability problem

    The SuDoku puzzle is converted to one in Conjunctive normal form, expressed in the form required for the SAT competitions. To view this choose the "CNF" tab. When either "Full cnf" or "Simpler cnf" is clicked, this problem is displayed in the "cnf problem" window. "Show model" will display the cnf solution. Nine boolean variables are allocated to each cell, to represent the cell containing the nine possible values. The variable which indicates that the cell in row r and column c has the value v is r*100 + c*10 + v. The value is false if negative. Thus, for example,

    123 indicates that row 1, column 2 contains 3.
    -456 indicates that row 4, column 5 does not contain 6.
    The first line of the problem is of the form "p cnf num-of-variables num-of-clauses".

    Individual cell clauses

    Clauses have to be included to indicate that a cell contains exactly one value in the range 1 to 9. Consider the cell <1,1>. The clause
    111 112 113 114 115 116 117 118 119 0
    indicates that it contains a least one such value (the 0 is a terminator). To show that it contains only one value needs a series of clauses of the form
    -111 -112 0 (can't be both 1 and 2)
    -111 -113 0 (can't be both 1 and 3)

    These have to be repeated for each of the 81 cells.

    Row clauses

    In the same manner, to show that row 1 contains a 1, we need
    111 121 131 141 151 161 171 181 191 0
    and to show that it contains a 2 we need
    112 122 132 142 152 162 172 182 192 0
    This is repeated for values 3 to 9. Then, to show that it does not contain more than one 1, we need
    -111 -121 0 (first two squares aren't both 1)
    -111 -131 0 (first and third aren't both 1)

    This is repeated for the values in the range 2 to 9.

    Column clauses

    In the same manner, to show that column 1 contains a 1, we need
    111 211 311 411 5111 611 7111 8111 911 0
    and to show that it contains a 2 we need
    112 212 312 412 512 612 712 812 912 0
    This is repeated for values 3 to 9. Then, to show that it does not contain more than one 1, we need
    -111 -211 0 (first two squares aren't both 1)
    -111 -311 0 (first and third aren't both 1)

    This is repeated for the values in the range 2 to 9.

    Block clauses

    Again, these are similar. For example to show that the top left block contains a 1, we need
    111 121 131 211 221 231 311 321 331 0

    Pre-filled cells

    This completes the clauses which are the same for every puzzle because they reflect the basic rules of the game. We finally need to add the clauses which specify the initial cells. For example if cell <2,3> has to contain 4, we add the clause
    234 0

    Solving the problem

    When the cnf problem has been created, it is solved using the MiniLearning solver contained in the sat4j java library. The solution is displayed in the "solution" window. If the first line is "s UNSAT" the problem has no solution. Otherwise, the first line is "s SAT" and subsequent lines give the required values of the variables (known as the "model"). This is used to complete the SuDoku puzzle. For example if the solution contains 345, then cell <3,4> is completed with a 5.

    Using your own satisfiability solver

    This program can be used for testing satisfiability solvers. It is possible to copy the text from the problem window (use CTRL-A and then CTRL-C) and paste into your own editor to save and run a solver on. Similarly, the output from your solver can be pasted into the solution window (CTRL-V) and then "Interpret Model" will complete the grid accordingly. It it the possible to use "Check" to verify the solution.

    It should be noted that the clauses described above contain considerable redundancy. For example, if it is guaranteed that row 1 (with 9 cells) contains all the values in the range 1 to 9, then it is not necessary to specify in addition that the same value cannot appear twice. Taking this into account results in a very much smaller, but logically equivalent, cnf problem which can be created using the "Simpler cnf" button. However, the solver used in this program is very much slower when applied to the simpler cnf problem, and so it is only made available for you to try on your own solver.

    The encoding for the variables is not the most efficient possible, but is used for the 4x4 and 9x9 cases because it makes the clauses more understandable. The encoding breaks down for the 16x16 problem because r, c and v can all exceed 9. It would be possible to use 10000*r + 100*c + v, but that is so very inefficient that the default heap space for the java virtual machine is not sufficient for the solver. Instead, the encoding used is 17*17*r + 17*c + v.

    Comment

    Please direct any comments or questions to Ivor Spence