Encoding any Sudoku (any size, any jigsaw template) is simply a matter of generating the appropriate sequence of ExactlyOneOf(x1, x2, x3, ... xn) formulae, one for each cell, each row, and each region.

Here are 3 different encoding methods, in addition to the naive method (aka binomial), with the number of clauses required for various N, where N is the number of choices in the list x:

- Code: Select all
`N 6 7 8 9 10 11 12 13 14 15 16`

------------------------------------------------

Naive 16 22 29 37 46 56 67 79 92 106 121

Bin 16 19 23 27 31 36 40 43 47 51 55

Tern 16 18 21 25 28 32 35 39 44 46 51

SAD 14 17 20 23 26 29 32 35 38 41 44

N 25 36 49 64 81 100 121 144 169 196 225 256

----------------------------------------------------------------

Naive 301 631 1177 2017 3241 4951 7261 10k 14k 19k 25k 32k

Bin 91 135 187 247 316 391 475 567 668 775 891 1015

Tern 81 119 165 217 277 343 417 499 585 679 781 889

SAD 71 104 143 188 239 296 359 428 503 584 671 764

Bin encoding splits the list into pairs and treats them like a binary tree. Tern is ternary, which does the same thing but in groups of 3 (the combinatorial advantage of these approaches evaporates beyond 3).

SAD is appropriately named as we will see shortly. On the surface it is one of those simple but so elegant methods that just takes your breath away. The method simulates a Serial Adder circuit, that "counts" the true values in the list and produces true if that count is 0 or 1, otherwise false (ie > 1 value is true). Combine that with a single clause that tells you whether any of the values are true, and hey presto.

The encoding itself is linear, while the other 2 methods require some recursion. Neat, huh? Clause count is 3N - 4.

The really sad part is that every invocation requires N - 1 extra variables, so in practice, while reducing the clauses count, it turns out to be no faster than binary encoding.

Since ternary encoding uses the same principle as binary, but with less clauses and less extra variables (~log3(N)), it is almost always the best performing method in practice.