coloin wrote:PatmaxDaddy wrote:Probably no one cares at this point............ and so is clearly not as clever as what is now known. But I figure a few more independent confirmations couldn't hurt, so I'll add my voice. Your number is correct.

I would be interested in how you did it........a quick outline please.

Introduction and DisclaimerThe following is a description of my solution to the Sudoku "total nuber of solutions" problem. This problem was solved by several others about five months ago, and so credit for the solution belongs to them. I offer this description only to provide independent confirmation of the result, as I derived and programmed it without any knowledge of their work. I have not in any way studied the previous work, so I do not know to what extent my solution duplicates it, but I suspect that my work adds little beyond independent confirmation. It is also clear from a cursory examination of the record that far more elegant solutions are now known.

Box AlgebraThe conventional symbols 1 - 9 are used, and the nine boxes are numbered conventionally:

1 2 3

4 5 6

7 8 9

The contents of a box is represented by a "row triplet" or a "column triplet". Either is an ordered triplet of sets of symbols. For example, a box containing

1 2 3

4 5 6

7 8 9

is represented by row triplet ({1,2,3},{4,5,6},{7,8,9}) or column triplet ({1,4,7},{2,5,8},{3,6,9}). As a shorthand I write these as (123,456,789) and (147,258,369). The purpose of a triplet is to capture the ways in which a box can constrain, or be constrained by, boxes horizontally (row triplet) or vertically (column triplet). Thus a triplet expresses constraints on box contents, not the exact contents. I may write that a box "has row triplet R" or "is constrained by row triplet R" or "is subject to row constraints R" as seems appropriate, but the meaning is the same.

The sets of a triplet may have any number of the nine symbols. For example if box 1 has row triplet (123,456,789), box 2 would be subject to row constraints (456789, 123789, 123456).

If R1 and R2 are row triplets, the operations ~R1, R1 & R2, R1 | R2 produce new row triplets whose component sets are the complement (~), intersection (&), or union (|) of the corresponding sets of R1 and R2. For example if box 1 has row triplet R1 = (123,456,789), and box 2 has row triplet R2 = (457, 189, 236), then box 3 is constrained by ~(R1 | R2) = {689, 237, 145). These operations are defined identically for column triplets, but row and column triplets cannot meaningfully be mixed.

A triplet (or constraint) where the sets have three symbols each and are mutually exclusive is called "complete". Clearly any legal box contents will correspond to complete triplets.

In many cases the six triplets corresponding to the permutations of three sets of symbols are equivalent, i.e. for a row (column) triplet the order of the rows (columns) doesn't matter. In these cases a particular one of the six triplets, called a "row class" or "column class", is chosen to represent any of the six, which are called members of the class. For row triplet R or column triplet C, the operation S(R) and S(C) produces the corresponding row class and column class. Since a class is a triplet, all triplet operations apply.

In the program a triplet is represented by a 27-bit integer, 3 fields of 9 bits, each bit signifying membership of a symbol in a set. The operation S(T) on triplet T sorts the fields in numerical order as 9-bit integers, producing a class. The operations ~, &, and | are implemented using bitwise NOT, AND, and OR.

The operation "zero cross", written ZC(R, C), is defined on a row triplet R and column triplet C as the number of disjoint pairs of sets over all nine possible pairs that can be made from R and C. For example, ZC({123,456,789}, {124,357,689}) = 2. Note that ZC(R, C) = ZC(C, R), and ZC(R, C) = ZC(S(R), S(C)).

Any box can contain 280 row or column classes, each corresponding to (3!)^4 = 1296 distinct permutations of its contents (280 * 1296 = 9!). Any box subject to complete row constraints R and complete column constraints C will have exactly one solution if ZC(R, C) = 0, and no solutions otherwise. Thus a 280x280 bit table can be constructed that tells whether a solution exists if some box is subject to certain complete row and column constraints. While a byte table would require fewer instructions in the inner loops, I consider data cache performance to be more critical and so a bit table is used. Of the 280^2 = 78400 table entries, 10080 are solutions, a ratio of 9/70 or 0.1286 (slightly more than 1 in 8).

Box 1We find all solutions where box 1 is filled in with an arbitrary but specific permutation of the nine symbols, e.g.

1 2 3

4 5 6

7 8 9

Clearly there will be 9! isomorphic solutions that can be obtained from any such solution by renaming the symbols.

Boxes 2, 3, 4, 7Boxes 2 and 3 are subject to row constraints ~(123,456,789). Each box may contain 12,096 permutations falling into one of 252 column classes, each class with 48 permutations (each of the six members of each class has eight possible permutations of symbols within each column set). Similarly, boxes 4 and 7 are subject to column constraints ~(147,258,369) and produce 252 row classes. The column classes of boxes 2 and 3, and the row classes of boxes 4 and 7, specify all of the constraints on boxes 5, 6, 8, and 9. Any specific combination of these four classes will yield a specific number of Sudoku solutions. Sum that number over all combinations of the four classes, multiply the result by (3!)^4 = 1296 for the permutations of the sets in each of the four classes, and multiply by 9! for box 1, and one has the total number of Sudoku solutions.

The possible combinations of column classes of boxes 2 and 3 are dictated by the row triplets of those boxes and box 1. The 12,096 permutations correspond to 56 row triplets, each with 216 permutations. The 48 permutations in each of the 252 column classes fall into 8 row triplets, 6 in each such triplet. Conversely, the 216 permutations in each of the 56 row triplets fall into 36 column classes, again with 6 in each class. Thus each column class of box 2 corresponds to 8 distinct row triplets Ri, 0 <= i < 8, yeilding 8 row triplets ~((123,456,789) | Ri) for box 3, and corresponding to 8 * 36 = 288 column classes Cij, 0 <= j < 36, for box 3.

But box 3 has only 252 column classes, so there are some duplicates. There are three distinct cases depending on the box 2 column class. I call these the three "straight hands". For each straight hand, there are a specific number of possible corrersponding box 3 column classes. The straight hand of a box 2 column class C is determined by ZC((123,456,789), C), as follows:

- Code: Select all
` hand number of box 3 number of box 2`

example ZC column classes hands of this type

-----------------------------------------------------

4 5 6

7 8 9 0 204 36

1 2 3

4 5 8

1 7 9 2 175 162

2 3 6

4 5 8

1 7 9 3 163 54

2 6 3

So for each box 2 column class, I only need consider 204, 175, or 163 box 3 column classes (being careful to multiply by the number of duplicates). A table is computed that gives the box 3 column classes and duplicate counts corresponding to each of the 252 box 2 column classes. On average there are 176.6 box 3 column classes to consider for each box 2 column class.

An equivalent analysis applies to boxes 4 and 7. The straight hand of a box 4 row class R is determined by ZC(R, {147, 258, 369}), and the number of box 7 row classes for each hand is given in the above table. With proper choice of class codes, the same table used for boxes 2 and 3 can be used for boxes 4 and 7.

Symmetries reduce the number of combinations of box 2 and 3 column classes, and box 4 and 7 row classes, that need to be considered by almost a factor of 8. Swapping boxes 2 and 3, or 4 and 7, clearly has no effect on the number of solutions. Similarly, swapping boxes 2 and 3 with boxes 4 and 7 has no effect if certain pairs of symbols in those boxes are swapped, specifically 2-4, 3-7, and 6-8. This symbol swapping is handled in how the row and column classes are reduced to small integer codes, and so costs no computation time in the main loops. Care must be taken in counting solutions under these symmetries, however, since in some cases boxes have the same class and so swapping yields no additional solutions.

My program can handle these symmetries in one of four different variants, depending on minor alterations in how it is coded. Two of these variants yield a total of 249,184,342 combinations of the row and column classes for boxes 2. 3, 4, and 7. The other two yield a slightly larger number, 249,292,522. All of these variants yield the exact same number of total Sudoku solutions. These numbers compare to the 247,486,752 combinations that would result if the symmetries reduced the combinations by a full factor of 8, but in fact we get around a factor of 7.95.

Boxes 5, 6, 8, and 9For each of these quarter-billion combinations of row and column classes, I need to determine how many solutions exist in boxes 5, 6, 8, and 9 subject to the constraints of those classes (and each other). Consider box 5, which is constrained by boxes 2 and 4. There are five distinct cases, corresponding to what I call the five "cross hands". The cross hand corresponding to box 2 column class C2 and box 4 row class R4 is determined by ZC(R2, C4). For each cross hand i, 0 <= i < 5, there are a specific number Hi of possible solutions for box 5.

Suppose box 2 has column class {A, B, C}, where A, B, and C are sets of three symbols. Box 2 looks like this:

a b c

a b c

a b c

or any permutation of the columns, since we're considering column classes. The a's are elements of A, b's are elements of B, and c's are elements of C. Here are the five cross hands:

- Code: Select all
` <---- permutations ---->`

i box 4 ZC Hi Hi/8 cases rows row1 row2 row3 sets total

----------------------------------------------------------------

a b c

0 a b c 0 448 56 8236 1 6 6 6 1 216

a b c

a a b

1 a b c 2 400 50 36756 6 3 6 3 3 972

b c c

a a b

2 a c c 3 392 49 12204 6 3 3 3 2 324

b c b

a a b

3 a b b 4 384 48 6084 6 3 3 1 3 162

c c c

a a a

4 b b b 6 432 54 224 6 1 1 1 1 6

c c c ----- ----

63504 = 252^2 ((3!)^3) * 1680 = 9!

The numbers under "permutations" show that all 9! permutations of box 4 are accounted for. Details can be supplied by the reader.

For box 2 column class C2 and box 4 row class R4, of cross hand i, each of the Hi permutations of box 5 will have a column triplet C5 and a row triplet R5. Therefore box 8 must have column class C8 = S(~(C2 | C5)) and box 6 must have corresponding row class R6 = S(~(R4 | R5)). For each C2 and R4, there are Hi/8 column classes C8, each corresponding to 8 row classes R6. Symmetrically there are Hi/8 row classes R6, each corresponding to 8 column classes C8.

For hands 0, 1, and 2 there are a very small number of duplicates, so that there is one fewer distinct column class C8 and row cless R6. For example hand 0 gives 55 distinct column classes C8, with one of them having 16 corresponding row classes R6. This could be exploited to reduce slightly the number of inner loop iterations needed, but it would add a test to the loop, whixh would cost much more than the slight savings.

An equivalent analysis shows that for box 3 column class C3 and box 7 row class R7, of cross hand j, each of the Hj permutations of box 9 will have a column triplet C9 and a row triplet R9. Box 8 has row class R8 = S(~(R7 | R9)) and box 6 has corresponding column class C6 = S(~(C3 | C9)). These fall into Hj/8 sets of 8 as above.

There will be exactly one solution if ZC(R6, C6) = 0 and ZC(R8, C8) = 0. For each C2, C3, R4, and R7 we can check (Hi/8)*(Hj/8) (R8,C8) combinations for box 8. For each of the approximately 1 in 8 combinations where ZC(R8, C8) = 0, we can then count the 64 corresponding combinations of (R6,C6) where ZC(R6, C6) = 0. The counting uses the 280x280 bit table described above. The innermost loop is five machine instructions, hand-ordered to keep the P4 execution units reasonably busy. With tables organized for good data cache performance, the inner loop averages about 7.6 clock ticks, including all outer loop overhead. Total execution time is around 5.5 hours at 2GHz.

Here are the results for my two distinct symmetry variants:

- Code: Select all
`C2, C3, R4, R7 inner loop solutions solutions after`

combinations iterations counted symmetry bookkeeping

-------------------------------------------------------------------------

249,184,342 5,211,549,123,584 669,756,288,560 14,184,585,201,152

249,292,522 5,214,506,674,752 670,055,723,020 14,184,585,201,152

Finally, 14,184,585,201,152 * 9! * (3!)^4 = 6,670,903,752,021,072,936,960.