Su-Doku's maths

Everything about Sudoku that doesn't fit in one of the other sections

Collating our results

Postby Red Ed » Sat Oct 15, 2005 10:17 pm

I thought it might be a good idea if we tried to collect all our "headline" results, especially the various grid-counts, into a single web page. I just mean the key facts, not details of the methods used.

I took the liberty of setting up something on Wikipedia. Any thoughts / contributions ...?
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby dukuso » Sun Oct 16, 2005 7:32 am

hi Ed,

A summary with the results is a good idea.
I doubt that Wikipedia is the right place, though.
They have a complicated and lengthy "copyleft", which I don't
yet understand.
I mean, you can't require potential readers to walk through all that
stuff before they are allowed to read/copy/quote a Wikipedia article.

Can't we put it on some webpage (yours,Frazer's,..)without html,
just text, and everybody can edit it ?
But before the old text is replaced, someone with "moderator"
status must approve the new text, else we might get too much
nonsense.
You can still put a link to that page from Wikipedia.

Maybe we can include Frazer's or Wikipedia pages and (edited,shortened)
articles from the forum directly. Or are there problems
with copyright/left ? Then we'd need some newly compiled
articles anyway, and maybe should stop posting here but
create a new forum/group/mailing list.


-Guenter.
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Red Ed » Sun Oct 16, 2005 1:34 pm

You might be right about Wikipedia's copyleft policy being a pain. It seems stupid to put copyleft on Wikipedia content that does nothing more than reflect this (or possibly another) forum.

That said, it is a convenient place to collect all our results together in a way that we can all contribute to. So I think for the time being I'll persevere and keep adding stuff to the new Wiki page -- suggestions/contributions would be welcome.

If we decide later on that the material should go somewhere else then at least it'll be easy to see what needs to be copied across, even if the wording should be tweaked to avoid copyleft issues (which, like you, I still don't fully understand ...).

PS: I've confirmed kjellfp's figure for the number of 2x3 sudokus, and I think I've calculated that the number of 2x4 sudokus is 8! * 722631131136 = 2.91e16 (but this needs checking).
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby kjellfp » Mon Oct 17, 2005 4:11 pm

Red Ed wrote:...and I think I've calculated that the number of 2x4 sudokus is 8! * 722631131136 = 2.91e16 (but this needs checking).


Confirmed. For a prime factorization, you have

722631131136 = 2^18 * 3^3 * 23^2 * 193
kjellfp
 
Posts: 140
Joined: 04 October 2005

Postby dukuso » Tue Oct 18, 2005 4:46 am

referring to thread:

http://www.setbb.com/phpbb/viewtopic.php?t=194&mforum=sudoku


walking throught the 11555445688 sudokus from the 306993 G-classes,
which by design should intersect each S-class at least once, I found indeed
23919 different S-classes which are T-equivalent to their transpose.
The list is here:
http://magictour.free.fr/t-invar.gz
I found 26641 S-classes where the 3*9-horizontal bands
are pairwise isomorphic to the vertical 9*3-stacks.

edited 17.Oct. My earlier calculation of 819 t-invariant classes
was wrong.


this indirectly confirms Frazer's and RedEd's numbers of
S-classes, T-classes and T-invariants.

-Guenter.
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Red Ed » Tue Oct 18, 2005 6:54 pm

Great, it's nice to see that confirmed (and the 2x4 sudokus calculation too).

In return, I was able to verify your 31672366418991513600 for the number of 3x4 bands. It'll be a bit more work to verify kjellfp's 4x3 bands number (he'll probably solve the whole 4x3 sudoku problem before I verify the nr bands!).

EDIT: okay, done it now.
Last edited by Red Ed on Thu Oct 20, 2005 2:12 pm, edited 1 time in total.
Red Ed
 
Posts: 633
Joined: 06 June 2005

yet another independent confirmation

Postby PatmaxDaddy » Tue Oct 18, 2005 10:42 pm

Probably no one cares at this point, but I have yet another independent confirmation of 6,670,903,752,021,072,936,960 as the number of solutions. I wasn't aware of any of this work, but when I got my result I typed the number into Google to see if it was known and found this thread. I've skimmed the thread and haven't had time to understand the discussions in detail, but I suspect that my method probably adds nothing to what is already known. My program takes 5.5 hours on a 2GHz Pentium (with careful assembly coding of the inner loops), and so is clearly not as clever as what is now known. But I figure a few more independent confirmations couldn't hurt, so I'll add my voice. Your number is correct.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby kjellfp » Wed Oct 19, 2005 8:07 am

Thanks for the 4x3 band number confirmation.

My PC has counted the 5x2 sudokus. Unfortunately something went wrong when writing the output, som I am rerunning now.

Counting the 4x2-sudokus took about a second. Counting the 5x2-sudokus took ~10 hours.

6x2, anybody?:)

I also believe that 4x3 is harder than 5x2, just like 3x3 was harder than 4x2. But I am going to try 4x3 when I have time. The ideas are there, but it takes time to write the code when you have a job and small children...

I'll try to post the 5x2-result later today.
kjellfp
 
Posts: 140
Joined: 04 October 2005

Re: yet another independent confirmation

Postby coloin » Wed Oct 19, 2005 9:20 am

PatmaxDaddy wrote:Probably no one cares at this point............ and so is clearly not as clever as what is now known. But I figure a few more independent confirmations couldn't hurt, so I'll add my voice. Your number is correct.


I would be interested in how you did it........a quick outline please.
coloin
 
Posts: 1637
Joined: 05 May 2005

Re: yet another independent confirmation

Postby PatmaxDaddy » Wed Oct 19, 2005 11:21 am

coloin wrote:I would be interested in how you did it........a quick outline please.


I would be delighted to write an outline, give me a few days to get it together. I'm only doing this in my spare time, which is spare indeed.

On the issue of the minimum number of clues necessary, it strikes me that one needs 72.5 bits of information to specify a unique solution given the 6,670,903,752,021,072,936,960 choices. There are many fewer essentially distinct solutions, of course, but to fill in the grid one has to specify which specific member of each equivalence class, so one still needs all 72.5 bits. The very first clue one chooses clearly provides 3.17 bits, but subsequent clues usually provide fewer, since as one adds clues there are often no longer 9 possible choices for a blank cell. So it's going to be very hard to make a puzzle that approaches ceil(72.5/3.16) = 23 clues. However, a clue can provide more than 3.17 bits depending on what number is closen for the cell, because the number of solutions corresponding to each possible choice for the clue are unequal. So some choices provide more bits than others. But to get below 23 clues one would need to average more than 3.17 bits/clue, which seems like it would be quite a challange.

That said, I vaguely remember reading in this very long thread about some puzzles that used only 17 or 18 clues, so either my analysis is wrong, or someone got very clever.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby kjellfp » Thu Oct 20, 2005 4:35 am

I have done the 5x2-sudoku grids counting.

The answer is 1903816047972624930994913280000, or 1.9e30

Factorization:
10! * 4! * 5! * 2^18 * 3 * 5 * 131 * 35603 * 9932999

The 5*10-bands were grouped into 355 equivalence classes (just as the 44 classes for 3x3-sudoku).
kjellfp
 
Posts: 140
Joined: 04 October 2005

Postby student » Sat Oct 22, 2005 3:18 pm

hello to all! as my username indicates, I am indeed a high-school student, and I am currently attempting to write a 10 page paper for math class on sudoku (the assignment was to write about anything that interested you that also had to do with mathematics). Naturally, in my research on the internet, I came across the fact that Bertram was able to derive the correct number of potential sudoku grids.

Bertram wrote:The result of the computation is

6,670,903,752,021,072,936,960 = 2^15 * 3^8 * 5*7 * 27,704,267,971 = 9! * 72^2 * 2^7 * 27,704,267,971


My problem is, that being young and relatively uneducated, I can't even begin to figure out what any of those numbers represent. If anyone could explain the above expression in the lowest of layman's terms, I would really appreciate it.

Thanks
student
 
Posts: 1
Joined: 22 October 2005

Re: Collating our results - Wikipedia

Postby LarryLACa » Sun Oct 23, 2005 7:41 pm

Red Ed wrote:I thought it might be a good idea if we tried to collect all our "headline" results, especially the various grid-counts, into a single web page. I just mean the key facts, not details of the methods used.

I took the liberty of setting up something on Wikipedia. Any thoughts / contributions ...?


I had been advocating on the main Wikipedia Sudoku talk page Talk:Sudoku for the addition of a Sudoku Math's page. But I'm relatively new to contributing to Wikipedia and had been collecting material before launching the page. Thanks Ed for being bold in the Wiki spirit and getting it done.

dukuso wrote:I doubt that Wikipedia is the right place, though.
They have a complicated and lengthy "copyleft", which I don't yet understand.
I mean, you can't require potential readers to walk through all that
stuff before they are allowed to read/copy/quote a Wikipedia article.


I think the Pros outweigh the Cons for using Wikipedia to summarize results. Wikipedia is a very open collaboration mechanism. Anyone can edit or contribute, whether or not they've read through the "copyleft" policy statement (which is only the tip of the Wikipedia policy maze).
Wikipedia provides an established service (as in immediately available) and a well known web focus. The latter being salient to fostering exposure and synergy.

Red Ed wrote:If we decide later that the material should go somewhere else then at least it'll be easy to see what needs to be copied across, even if the wording should be tweaked to avoid copyleft issues (which, like you, I still don't fully understand ...).


Obviously anyone wishing to privatize their intellectual property should review the "copyleft" implications. But I don't think those folks are contributing here. Relocating later shouldn't be a major issue either, as Ed notes.

Frazer/Bertram: Since you've already invested effort in academic publishing on Suduko, if you have any particular concerns about using Wikipedia, let us know.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

Re: yet another independent confirmation

Postby PatmaxDaddy » Mon Oct 24, 2005 3:54 am

coloin wrote:
PatmaxDaddy wrote:Probably no one cares at this point............ and so is clearly not as clever as what is now known. But I figure a few more independent confirmations couldn't hurt, so I'll add my voice. Your number is correct.


I would be interested in how you did it........a quick outline please.


Introduction and Disclaimer
The following is a description of my solution to the Sudoku "total nuber of solutions" problem. This problem was solved by several others about five months ago, and so credit for the solution belongs to them. I offer this description only to provide independent confirmation of the result, as I derived and programmed it without any knowledge of their work. I have not in any way studied the previous work, so I do not know to what extent my solution duplicates it, but I suspect that my work adds little beyond independent confirmation. It is also clear from a cursory examination of the record that far more elegant solutions are now known.

Box Algebra
The conventional symbols 1 - 9 are used, and the nine boxes are numbered conventionally:
1 2 3
4 5 6
7 8 9

The contents of a box is represented by a "row triplet" or a "column triplet". Either is an ordered triplet of sets of symbols. For example, a box containing
1 2 3
4 5 6
7 8 9

is represented by row triplet ({1,2,3},{4,5,6},{7,8,9}) or column triplet ({1,4,7},{2,5,8},{3,6,9}). As a shorthand I write these as (123,456,789) and (147,258,369). The purpose of a triplet is to capture the ways in which a box can constrain, or be constrained by, boxes horizontally (row triplet) or vertically (column triplet). Thus a triplet expresses constraints on box contents, not the exact contents. I may write that a box "has row triplet R" or "is constrained by row triplet R" or "is subject to row constraints R" as seems appropriate, but the meaning is the same.

The sets of a triplet may have any number of the nine symbols. For example if box 1 has row triplet (123,456,789), box 2 would be subject to row constraints (456789, 123789, 123456).

If R1 and R2 are row triplets, the operations ~R1, R1 & R2, R1 | R2 produce new row triplets whose component sets are the complement (~), intersection (&), or union (|) of the corresponding sets of R1 and R2. For example if box 1 has row triplet R1 = (123,456,789), and box 2 has row triplet R2 = (457, 189, 236), then box 3 is constrained by ~(R1 | R2) = {689, 237, 145). These operations are defined identically for column triplets, but row and column triplets cannot meaningfully be mixed.

A triplet (or constraint) where the sets have three symbols each and are mutually exclusive is called "complete". Clearly any legal box contents will correspond to complete triplets.

In many cases the six triplets corresponding to the permutations of three sets of symbols are equivalent, i.e. for a row (column) triplet the order of the rows (columns) doesn't matter. In these cases a particular one of the six triplets, called a "row class" or "column class", is chosen to represent any of the six, which are called members of the class. For row triplet R or column triplet C, the operation S(R) and S(C) produces the corresponding row class and column class. Since a class is a triplet, all triplet operations apply.

In the program a triplet is represented by a 27-bit integer, 3 fields of 9 bits, each bit signifying membership of a symbol in a set. The operation S(T) on triplet T sorts the fields in numerical order as 9-bit integers, producing a class. The operations ~, &, and | are implemented using bitwise NOT, AND, and OR.

The operation "zero cross", written ZC(R, C), is defined on a row triplet R and column triplet C as the number of disjoint pairs of sets over all nine possible pairs that can be made from R and C. For example, ZC({123,456,789}, {124,357,689}) = 2. Note that ZC(R, C) = ZC(C, R), and ZC(R, C) = ZC(S(R), S(C)).

Any box can contain 280 row or column classes, each corresponding to (3!)^4 = 1296 distinct permutations of its contents (280 * 1296 = 9!). Any box subject to complete row constraints R and complete column constraints C will have exactly one solution if ZC(R, C) = 0, and no solutions otherwise. Thus a 280x280 bit table can be constructed that tells whether a solution exists if some box is subject to certain complete row and column constraints. While a byte table would require fewer instructions in the inner loops, I consider data cache performance to be more critical and so a bit table is used. Of the 280^2 = 78400 table entries, 10080 are solutions, a ratio of 9/70 or 0.1286 (slightly more than 1 in 8).

Box 1
We find all solutions where box 1 is filled in with an arbitrary but specific permutation of the nine symbols, e.g.
1 2 3
4 5 6
7 8 9

Clearly there will be 9! isomorphic solutions that can be obtained from any such solution by renaming the symbols.

Boxes 2, 3, 4, 7
Boxes 2 and 3 are subject to row constraints ~(123,456,789). Each box may contain 12,096 permutations falling into one of 252 column classes, each class with 48 permutations (each of the six members of each class has eight possible permutations of symbols within each column set). Similarly, boxes 4 and 7 are subject to column constraints ~(147,258,369) and produce 252 row classes. The column classes of boxes 2 and 3, and the row classes of boxes 4 and 7, specify all of the constraints on boxes 5, 6, 8, and 9. Any specific combination of these four classes will yield a specific number of Sudoku solutions. Sum that number over all combinations of the four classes, multiply the result by (3!)^4 = 1296 for the permutations of the sets in each of the four classes, and multiply by 9! for box 1, and one has the total number of Sudoku solutions.

The possible combinations of column classes of boxes 2 and 3 are dictated by the row triplets of those boxes and box 1. The 12,096 permutations correspond to 56 row triplets, each with 216 permutations. The 48 permutations in each of the 252 column classes fall into 8 row triplets, 6 in each such triplet. Conversely, the 216 permutations in each of the 56 row triplets fall into 36 column classes, again with 6 in each class. Thus each column class of box 2 corresponds to 8 distinct row triplets Ri, 0 <= i < 8, yeilding 8 row triplets ~((123,456,789) | Ri) for box 3, and corresponding to 8 * 36 = 288 column classes Cij, 0 <= j < 36, for box 3.

But box 3 has only 252 column classes, so there are some duplicates. There are three distinct cases depending on the box 2 column class. I call these the three "straight hands". For each straight hand, there are a specific number of possible corrersponding box 3 column classes. The straight hand of a box 2 column class C is determined by ZC((123,456,789), C), as follows:

Code: Select all
 hand          number of box 3   number of box 2
example   ZC   column classes   hands of this type
-----------------------------------------------------
 4 5 6
 7 8 9     0         204                36
 1 2 3

 4 5 8
 1 7 9     2         175               162
 2 3 6

 4 5 8
 1 7 9     3         163                54
 2 6 3

So for each box 2 column class, I only need consider 204, 175, or 163 box 3 column classes (being careful to multiply by the number of duplicates). A table is computed that gives the box 3 column classes and duplicate counts corresponding to each of the 252 box 2 column classes. On average there are 176.6 box 3 column classes to consider for each box 2 column class.

An equivalent analysis applies to boxes 4 and 7. The straight hand of a box 4 row class R is determined by ZC(R, {147, 258, 369}), and the number of box 7 row classes for each hand is given in the above table. With proper choice of class codes, the same table used for boxes 2 and 3 can be used for boxes 4 and 7.

Symmetries reduce the number of combinations of box 2 and 3 column classes, and box 4 and 7 row classes, that need to be considered by almost a factor of 8. Swapping boxes 2 and 3, or 4 and 7, clearly has no effect on the number of solutions. Similarly, swapping boxes 2 and 3 with boxes 4 and 7 has no effect if certain pairs of symbols in those boxes are swapped, specifically 2-4, 3-7, and 6-8. This symbol swapping is handled in how the row and column classes are reduced to small integer codes, and so costs no computation time in the main loops. Care must be taken in counting solutions under these symmetries, however, since in some cases boxes have the same class and so swapping yields no additional solutions.

My program can handle these symmetries in one of four different variants, depending on minor alterations in how it is coded. Two of these variants yield a total of 249,184,342 combinations of the row and column classes for boxes 2. 3, 4, and 7. The other two yield a slightly larger number, 249,292,522. All of these variants yield the exact same number of total Sudoku solutions. These numbers compare to the 247,486,752 combinations that would result if the symmetries reduced the combinations by a full factor of 8, but in fact we get around a factor of 7.95.

Boxes 5, 6, 8, and 9
For each of these quarter-billion combinations of row and column classes, I need to determine how many solutions exist in boxes 5, 6, 8, and 9 subject to the constraints of those classes (and each other). Consider box 5, which is constrained by boxes 2 and 4. There are five distinct cases, corresponding to what I call the five "cross hands". The cross hand corresponding to box 2 column class C2 and box 4 row class R4 is determined by ZC(R2, C4). For each cross hand i, 0 <= i < 5, there are a specific number Hi of possible solutions for box 5.

Suppose box 2 has column class {A, B, C}, where A, B, and C are sets of three symbols. Box 2 looks like this:
a b c
a b c
a b c
or any permutation of the columns, since we're considering column classes. The a's are elements of A, b's are elements of B, and c's are elements of C. Here are the five cross hands:

Code: Select all
                                   <---- permutations ---->
i  box 4   ZC    Hi   Hi/8  cases  rows row1 row2 row3 sets total
----------------------------------------------------------------
   a b c
0  a b c    0    448   56    8236    1    6    6    6    1   216
   a b c

   a a b
1  a b c    2    400   50   36756    6    3    6    3    3   972
   b c c

   a a b
2  a c c    3    392   49   12204    6    3    3    3    2   324
   b c b

   a a b
3  a b b    4    384   48    6084    6    3    3    1    3   162
   c c c

   a a a
4  b b b    6    432   54     224    6    1    1    1    1     6
   c c c                    -----                           ----
                            63504 = 252^2        ((3!)^3) * 1680 = 9!

The numbers under "permutations" show that all 9! permutations of box 4 are accounted for. Details can be supplied by the reader.

For box 2 column class C2 and box 4 row class R4, of cross hand i, each of the Hi permutations of box 5 will have a column triplet C5 and a row triplet R5. Therefore box 8 must have column class C8 = S(~(C2 | C5)) and box 6 must have corresponding row class R6 = S(~(R4 | R5)). For each C2 and R4, there are Hi/8 column classes C8, each corresponding to 8 row classes R6. Symmetrically there are Hi/8 row classes R6, each corresponding to 8 column classes C8.

For hands 0, 1, and 2 there are a very small number of duplicates, so that there is one fewer distinct column class C8 and row cless R6. For example hand 0 gives 55 distinct column classes C8, with one of them having 16 corresponding row classes R6. This could be exploited to reduce slightly the number of inner loop iterations needed, but it would add a test to the loop, whixh would cost much more than the slight savings.

An equivalent analysis shows that for box 3 column class C3 and box 7 row class R7, of cross hand j, each of the Hj permutations of box 9 will have a column triplet C9 and a row triplet R9. Box 8 has row class R8 = S(~(R7 | R9)) and box 6 has corresponding column class C6 = S(~(C3 | C9)). These fall into Hj/8 sets of 8 as above.

There will be exactly one solution if ZC(R6, C6) = 0 and ZC(R8, C8) = 0. For each C2, C3, R4, and R7 we can check (Hi/8)*(Hj/8) (R8,C8) combinations for box 8. For each of the approximately 1 in 8 combinations where ZC(R8, C8) = 0, we can then count the 64 corresponding combinations of (R6,C6) where ZC(R6, C6) = 0. The counting uses the 280x280 bit table described above. The innermost loop is five machine instructions, hand-ordered to keep the P4 execution units reasonably busy. With tables organized for good data cache performance, the inner loop averages about 7.6 clock ticks, including all outer loop overhead. Total execution time is around 5.5 hours at 2GHz.

Here are the results for my two distinct symmetry variants:

Code: Select all
C2, C3, R4, R7     inner loop          solutions       solutions after
 combinations      iterations           counted      symmetry bookkeeping
-------------------------------------------------------------------------
  249,184,342   5,211,549,123,584   669,756,288,560   14,184,585,201,152
  249,292,522   5,214,506,674,752   670,055,723,020   14,184,585,201,152

Finally, 14,184,585,201,152 * 9! * (3!)^4 = 6,670,903,752,021,072,936,960.
Last edited by PatmaxDaddy on Tue Oct 25, 2005 6:25 pm, edited 1 time in total.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby dukuso » Mon Oct 24, 2005 6:57 am

hmm, seems that you did split the board into B1B2B3B4B7 and the rest.
That reminds me...
It was Colin's idea, he considered this partition useful and urged me
to do the calculation. Well, it was interesting to find the 2802 classes,
but the partition into bands seemed more useful though.

I haven't yet read and analyzed your whole message,
but I couldn't find that 2802 in it ...

I don't remember whether we had used that calculation
to confirm the 6.67e21 number.
It should be in the archive here, but not searchable since this
thread is so long :-(

-Guenter
dukuso
 
Posts: 479
Joined: 25 June 2005

PreviousNext

Return to General