Su-Doku's maths

Everything about Sudoku that doesn't fit in one of the other sections

Re: large

Postby LarryLACa » Wed Nov 30, 2005 8:51 pm

cole07 wrote:how are you guys getting such large numbers?

I'm doing the math and it's not going to need a power...
....
Am I missing somthing


I don't understand your representation well enough to respond clearly.
There are 9! (9*8*...*1) options for filling in row 1 alone.

See theWikipedia - Mathematics of Sudoku article for an explanation of the methods developed here.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

how about this?

Postby Guest » Thu Dec 01, 2005 12:40 am

Hello all.

I admit reading only about a dozen of the posts in this thread so forgive me if I’m repeating an idea some1 else already posted.

I think the answer is simply 9!*8!*7!*6!*5!*4!*3!*2!

I look at it on a cell by cell basis.
There are 9 possible numbers for the first cell , 8 for the second etc. (9!)
There only 8 possible numbers for the first cell on the second line (the cell above it omits one possibility) , there are 7 for the second cell on the second line because the left cell omits one and the cell above it omits one). so its 8! for the second line , etc.
So this is the table of possibilities per cell in my opinion.

9 8 7 6 5 4 3 2 1

8 7 6 5 4 3 2 1 1

7 6 5 4 3 2 1 1 1

6 5 4 3 2 1 1 1 1

5 4 3 2 1 1 1 1 1

4 3 2 1 1 1 1 1 1

3 2 1 1 1 1 1 1 1

2 1 1 1 1 1 1 1 1

1 1 1 1 1 1 1 1 1


I think the box context is irrelevant if you calculate it cell by cell.
Would love to hear your remarks.

Macaroni.
Guest
 
Posts: 312
Joined: 25 November 2005

Re: how about this?

Postby gfroyle » Thu Dec 01, 2005 4:44 am

Anonymous wrote:I think the box context is irrelevant if you calculate it cell by cell.
Would love to hear your remarks.
Macaroni.


You are not counting Sudokus.

For example, suppose the first row has been completed, and you are considering the number of choices for the position labelled by *

Code: Select all
1 2 3  4 5 6  7 8 9
*


Your diagram shows 8 choices for that position. But there aren't 8 choices, because the * cannot be a 1, but NOR CAN IT BE 2 or 3 because of the box constraints. So there are 6 choices.

But that is only the start of the problems - for later boxes, the number of choices depends not only on the position, but also WHICH choices of numbers were made earlier.


Gordon
gfroyle
 
Posts: 214
Joined: 21 June 2005

Kjell/Ed method

Postby PatmaxDaddy » Fri Dec 02, 2005 4:33 am

Based on a brief description by Kjell a few weeks back, I've duplicated the fast method of Kjell and Ed for computing N(Bertram). This is a great method! Not only is it very fast, but the code is much simpler than the B12347 methods I tried. My running time is about 0.25 sec on a 2GHz Pentium (Microsoft C++ compiler), not as good as Kjell and Ed have reported, but I haven't worked too hard on tightening it up.

One thing I may be missing is the "416-band groups", references to which I keep seeing in this thread. I don't generate any such set of equivalence classes, I just work with the so-called gang of 44. Perhaps I need to understand these 416 classes in order to get below 100 ms as Kjell and Ed have done?

I might have time at some point to write a more detailed description of this method (or at least the version I've coded) than appears on Wikipedia. Would there be any interest in this? Is someone else working on it?
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby PatmaxDaddy » Sat Dec 03, 2005 5:41 pm

With a few simple improvements my running time is now around 125 ms, and I can see that it could be improved quite a bit further, so I'm confident I must be pretty close to duplicating the method of Kjell and Guenter. Thanks for the tips everyone!
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby dukuso » Sat Dec 03, 2005 6:30 pm

Ed's program is here
http://magictour.free.fr/ed.c
so you can compare. I haven't very much examined it,
if yours is clearer or different I'd like to see it
dukuso
 
Posts: 479
Joined: 25 June 2005

Final version

Postby PatmaxDaddy » Mon Dec 05, 2005 5:19 am

A few people have asked to see my version of the Kjell/Guenter method, so I'm going to spend a few days writing comments and then figure out how to make it available. It is currently running in about 55 ms (total for everything, 2Ghz Pentium), and I think that's all the fast it's going to get. It's all C++, no assembly; I'm pretty impressed with what the Microsoft compiler has done with the inner loops. I'll also take a look at Ed's version and see if I can spot any obvious differences.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby bcr666 » Mon Dec 05, 2005 7:56 pm

I've tried working with a 4x4 grid. I wrote a program that calculates all possible values for any single row. The program then tries putting each of these rows onto the grid, then checks the grid for validity (all numbers in a column are unique and all numbers in a box are unique). It turns out that there are 288 (4! * 12) possible grids in a 4x4 that fit the rules for a SuDoku. I hand wrote out all possible combinations for the grid if the first row is 1234. It turns out that there are 12 (in retrospect well Duh!, if there are 4! possible first rows...). I was also looking into the number of unique two digit pairs that can make up any row/block intersection (eg. 12,13,14,21,23,24...) and it turns out that there are 12 (4! / 2! where 4 is the number of different digits and 2 square root of 4). So the equation is (n!)^2 / (n - SQRT(n))!

If we use this equation on a 9x9 grid...
(9!)^2 / (9 - SQRT(9))! =
362880^2 / (9-3)! =
131681894400 / 6! =
131681894400 / 720 =
182891520

I modified the program mentioned earlier to do a 9x9 grid to confirm, but it is going to take a while as there are 9 nested loops counting from 1 to 362880.
bcr666
 
Posts: 2
Joined: 05 December 2005

Postby Red Ed » Mon Dec 05, 2005 10:00 pm

From on high, bcr666 wrote:Read the whole post. I was working with a 4x4 grid.
Ah, so you were. Forgive my lack of careful study. I have consigned my post to the great trash can in the sky.

Since I still seek enlightenment, please tell me: what are you trying to calculate with this method? It can't be the number of 9x9 sudokus, as you'd have noticed by now that there are 6670903752021072936960 of them, which is rather larger than your 182891520 ...
Last edited by Red Ed on Tue Dec 06, 2005 4:23 pm, edited 1 time in total.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby bcr666 » Tue Dec 06, 2005 3:22 am

Red Ed wrote:
bcr666 wrote:I was also looking into the number of unique two digit pairs that can make up any row/block intersection (eg. 12,13,14,21,23,24...)
Eh? A row/block intersection consists of 3 adjacent cells (sometimes called a mini-row or boxrow). Where do "unique two digit pairs" come into it?


Read the whole post. I was working with a 4x4 grid.
bcr666
 
Posts: 2
Joined: 05 December 2005

Postby PatmaxDaddy » Tue Dec 06, 2005 4:58 am

I've finally had a chance to read carefully the descriptions by Geunter, KJell, and Ed in this thread and on Wikipedia of the Guenter/Kjell method, and I've also taken a brief look and Ed's code. The version I've coded is definitely equivalent to that described by Guenter on July 7. I can't quite tell if it is also equivalent to the specialization of Guenter described by Kjell on Oct 5. I do generate his list of 5 choices for S4,S5,S6, but I'm not sure if I use them for the same purpose and in quite the same way.

My implementation differs from Ed's in various ways, some more and some less significant. I do not use the 0..83 coding for minicolumns at all, and have no equivalent for his colBits and colcompat arrays. In fact I don't represent minicolumns at all as distinct quantities, only boxes. Each box is represented as either a 27-bit triplet of three 9-bit minicolumns, or as a 0..279 code. I convert from code to triplet by lookup, and from triplet to code by a binary search of a sorted list of triplets.

My 56x56 inner loop is almost identical to Ed's, except that I don't use the nsoln array, I put those values directly in my equivalent of his cmatrix. This avoids a level of indirection. Also, my k is a double instead of an int (by coincidence I use the same variable name), and my equivalent of cmatrix (which holds solution counts and not class id's) is a float instead of an int. Doing these things has a huge effect on speed, because for most modern processors one of the most expensive things one can do is fetch data from memory (even if it hits the data cache).

Avoiding an extra level of indirection is a clear advantage, but the use of floats is subtle and is primarily a issue on Intel architecture (IA) machines. The problem is that IA has a severly limited number of general registers (effectively 6 for complied code). Modern compilers are excellent at using them efficiently, but once a loop exceeds 6 the data it needs spills over to the stack, and if the data is read/write (e.g. k) it's going to cost two memory operations per iteration. The transition from needing 6 to needing 7 items of data is very costly. The CPU, however, has floating point registers that are separate from the general registers, and if k is a float, and the data being fetched from cmatrix are floats, everything the loop needs now fits. Furthermore, the floating point unit runs in parallel with the integer unit. Note that the cost of a floating point multiply is way less than the cost of spilling data onto the stack. (For old guys like me who grew up programming PDP/1's and PDP/8's, this is hard to get used to).

Another difference is that my version is written in C++, and so all the main data structures are wrapped in classes instead of being static arrays. This keeps the code that generates the data proximate to the declaration of the data, and provides a class interface that clarifies how the data are used. Some will find the result easier to understand (assuming I've done a reasonable job writing the code), while others prefer just straight C. This is perhaps a matter of taste, and I certainly do not wish to start a C/C++ debate on this thread.

As I said in an earlier post, I'm going to make the code available somehow as soon as I get a chance to clean it up a little and write some comments.

For the record, here is my current timing data:

Code: Select all
Box cross      2.8    (a table needed to compute the solutions for each gangster)
Permutations   5.7    (permutation tables needed to compute the gangsters)
Band 0         0.5    (a table needed to work with band 1)
Gangsters      7.0    (compute the 44 gangsters)
Gang lists     0.4    (compute lists of compatible boxes for bands 2,3)
Gang counts   10.1    (compute number of solutions for each gangster)
Band counts    4.2    (create lookup table of gang counts)
Grid count    24.0    (count all of the grids)
Overhead       0.1
Total         54.7


Times are in ms using the clock-tick resolution Pentium hardware timer. Note that Windows XP is running in the background, and who knows what effect that has.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Postby Red Ed » Tue Dec 06, 2005 10:04 pm

PatmaxDaddy wrote:For the record, here is my current timing data:
...
Total 54.7
That's an amazing piece of work and a great example of what a bit of maths + computer science can achieve. I can't see anyone beating the timings you posted. I'm glad I quit whilst I was (briefly!) ahead:)
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby dukuso » Wed Dec 07, 2005 2:49 am

Red Ed wrote:
PatmaxDaddy wrote:For the record, here is my current timing data:
...
Total 54.7
That's an amazing piece of work and a great example of what a bit of maths + computer science can achieve. I can't see anyone beating the timings you posted. I'm glad I quit whilst I was (briefly!) ahead :)



yours is in C, so I think I still prefer your version.
Enhanced by Patmax' comments and descriptions...
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby gfroyle » Wed Dec 07, 2005 9:59 am

OK, happy campers...

What happens to the numbers when we consider Sudoku X - the version where the main diagonal and anti-diagonal must also contain the digits once each.

The number must drop of course, because there are more restrictions... but this little change also messes up the nice large automorphism group, and so the value may well be harder to calculate...

Comments? Answers??


Cheers

Gordon
gfroyle
 
Posts: 214
Joined: 21 June 2005

Postby dukuso » Wed Dec 07, 2005 1:04 pm

that's not really interesting for a mathematician ?!
The diagonals lack of math.-interpretation, they are just
of aesthetical value for some people.
Same with magic squares.

OK.

A quick estimate gives a chance of 1:100000 that a random
sudoku is X, so there could be 1e17 xudokus.

Start with the middle band, there are 9.5e11 ways, but how many
classes ?

We have our 416 classes of sudoku bands but we must
forbid row,column permutations, so multiply by 6^5
divide by 4 for rectangle-symmetries.
Gives about a million. Each of these has about 1e5 completions.
Say we can count 1e5 in a second, that gives 2 weeks to
count them.

There are probably better ways...
dukuso
 
Posts: 479
Joined: 25 June 2005

PreviousNext

Return to General