How many minimal sudokus has an average grid ?

Everything about Sudoku that doesn't fit in one of the other sections

Postby ravel » Tue Sep 26, 2006 4:05 pm

gsf wrote:what was the full command line where -m1 misbehaved?
./sudoku -m -f'%G%,%v' in.dat > out.dat
My version is 2006-02-14 (linux 386). It reports unknown option 1.
by column offset you mean a one puzzle per line file where the puzzle
is in the Nth space separated column? -cN

Thanks, from the first run with
./sudoku -u -d -qFNB file
i get output lines like
Code: Select all
3/26/55/0015/0 FGCA5IBHD5I4782AF3ABHD6CE974FBCG1IEHC85F9DG1BIAGE2...
so -c15 (or in the magic string) should work in future.
[edit: -c2 works (why ?)]
Last edited by ravel on Tue Sep 26, 2006 12:40 pm, edited 1 time in total.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby gsf » Tue Sep 26, 2006 4:37 pm

ravel wrote:
gsf wrote:what was the full command line where -m1 misbehaved?
./sudoku -m -f'%G%,%v' in.dat > out.dat
My version is 2006-02-14 (linux 386). It reports unknown option 1.

download the latest, it should accept -mN
Thanks, from the first run with
./sudoku -u -d -qFNB file
i get output lines like
Code: Select all
3/26/55/0015/0 FGCA5IBHD5I4782AF3ABHD6CE974FBCG1IEHC85F9DG1BIAGE2...
so -c15 (or in the magic string) should work in future.

the default output changed to list the non-clues as .
also, the -cN count is fields, not character columns
and the defauklt field separator is one or more space chars, so you would want -c2

or you can format the output to just have the info needed
this will list list the puzzle with non-clues as .
Code: Select all
-f%v
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Postby ravel » Tue Sep 26, 2006 4:49 pm

gsf wrote:download the latest, it should accept -mN
Just did it, lets see. [edit: works fine now]
.. so you would want -c2
Thanks, yes, i got that now.
or you can format the output to just have the info needed
this will list list the puzzle with non-clues as .
Code: Select all
-f%v
I tried that, but then (though -u) for the multisolution puzzles a solution was outputted instead of "multiple ..."
[edit:]oops, it does it with the new version also with the above options.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby coloin » Thu Nov 30, 2006 12:55 am

To reopen an old wound, I think ravel mentioned this method earlier. Help to verify my figures and estimations would be appreciated !

If there really are 10^16 puzzles - A "real" puzzle distribution might look like this !
Code: Select all
18    0
19    40
20    2,000
21    500,000
22  ? 10,000,000,000
23  ? 500,000,000,000,000
24  ? 3,000,000,000,000,000
25  ? 3,000,000,000,000,000
26  ? 3,000,000,000,000,000
27  ? 500,000,000,000,000
28  ? 50,000,000,000
29  ? 1,000,000,000
30  ? 100,000,000
31  ? 10,000,000
32  ? 1,000,000
33  ? 10,000
34  ? 1000
35    4
36    0

Lets see if I can estimate the number of 25 clue puzzles in a grid

How many ways are there to put 25 clues at random into a single valid grid
Code: Select all
81! / 56!*25! =  525652003943603702568 = 5.25 * 10^20

how many will be invalid ?

Only a very small proportion will have absent clues in two rows in the same band, and only a small proportion will fail to have at least 8 clue numbers [less than 1% total]

I estimated how often 25 clues will [minimally] solve a given grid by making a random mask and solving it over a large number of "random" grids

Code: Select all
Out of 800,000 puzzles
mask    minimal puzzles
1          18
2           7
3           5

average 10 puzzles,   therefore  = 1 in 80000 

Out of 4,900,000 puzzles
mask    minimal puzzles
4          23

therefore  = 1 in 200,000 


This small study may give an average of 1 in 100000 randomly picked masks of 25 clues will be minimal and valid.

I appreciate that this does not tally with ravels work - and that the effect that is shown here is that some masks are much better than others....

Code: Select all
                                           [5.2 * 10^20] 
Estim number of 25 puzzles in a grid =  ---------------------------------  =     5.1*10^15   
                                                   100000 [average]


only 3/10s of puzzles have 25 clues......
Code: Select all
Estim number of  puzzles in a grid    =     5.2*10^15 *  10/3  =  1.7*10^16


C
Last edited by coloin on Fri Jul 13, 2007 7:54 pm, edited 1 time in total.
coloin
 
Posts: 2515
Joined: 05 May 2005
Location: Devon

Postby Red Ed » Thu Nov 30, 2006 7:32 pm

coloin wrote:If there really are 10^16 puzzles ...
I'm sure you meant to say minimal puzzles there. I wouldn't normally nit-pick (mmm... well...) but it's important to distinguish between the hard problem of estimating the number of minimal puzzles and the easy problem of estimating the number of puzzles in total. For the latter, we have number of puzzles = N * 2^(L^2) * P, where N is the number of solution grids (6.67e21), L is the side length (9) and P is the unknown probability that a random subset of clues in a random grid will constitute a valid puzzle. We can get a rough estimate of P very easily by simulation: it's ~1/2.74 for plain ordinary 3x3 square sudoku. So that gives us 6.67e21 * 2^81 / 2.74 ~= 5.9e45 puzzles. The same trick doesn't work for counting minimal puzzles, though, because the 'P' in that case is too small.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby JPF » Fri Dec 01, 2006 3:33 pm

Red Ed wrote:We can get a rough estimate of P very easily by simulation: it's ~1/2.74 for plain ordinary 3x3 square sudoku. So that gives us 6.67e21 * 2^81 / 2.74 ~= 5.9e45 puzzles.

Interesting point.
With 100000 simulations I got E(P) = 0.379 ~ 1/2.64

I don’t know exactly the minimum number of simulations to do in order to have a given precision on P. (the standard deviation should be calculated, etc...)

Have you already made this calculation for one given grid (SF, SFB, dukuoso-x, etc...) to see if there are grids with more puzzles (minimal or not) than others ?

JPF
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

Postby ravel » Fri Dec 01, 2006 4:05 pm

JPF wrote:Have you already made this calculation for one given grid (SF, SFB, dukuoso-x, etc...) to see if there are grids with more puzzles (minimal or not) than others ?
I am sure that the number of puzzles is different for different grids. A special case are symmetrical grids (e.g. with r(10-i)c(10-j) = ricj - 10), which must have less valid puzzles.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby Red Ed » Fri Dec 01, 2006 8:17 pm

JPF: I didn't bother calculating confidence intervals or anything for my P ~ 1/2.74 ; I just noted when the running average seemed not to be changing very much. The .1 difference between our two values is probably just down to our generators being biased in different ways.

Trying the same technique on fixed grids (good idea, JPF) should remove that bias. So I did that and, sure enough, some grids yield far more puzzles than others:
  • SFB : 1 / 1.45
  • RV : 1 / 1.51
  • SF : 1 / 1.64
  • MC : 1 / 4.5
  • Pt : 1 / 6
  • dukuso16 : 1 / 6.5
If your stats are wildly different on these grids then one of us has made a mistake. (Could easily be me ... I'm not using the most obvious method of estimating P.)
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby coloin » Fri Dec 01, 2006 9:30 pm

Red Ed wrote:So that gives us 6.67e21 * 2^81 / 2.74 ~= 5.9e45 puzzles.

Thanks for that calculation.....a similarish number appeared from dukuso but he didnt say how he got it.
dukuso wrote:sudokugrids:1e22 from 1e10 classes
sudokus:1e47 from 1e35 classes (1-solution)
minimal sudokus:1e37 from 1e25 classes (no superfluous clues)


Interesting on the non-minimal statistics....proves what I always suspected.........

Out of interest please could you test these two grids...........

Code: Select all
Ran11      592148367734269158168357249321975486645821793879634512213486975456793821987512634

and this one
Code: Select all
20-17s     594231786786945132123768954965173248378492561241856397432619875619587423857324619


Ran11 - I hope its not extaordinary [because I cant recall where it came from !] .......apart from finding over 660 35 clue minimal puzzles !

20-17s - this grid is the 2nd most fertile - 20 17s.

SFB grid - I eventually found over 50 35s, none found in the RV

C
coloin
 
Posts: 2515
Joined: 05 May 2005
Location: Devon

Postby JPF » Fri Dec 01, 2006 9:45 pm

Red Ed wrote:If your stats are wildly different on these grids then one of us has made a mistake. (Could easily be me ... I'm not using the most obvious method of estimating P.)

Here are my results (10000 simulations) :
Code: Select all
SFB
589732614621854379743916825835129746417568293296347158968471532372695481154283967
P = 0.694 = 1/1.44

Code: Select all
SF
562389471849716253137425896358194627974263185216857349691542738725638914483971562
P = 0.610 = 1/1.64

Code: Select all
Pt
123456789457189326689327154231645897745891632896732541318264975574918263962573418
P = 0.168 = 1/5.95

Code: Select all
dukuso16
145726983837495261926381574293874156581269347674153892318547629459632718762918435
P= 0.164 = 1/6.1

Code: Select all
Ran11
592148367734269158168357249321975486645821793879634512213486975456793821987512634
P = 0.171 = 1/5.85

Code: Select all
20-17s
594231786786945132123768954965173248378492561241856397432619875619587423857324619
P = 0.442 = 1/2.26

Seems Ok, but with remaining divergence on dukuso 16…
What is the code of RV, MC ?

JPF
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

Postby Red Ed » Fri Dec 01, 2006 11:13 pm

MC and RV are as follows:
Code: Select all
RV: 123456789456789123789231564234178956695342817817695342368527491572914638941863275
MC: 123456789456789123789123456231564897564897231897231564312645978645978312978312645

It's not surprising that we diverge with dukuso16. When written as 1/(blah), the estimate converges really rather slowly. Having run some more trials, it now looks like ~1/6.25 to me. But I think the main point is: we're getting basically the same answers as each other here.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby JPF » Sat Dec 02, 2006 12:32 am

Thanks.

With 10000 simulations (puzzles) on RV and MC :
Code: Select all
RV
123456789456789123789231564234178956695342817817695342368527491572914638941863275
P = 0.665 = 1/1.50 (1/1.51 for you)

Code: Select all
MC
123456789456789123789123456231564897564897231897231564312645978645978312978312645
P = 0.227 = 1/4.41 (against 1/4.5 for you)

With 100000 simulations on dukuso16 :
P = 0.1587 = 1/6.29, which is close to your new proposal (1/6.25)

It looks like our numbers are statistically the same for each of these grids.

Concerning P for all the grids, I will make a 10^6 simulations overnight.

JPF
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

Postby JPF » Sat Dec 02, 2006 9:01 am

Average P for all the grids
After generation of 10^6 puzzles, I got roughly the same numbers than before : P = 0.378 = 1/2.64

Here’s the distribution of the average number of solutions for all the puzzles
Code: Select all
 Sol.      (%)   
                                                               
  1       37.8                                                                 
  2       24.1                                                                 
  3        9.0                                                                 
  4        7.6                                                                 
  5        2.9                                                                 
  6        4.0                                                                 
  7        1.3                                                                 
  8        2.0                                                                 
  9        1.1                                                                 
10+       10.2                                                                 
                                                                               
         100.0                                                                 

[Edit : typo]
JPF
Last edited by JPF on Sun Dec 03, 2006 6:34 pm, edited 1 time in total.
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

Postby coloin » Sat Dec 02, 2006 2:50 pm

Excellent work here.......

Does this mean that in your search 37.8 % of all puzzles were solved ?

Wont this depend on how many clues you have in your random subset ? Or have I missed that ?

C
coloin
 
Posts: 2515
Joined: 05 May 2005
Location: Devon

Postby JPF » Sat Dec 02, 2006 3:57 pm

There are 6.67e21 different grids, which give N = 2^81 x 6.67e21 = 1.61e46 different puzzles with at least one solution.

Let's call Nk (k≥1) the number of puzzles with k solutions.

What I did was to get an "estimation" of Pk = Nk/N.

When k=1, we are dealing with the valid puzzles.
P1 = N1/N was estimated by Red Ed at 36.5% and at 37.8% by me.

It means that there are roughly 0.37N = 0.37 x1.61e46 = 5.9e45 valid puzzles.

With the same concept we can have an estimation of Nk with the numbers Pk given in my previous post.
For example, as P2 = N2/N ~ 0.24, there are N2 = 0.24N = 3.9e45 puzzles with 2 solutions.

etc...

JPF
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

PreviousNext

Return to General