Chances of unique solution

Everything about Sudoku that doesn't fit in one of the other sections

Chances of unique solution

Postby Moschopulus » Tue Sep 06, 2005 11:36 pm

Suppose we start with a given completed sudoku grid.
Then randomly delete some numbers.
What are the chances that this puzzle has a unique solution?

Has anyone looked at this before. or done any computational experiments?
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Re: Chances of unique solution

Postby r.e.s. » Wed Sep 07, 2005 12:37 am

The probability will depend on the starting grid.
I doubt that anyone has looked at this, because it's too difficult both analytically and by Monte Carlo.
r.e.s.
 
Posts: 337
Joined: 31 August 2005

Postby Moschopulus » Wed Sep 07, 2005 8:49 am

Even for one grid (chosen at random!) I would like to have some idea.

Choose a number k at random, then choose k cells at random, and delete the digits from those cells. See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

How do you know the answer will depend on the grid?
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Postby dukuso » Wed Sep 07, 2005 10:59 am

Moschopulus wrote:Even for one grid (chosen at random!) I would like to have some idea.

Choose a number k at random, then choose k cells at random, and delete the digits from those cells. See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

How do you know the answer will depend on the grid?


about 60 for k=25.
about 40% for any k (random subset), 2^81 possibilities

so, there are about 1e24 sudokus per grid and
about 1e17 locally minimal sudokus per grid
(very rough estimate, could be 1e18 or 1e16)

that makes 1e46 sudokus in total.
or 1e27 locally minimal, non-equivalent ones.
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby r.e.s. » Wed Sep 07, 2005 5:20 pm

Moschopulus wrote:Even for one grid (chosen at random!) I would like to have some idea.

For that to be well-defined, you'll need to specify exactly what is meant by a "grid chosen at random". Presumably you mean to make all valid sudoku solution grids equally likely; if so, exactly what is the procedure for choosing one?

Moschopulus wrote:Choose a number k at random, then choose k cells at random, and delete the digits from those cells.

What exactly is meant by "choose a number k at random"? If k is chosen uniformly at random from {k_min, ..., k_max}, what are k_min, k_max?

Moschopulus wrote:See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

Let g be a grid chosen uniformly at random from the set of all sudoku solution grids.
Let k be chosen uniformly at random from the set {k_min, ..., k_max} for some k_min, k_max.
Let g_k be the puzzle that results by deleting the digits from k cells chosen uniformly at random from those of g.
Let U be the set of sudoku puzzles that have a unique solution.
Let S be the set of sudoku puzzles.
Let #U be the number of puzzles in U.
Let #S be the number of puzzles in S.
Let P(k_min, k_max) = pr{g_k in U}.

I'm not sure, but it seems you're claiming that P(k_min, k_max) = #U/#S for some k_min, k_max. Is there proof of that, and if so, for which k_min, k_max?

Moschopulus wrote:How do you know the answer will depend on the grid?

I should have said "Have you shown that the answer won't depend on the grid?"
r.e.s.
 
Posts: 337
Joined: 31 August 2005

Postby PaulIQ164 » Wed Sep 07, 2005 5:46 pm

I suppose sensible minimum and maximum values for K would be 4 (because 4 is the least number of numbers you can delete where there's any chance at all of getting multiple solutions) and 64 (because if you detete any more than that, there'll only be 16 clues left, so there'd never be a unique solution.
PaulIQ164
 
Posts: 533
Joined: 16 July 2005

Postby r.e.s. » Wed Sep 07, 2005 6:08 pm

PaulIQ164 wrote:if you detete any more than that, there'll only be 16 clues left, so there'd never be a unique solution.

I thought it was only speculation (so far) that 17 is the minimum number of hints among puzzles with a unique solution. (?)
r.e.s.
 
Posts: 337
Joined: 31 August 2005

Postby PaulIQ164 » Wed Sep 07, 2005 7:41 pm

Well, I don't think it's been rigorously proven, but as I understood it it's been tested empirically enough for it to be a pretty safe assertion.
PaulIQ164
 
Posts: 533
Joined: 16 July 2005

Postby gfroyle » Thu Sep 08, 2005 12:17 am

PaulIQ164 wrote:Well, I don't think it's been rigorously proven, but as I understood it it's been tested empirically enough for it to be a pretty safe assertion.


We have tried hard to find a 16, but the search space is so astronomically huge that I would not put too much faith in the empirical testing so far.

I have 25000 17s now, and although finding them is getting harder (because I keep finding copies of the existing ones) there is no indication at all that I have found all, or even most, of them.

So, although you may be correct, I personally still think that a 16 is probably lurking there somewhere..

Of course, if the answer is "16" then it is unlikely that we will ever know that in our lifetime...

Gordon
gfroyle
 
Posts: 214
Joined: 21 June 2005

Postby dukuso » Thu Sep 08, 2005 7:05 am

gfroyle wrote:[ I personally still think that a 16 is probably lurking there somewhere..



**then please comment on this:
------------------------------------------
I think I'll give up searching for a 16-clue-sudoku.
I examined Gordon's list and this looks rather
discouraging.
I.e. when you remove one of the 17 clues and count the number
of solutions, then you get 10-20 counts for each of the smaller
values, except 1,3 and 5 !
That looks as if there is some theoretical obstacle which hinders
him from finding these.


Guenter.



number of solutioncounts c for 16-clue-puzzles
generated by deleting a clue from
one of Gordon's 17-sudokus for c=1,2,..98 :


0 18 0 43 0 33 9 26 0 36
2 38 24 23 18 39 40 28 11 62
25 32 8 27 25 47 16 25 19 34
12 35 27 38 27 34 16 46 19 42
23 29 21 55 25 58 22 38 18 28
16 48 50 48 22 34 29 40 57 30
26 43 36 44 29 25 30 35 19 42
28 45 24 21 10 43 38 51 26 16
36 35 23 33 24 29 19 36 36 49
22 46 27 54 35 38 27 38

------------------------------------------------



Of course, if the answer is "16" then it is unlikely that we will ever know that in our lifetime...

Gordon



I had an estimate of 500000 17s AFAIR. And if there is a 16, then
you can make 65 17s from it - just by adding one clue.
But you didn't yet find any of these 65.

Maybe in some years you get one of these cheap new 64-processor
machines....
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Wolfgang » Thu Sep 08, 2005 7:27 am

Moschopulus wrote:How do you know the answer will depend on the grid?

Its very probable that you find more unique sudokus in the "very symmetric" grid than in others.
Wolfgang
 
Posts: 208
Joined: 22 June 2005

Postby dukuso » Thu Sep 08, 2005 7:35 am

Wolfgang wrote:
Moschopulus wrote:How do you know the answer will depend on the grid?

Its very probable that you find more unique sudokus in the "very symmetric" grid than in others.


the average number of clues in a locally minimal sudoku is a bit
higher in our canonical = very symmetric grid (24.4 vs. 24.0).
So there could be more (locally) minimal sudokus in such a grid.
But then any superset of a minimal one is also a sudoku
and a set with more members has fewer supersets,
so it's not yet clear to me whether there are more
sudokus in such a grid.
I won't expect a large difference.


-Guenter
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Wolfgang » Thu Sep 08, 2005 8:50 am

Right, yes, so also the opposite could be the case. Fact is for me that there will be a difference for a given k.
Wolfgang
 
Posts: 208
Joined: 22 June 2005

Postby gfroyle » Fri Sep 09, 2005 11:27 am

dukuso wrote:I had an estimate of 500000 17s AFAIR. And if there is a 16, then
you can make 65 17s from it - just by adding one clue.
But you didn't yet find any of these 65.

Maybe in some years you get one of these cheap new 64-processor
machines....


I recall your estimate, but was unsure as to how you obtained it - if you used any data from my list of 17s, then I would be very careful about it, because my construction techniques (perturbing existing 17s to get new ones) will ensure a much higher correlation, or similarity, between them than a pure random approach.

Of course, you are right that I only need to stumble across ANY of the 65 extensions of any 16 to find it.. and of course, if there is a 15, then there would be a much bigger collection (66 choose 2) of 17s that I could hit.

What I am concerned about is that I am exploring just one "neighbourhood" of the search space, and that there might be an enormous collection of 17s that are somehow "a long way" from my particular collection. To use an analogy, it's a bit like mining one vein of gold, while there is another vein 50 metres away that you never come across.

What would be interesting is if there was some way of generating even a smallish number of 17s using entirely different techniques and then seeing how many of them I already have... but as you have already noted, it is actually rather difficult even to find 18s from scratch.

I still think there is probably a 16, because it feels entirely unlikely to me that there would be so many "close" ones with a handful of solutions without there being one with a unique solution. Possible of course, so I don't want to be dogmatic, but somehow I feel it is unlikely.

In the meantime, I guess I'll just keep collecting 17s in the background, and see if I can get close to your 500000 estimate!

gordon
gfroyle
 
Posts: 214
Joined: 21 June 2005

Postby dukuso » Sat Sep 10, 2005 11:48 am

>dukuso wrote:
>
>>I had an estimate of 500000 17s AFAIR. And if there is a 16, then
>>you can make 65 17s from it - just by adding one clue.
>>But you didn't yet find any of these 65.
>
>I recall your estimate, but was unsure as to how you obtained it
> - if you used any data from my list of 17s, then I would be very
> careful about it, because my construction techniques (perturbing
> existing 17s to get new ones) will ensure a much higher correlation,
>or similarity, between them than a pure random approach.

yes, using statistics on your 17s.
When they are biased, then why do they cover the classes
so uniformly as in Mosch.'s picture ?
Also, you said you had only 5million 18s, that's some hundred
18s per 17. I'd assume a similar ratio for 17s/16s, if there were any.
My random sample gives ratios
24s/23s : 2.01
23s/22s : 5.05
22s/21s : 13.6
21s/20s : 42

and 60 20s per 1e6 locally minimal sudokus.
With 1e24 nonequivalent l.m. sudokus in total and estimating
20s/19s = 100
19s/18s = 200
18s/17s = 400


>Of course, you are right that I only need to stumble across
>ANY of the 65 extensions of any 16 to find it.. and of course,
>if there is a 15, then there would be a much bigger collection
>(66 choose 2) of 17s that I could hit.
>
>What I am concerned about is that I am exploring just one
>"neighbourhood" of the search space, and that there might
>be an enormous collection of 17s that are somehow "a long
>way" from my particular collection. To use an analogy,
>it's a bit like mining one vein of gold, while there is
>another vein 50 metres away that you never come across.

you sent in your nuggets and to me it looks like they were washed
from river-sand from many different places rather than
broken from a mine.

>What would be interesting is if there was some way of
>generating even a smallish number of 17s using entirely
>different techniques

what's different ? We don't know your technique

>and then seeing how many of them I already have...
>but as you have already noted, it is actually rather difficult
>even to find 18s from scratch.

methods need to be improved

>I still think there is probably a 16, because it feels entirely
>unlikely to me that there would be so many "close" ones with a
>handful of solutions without there being one with a unique solution.

only for the even number of solutions, not the odd ones !
Se my posted statistics.

>Possible of course, so I don't want to be dogmatic, but somehow
>I feel it is unlikely.
>
>In the meantime, I guess I'll just keep collecting 17s in the
>background, and see if I can get close to your 500000 estimate!
>
>gordon

good, if you can run this in background.
They will be interesting even if there is no 16 among them.
Can you also upload a randomized sample of your 18s ?
Some thousand should be enough. How many 18s do you
get per one 17 in average ?

Guenter


-Guenter
dukuso
 
Posts: 479
Joined: 25 June 2005


Return to General