Suppose we start with a given completed sudoku grid.

Then randomly delete some numbers.

What are the chances that this puzzle has a unique solution?

Has anyone looked at this before. or done any computational experiments?

15 posts
• Page **1** of **1**

Suppose we start with a given completed sudoku grid.

Then randomly delete some numbers.

What are the chances that this puzzle has a unique solution?

Has anyone looked at this before. or done any computational experiments?

Then randomly delete some numbers.

What are the chances that this puzzle has a unique solution?

Has anyone looked at this before. or done any computational experiments?

- Moschopulus
**Posts:**256**Joined:**16 July 2005

The probability will depend on the starting grid.

I doubt that anyone has looked at this, because it's too difficult both analytically and by Monte Carlo.

I doubt that anyone has looked at this, because it's too difficult both analytically and by Monte Carlo.

- r.e.s.
**Posts:**337**Joined:**31 August 2005

Even for one grid (chosen at random!) I would like to have some idea.

Choose a number k at random, then choose k cells at random, and delete the digits from those cells. See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

How do you know the answer will depend on the grid?

Choose a number k at random, then choose k cells at random, and delete the digits from those cells. See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

How do you know the answer will depend on the grid?

- Moschopulus
**Posts:**256**Joined:**16 July 2005

Moschopulus wrote:Even for one grid (chosen at random!) I would like to have some idea.

Choose a number k at random, then choose k cells at random, and delete the digits from those cells. See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

How do you know the answer will depend on the grid?

about 60 for k=25.

about 40% for any k (random subset), 2^81 possibilities

so, there are about 1e24 sudokus per grid and

about 1e17 locally minimal sudokus per grid

(very rough estimate, could be 1e18 or 1e16)

that makes 1e46 sudokus in total.

or 1e27 locally minimal, non-equivalent ones.

- dukuso
**Posts:**479**Joined:**25 June 2005

Moschopulus wrote:Even for one grid (chosen at random!) I would like to have some idea.

For that to be well-defined, you'll need to specify exactly what is meant by a "grid chosen at random". Presumably you mean to make all valid sudoku solution grids equally likely; if so, exactly what is the procedure for choosing one?

Moschopulus wrote:Choose a number k at random, then choose k cells at random, and delete the digits from those cells.

What exactly is meant by "choose a number k at random"? If k is chosen uniformly at random from {k_min, ..., k_max}, what are k_min, k_max?

Moschopulus wrote:See if the resulting puzzle has a unique solution. Do this 1,000,000 times or whatever is reasonable. How many will have a unique solution?

Let g be a grid chosen uniformly at random from the set of all sudoku solution grids.

Let k be chosen uniformly at random from the set {k_min, ..., k_max} for some k_min, k_max.

Let g_k be the puzzle that results by deleting the digits from k cells chosen uniformly at random from those of g.

Let U be the set of sudoku puzzles that have a unique solution.

Let S be the set of sudoku puzzles.

Let #U be the number of puzzles in U.

Let #S be the number of puzzles in S.

Let P(k_min, k_max) = pr{g_k in U}.

I'm not sure, but it seems you're claiming that P(k_min, k_max) = #U/#S for some k_min, k_max. Is there proof of that, and if so, for which k_min, k_max?

Moschopulus wrote:How do you know the answer will depend on the grid?

I should have said "Have you shown that the answer won't depend on the grid?"

- r.e.s.
**Posts:**337**Joined:**31 August 2005

I suppose sensible minimum and maximum values for K would be 4 (because 4 is the least number of numbers you can delete where there's any chance at all of getting multiple solutions) and 64 (because if you detete any more than that, there'll only be 16 clues left, so there'd never be a unique solution.

- PaulIQ164
**Posts:**533**Joined:**16 July 2005

PaulIQ164 wrote:Well, I don't think it's been rigorously proven, but as I understood it it's been tested empirically enough for it to be a pretty safe assertion.

We have tried hard to find a 16, but the search space is so astronomically huge that I would not put too much faith in the empirical testing so far.

I have 25000 17s now, and although finding them is getting harder (because I keep finding copies of the existing ones) there is no indication at all that I have found all, or even most, of them.

So, although you may be correct, I personally still think that a 16 is probably lurking there somewhere..

Of course, if the answer is "16" then it is unlikely that we will ever know that in our lifetime...

Gordon

- gfroyle
**Posts:**214**Joined:**21 June 2005

gfroyle wrote:[ I personally still think that a 16 is probably lurking there somewhere..

**then please comment on this:

------------------------------------------

I think I'll give up searching for a 16-clue-sudoku.

I examined Gordon's list and this looks rather

discouraging.

I.e. when you remove one of the 17 clues and count the number

of solutions, then you get 10-20 counts for each of the smaller

values, except 1,3 and 5 !

That looks as if there is some theoretical obstacle which hinders

him from finding these.

Guenter.

number of solutioncounts c for 16-clue-puzzles

generated by deleting a clue from

one of Gordon's 17-sudokus for c=1,2,..98 :

0 18 0 43 0 33 9 26 0 36

2 38 24 23 18 39 40 28 11 62

25 32 8 27 25 47 16 25 19 34

12 35 27 38 27 34 16 46 19 42

23 29 21 55 25 58 22 38 18 28

16 48 50 48 22 34 29 40 57 30

26 43 36 44 29 25 30 35 19 42

28 45 24 21 10 43 38 51 26 16

36 35 23 33 24 29 19 36 36 49

22 46 27 54 35 38 27 38

------------------------------------------------

Of course, if the answer is "16" then it is unlikely that we will ever know that in our lifetime...

Gordon

I had an estimate of 500000 17s AFAIR. And if there is a 16, then

you can make 65 17s from it - just by adding one clue.

But you didn't yet find any of these 65.

Maybe in some years you get one of these cheap new 64-processor

machines....

- dukuso
**Posts:**479**Joined:**25 June 2005

Wolfgang wrote:Moschopulus wrote:How do you know the answer will depend on the grid?

Its very probable that you find more unique sudokus in the "very symmetric" grid than in others.

the average number of clues in a locally minimal sudoku is a bit

higher in our canonical = very symmetric grid (24.4 vs. 24.0).

So there could be more (locally) minimal sudokus in such a grid.

But then any superset of a minimal one is also a sudoku

and a set with more members has fewer supersets,

so it's not yet clear to me whether there are more

sudokus in such a grid.

I won't expect a large difference.

-Guenter

- dukuso
**Posts:**479**Joined:**25 June 2005

dukuso wrote:I had an estimate of 500000 17s AFAIR. And if there is a 16, then

you can make 65 17s from it - just by adding one clue.

But you didn't yet find any of these 65.

Maybe in some years you get one of these cheap new 64-processor

machines....

I recall your estimate, but was unsure as to how you obtained it - if you used any data from my list of 17s, then I would be very careful about it, because my construction techniques (perturbing existing 17s to get new ones) will ensure a much higher correlation, or similarity, between them than a pure random approach.

Of course, you are right that I only need to stumble across ANY of the 65 extensions of any 16 to find it.. and of course, if there is a 15, then there would be a much bigger collection (66 choose 2) of 17s that I could hit.

What I am concerned about is that I am exploring just one "neighbourhood" of the search space, and that there might be an enormous collection of 17s that are somehow "a long way" from my particular collection. To use an analogy, it's a bit like mining one vein of gold, while there is another vein 50 metres away that you never come across.

What would be interesting is if there was some way of generating even a smallish number of 17s using entirely different techniques and then seeing how many of them I already have... but as you have already noted, it is actually rather difficult even to find 18s from scratch.

I still think there is probably a 16, because it feels entirely unlikely to me that there would be so many "close" ones with a handful of solutions without there being one with a unique solution. Possible of course, so I don't want to be dogmatic, but somehow I feel it is unlikely.

In the meantime, I guess I'll just keep collecting 17s in the background, and see if I can get close to your 500000 estimate!

gordon

- gfroyle
**Posts:**214**Joined:**21 June 2005

>dukuso wrote:

>

>>I had an estimate of 500000 17s AFAIR. And if there is a 16, then

>>you can make 65 17s from it - just by adding one clue.

>>But you didn't yet find any of these 65.

>

>I recall your estimate, but was unsure as to how you obtained it

> - if you used any data from my list of 17s, then I would be very

> careful about it, because my construction techniques (perturbing

> existing 17s to get new ones) will ensure a much higher correlation,

>or similarity, between them than a pure random approach.

yes, using statistics on your 17s.

When they are biased, then why do they cover the classes

so uniformly as in Mosch.'s picture ?

Also, you said you had only 5million 18s, that's some hundred

18s per 17. I'd assume a similar ratio for 17s/16s, if there were any.

My random sample gives ratios

24s/23s : 2.01

23s/22s : 5.05

22s/21s : 13.6

21s/20s : 42

and 60 20s per 1e6 locally minimal sudokus.

With 1e24 nonequivalent l.m. sudokus in total and estimating

20s/19s = 100

19s/18s = 200

18s/17s = 400

>Of course, you are right that I only need to stumble across

>ANY of the 65 extensions of any 16 to find it.. and of course,

>if there is a 15, then there would be a much bigger collection

>(66 choose 2) of 17s that I could hit.

>

>What I am concerned about is that I am exploring just one

>"neighbourhood" of the search space, and that there might

>be an enormous collection of 17s that are somehow "a long

>way" from my particular collection. To use an analogy,

>it's a bit like mining one vein of gold, while there is

>another vein 50 metres away that you never come across.

you sent in your nuggets and to me it looks like they were washed

from river-sand from many different places rather than

broken from a mine.

>What would be interesting is if there was some way of

>generating even a smallish number of 17s using entirely

>different techniques

what's different ? We don't know your technique

>and then seeing how many of them I already have...

>but as you have already noted, it is actually rather difficult

>even to find 18s from scratch.

methods need to be improved

>I still think there is probably a 16, because it feels entirely

>unlikely to me that there would be so many "close" ones with a

>handful of solutions without there being one with a unique solution.

only for the even number of solutions, not the odd ones !

Se my posted statistics.

>Possible of course, so I don't want to be dogmatic, but somehow

>I feel it is unlikely.

>

>In the meantime, I guess I'll just keep collecting 17s in the

>background, and see if I can get close to your 500000 estimate!

>

>gordon

good, if you can run this in background.

They will be interesting even if there is no 16 among them.

Can you also upload a randomized sample of your 18s ?

Some thousand should be enough. How many 18s do you

get per one 17 in average ?

Guenter

-Guenter

>

>>I had an estimate of 500000 17s AFAIR. And if there is a 16, then

>>you can make 65 17s from it - just by adding one clue.

>>But you didn't yet find any of these 65.

>

>I recall your estimate, but was unsure as to how you obtained it

> - if you used any data from my list of 17s, then I would be very

> careful about it, because my construction techniques (perturbing

> existing 17s to get new ones) will ensure a much higher correlation,

>or similarity, between them than a pure random approach.

yes, using statistics on your 17s.

When they are biased, then why do they cover the classes

so uniformly as in Mosch.'s picture ?

Also, you said you had only 5million 18s, that's some hundred

18s per 17. I'd assume a similar ratio for 17s/16s, if there were any.

My random sample gives ratios

24s/23s : 2.01

23s/22s : 5.05

22s/21s : 13.6

21s/20s : 42

and 60 20s per 1e6 locally minimal sudokus.

With 1e24 nonequivalent l.m. sudokus in total and estimating

20s/19s = 100

19s/18s = 200

18s/17s = 400

>Of course, you are right that I only need to stumble across

>ANY of the 65 extensions of any 16 to find it.. and of course,

>if there is a 15, then there would be a much bigger collection

>(66 choose 2) of 17s that I could hit.

>

>What I am concerned about is that I am exploring just one

>"neighbourhood" of the search space, and that there might

>be an enormous collection of 17s that are somehow "a long

>way" from my particular collection. To use an analogy,

>it's a bit like mining one vein of gold, while there is

>another vein 50 metres away that you never come across.

you sent in your nuggets and to me it looks like they were washed

from river-sand from many different places rather than

broken from a mine.

>What would be interesting is if there was some way of

>generating even a smallish number of 17s using entirely

>different techniques

what's different ? We don't know your technique

>and then seeing how many of them I already have...

>but as you have already noted, it is actually rather difficult

>even to find 18s from scratch.

methods need to be improved

>I still think there is probably a 16, because it feels entirely

>unlikely to me that there would be so many "close" ones with a

>handful of solutions without there being one with a unique solution.

only for the even number of solutions, not the odd ones !

Se my posted statistics.

>Possible of course, so I don't want to be dogmatic, but somehow

>I feel it is unlikely.

>

>In the meantime, I guess I'll just keep collecting 17s in the

>background, and see if I can get close to your 500000 estimate!

>

>gordon

good, if you can run this in background.

They will be interesting even if there is no 16 among them.

Can you also upload a randomized sample of your 18s ?

Some thousand should be enough. How many 18s do you

get per one 17 in average ?

Guenter

-Guenter

- dukuso
**Posts:**479**Joined:**25 June 2005

15 posts
• Page **1** of **1**