Minimum number of clues

Everything about Sudoku that doesn't fit in one of the other sections

Postby gfroyle » Sat Oct 22, 2005 11:38 am

I like mandatory. The word is rare enough here so not to cause
accidental confusion


Aaarrrghh... no...

I vote to stay with "unavoidable" - it is actually the RIGHT word, in the sense that the normal English usage is precisely what it means in Sudoku; namely in a valid Sudoku grid, that set of cells cannot be avoided, or is unavoidable.

The word "mandatory" does not capture the right concept. If we say that a set is "mandatory", then does that mean that we must include ALL the elements of the set, or just one? In my opinion, the normal English usage is that mandatory implies ALL of the items.

On the other hand, unavoidable is totally unambiguous and absolutely precise...

Just my 2c ...

Gordon
gfroyle
 
Posts: 214
Joined: 21 June 2005

Postby DanO » Sat Oct 22, 2005 11:01 pm

Were dealing with terms that have been around for over 100 years. There is no need to invent new names for them here.
DanO
 
Posts: 40
Joined: 18 October 2005

Postby dukuso » Sun Oct 23, 2005 4:51 am

OK, I wrote the program to find (minimal) unavoidables in chutes
and 3-rookeries:
http://magictour.free.fr/unav27.exe
C-source-code attached to the executable as usual.

I'm not sure, whether it works correctly.

Found 687 different such unavoidables in sf:
http://magictour.free.fr/sfua

now we just need the program to find the 17-sets which hit
all the 687---
A set of cells which hits any unavoidable must be a valid
1-solution sudoku.

-Guenter

edit: number corrected
Last edited by dukuso on Mon Oct 24, 2005 12:13 pm, edited 1 time in total.
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Moschopulus » Sun Oct 23, 2005 3:32 pm

dukuso wrote:A set of cells which hits any unavoidable must be a valid
1-solution sudoku.


Yes, if you mean *every* unavoidable, not just the 649 that you found.

Thanks for the program! Other program coming soon....
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Postby dukuso » Mon Oct 24, 2005 8:29 am

deleted because of errors
Last edited by dukuso on Mon Oct 24, 2005 12:19 pm, edited 1 time in total.
dukuso
 
Posts: 479
Joined: 25 June 2005

Unavoidables

Postby coloin » Mon Oct 24, 2005 12:00 pm

Getting these unavoidables sorted is definitly the way forward.

When you have a puzzle and you deduce by logic that another clue can be inserted......does it mean that this clue is in an unavoidable set which has already been covered - but not in one which hasnt ?

Also bearing in mind that the SF has 29 solutions with a backbone of 14 or so clues - is it not surprizing that these clues arnt more spectacularly reflected in the random generation of puzzles with 21 or less clues. This must mean there are more 17s in the SF.....?

C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby Wolfgang » Mon Oct 24, 2005 3:41 pm

Though the program is still in construction and not fully automised, my approach for searching for low clue sudokus in given grids seems to be rather effective now, but it is very slow, ie needs days for one grid (on my notebook).
I tested it with three of the known 17-clue grids (sf, one of the twin grids and now the grid without unavoidables for 2 or 3 different digits and 3 known 17-clues, which - forgive me - i call NU23 now). In all of the grids i did find (known) 17-clues. Whereas in the first 2 grids i already found one in the region of the 2nd 19-clue that i detected directly with weighted random clue sets, i needed about 20 19-clues, until i had the region of the 17-clues in NU23. With a region i mean, that 2 sudokus belong to the same region, when they have say 14 clues in common.
In ssf (similar to strangely familiar) i only found 16 18-clues in two regions. But the probability that the program does not find an 18 clue is higher than for a 17 clue, because normally there are much less 19- and 20- clues around it.
A 16 clue should be very attractive for my method with all the 17-,18- and 19-clues around. So it is unlikely for me that one of the grids would contain one.

My current impression is the following:
- there are less grids containing a 17-clue than i thought
- it is unlikely that a grid has two 17-clues in different regions (are there any in the known grids with multiple 17-clues?)
- then it does not make much sense to search in known 17-clues grids
- we cannot say if a grid is promising without making a rather deep search. Criteria like the number of unavoidables, unique pairs, random clue set results, minimum number of clues for unique 3-rookeries are not more than a hint. There are grids with poor such properties that contain 17-clues and others that look much better that dont seem to have one.
Wolfgang
 
Posts: 208
Joined: 22 June 2005

Postby coloin » Mon Oct 24, 2005 8:09 pm

It remains to be seen Wolfgang but I agree with a lot of your impressions.

dukuso wrote:
counts of occupied cells for randomly generated
minimal sudokus of size <22 over sf :

1640 1743 1327 1519 1686 1795 1645 1656 1503
1388 1663 1569 1517 1439 1477 1508 1507 1377
1503 1925 1460 1510 1489 1816 1621 1338 1264
1559 1424 1438 1565 1720 1406 1589 1618 1548
1781 1790 1491 1461 1371 1362 1404 1429 1420
1546 1463 1452 1665 1605 1540 1746 1697 1575
1614 1445 1457 1620 1555 1389 1523 1515 1456
1613 1227 1419 1551 1552 1472 1510 1580 1634
1831 1777 1369 1543 1527 1487 1496 1677 1643


Wolfgang's rookery numbers:
40 36 23 | 53 88 56 | 36 61 71
22 71 40 | 48 61 39 | 48 52 29
41 134 42| 63 49 80 | 68 20 22
51 34 47 | 60 57 40 | 68 77 53
88 73 63 | 26 39 15 | 17 25 40
59 43 45 | 91 62 62 | 94 94 57
51 39 30 | 57 31 49 | 64 30 30
47 27 29 | 63 77 45 | 29 45 50
79 66 32 | 60 55 41 | 52 63 61

-Guenter.


There are 14 numbers in The SF which always occur

----4-7--
-8-------
-1-------
---8----6
7--------
4-----2--
3---7----
---------
-----6-18

I can generate the 29 17s in the SF in 30 seconds [Thanks to Guenter]

If these are the latest analysis of the probability of clues then there must be more 17s !! These would be almost certainly be from a different "region" as Wolfgang says. Gfroyle may have 17s from different regions in the same grid. Await this.

Unless there is a stack of 18s & 19s for some reason which are fuzzing Dukoso's analysis. Wolfgang's analysis is independant of this however.

Deep analysis of the grid is the only solution. There may be more than 504 [9*8*7] of these for each grid.

Unless Guenter or Kjellfp come up with a magic program to minimize the unavoidables....soon please

C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby dukuso » Tue Oct 25, 2005 7:00 am

Coloin wrote:
>Getting these unavoidables sorted is definitly the way forward.
>
>When you have a puzzle and you deduce by logic that another clue
>can be inserted......does it mean that this clue is in an unavoidable
>set which has already been covered - but not in one which hasnt ?

when you have a valid sudoku with one solution, then of course
all unavoidables are covered - or better : are met or are intersected.
When you have a multi-solution puzzle, then "unavoidable"
has no meaning at all, it's not defined.
But yes,when you fix one grid G and you have a multi-solution-puzzle
and one of the solutions is G and then you find another
forced clue, this clue can't be in a new unavoidable for G,
hitting a new (minimal) unavoidable of one of the solution-grids reduces
the number of solutions.

>Also bearing in mind that the SF has 29 solutions with a backbone
>of 14 or so clues - is it not surprizing that these clues arnt more
>spectacularly reflected in the random generation of puzzles with 21
>or less clues. This must mean there are more 17s in the SF.....?

but we did use a canonical process to recover these 17s.
This was difficult because of the "well hidden" backbone.
Still we didn't find any others.
I'd say this indicates that there are no other 17s in sf.

------------------------------

Wolfgang wrote:
>Though the program is still in construction and not fully automised,
>my approach for searching for low clue sudokus in given grids seems
>to be rather effective now, but it is very slow, ie needs days for
>one grid (on my notebook).

you mean, in a few days you can definitely decide whether a grid
has a 17 or not ? Or just with high probability ?

>I tested it with three of the known 17-clue grids (sf, one of the
>twin grids and now the grid without unavoidables for 2 or 3 different
>digits and 3 known 17-clues, which - forgive me - i call NU23 now).

I had an error in that program. There are unavoidables with 3 digits.
But not with 2, and the grid is still unique with that property
in Gordon's list.

>In all of the grids i did find (known) 17-clues. Whereas in the
>first 2 grids i already found one in the region of the 2nd 19-clue
>that i detected directly with weighted random clue sets, i needed
>about 20 19-clues, until i had the region of the 17-clues in NU23.
>With a region i mean, that 2 sudokus belong to the same region,
>when they have say 14 clues in common.
>In ssf (similar to strangely familiar) i only found 16 18-clues
>in two regions. But the probability that the program does not
>find an 18 clue is higher than for a 17 clue, because normally
>there are much less 19- and 20- clues around it.
>A 16 clue should be very attractive for my method with all the
>17-,18- and 19-clues around. So it is unlikely for me that one
>of the grids would contain one.

Gordon has 24620 17s but no 16, so we already know 16s are rare.
Maybe both of your methods only find the "attractive" 17s ?

>My current impression is the following:
>- there are less grids containing a 17-clue than i thought

how many then ? Less than a million isomorphism-classes ?

>- it is unlikely that a grid has two 17-clues in different
> regions (are there any in the known grids with multiple 17-clues?)

Gordon didn't use the same representatives when grids are isomorphic.
I don't know how to search this easily

>- then it does not make much sense to search in known 17-clues grids

well, Gordon probably already examined these. But where else to search ?

>- we cannot say if a grid is promising without making a rather deep
> search. Criteria like the number of unavoidables, unique pairs,
> random clue set results, minimum number of clues for unique
> 3-rookeries are not more than a hint.

a hint is good

> There are grids with poor
> such properties that contain 17-clues and others that look much
> better that dont seem to have one.

yet in some grids i.e. when some particular clues are fixed we get
10 times more 20s per time than in others.
Even if it were true that these are not more likely to contain
a 17, then we would still find it earlier, if it is there.


-Guenter
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby dukuso » Tue Oct 25, 2005 7:07 am

I corrected some bugs in the programs to find unavoidables at
http://magictour.free.fr/unav27.exe
http://magictour.free.fr/unav36.exe

I try to give some stats again. Still not sure they are correct,
but more likely to be correct than to be incorrect, I guess.

number of unavoidables in

3rookery or chute
--------------
sf:687
sfb=NU23:750
Gordon's list:351-907 , average 600
random grid:350-750 average 530


3rookery(84) or chute(6) or B1245-equivalent(9)
-------------------
sf:2520
sfb:3281 (!)
Gordon's list:about 1000-3300 , average 1750
random grid:about 700-2600, average 1400


so our best grids have many unavoidables !
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Moschopulus » Tue Oct 25, 2005 10:03 am

It still seems like a lot of people are putting energy into searching randomly for a 16. Some very interesting data and ideas.

Could I suggest (as I have suggested before) instead putting this energy into finding a way to *exhaustively* search a grid for a 16. That way, we either find a 16 or we *prove* that there is no 16 in that grid.

I think this is the way forward. Partly this is because it seems likely (in my opinion) that no 16 exists, and therefore we need ways to actually prove this. Random searching will never prove anything (unless it finds a 16, which it probably won't). People have put a lot of time up to now into clever random searching, with no 16 found.
I don't mean to be critical! Of course random searching is the first thing to try. I feel it has now been tried and it hasn't worked. But perhaps some of you feel there is more mileage in it, that you have ideas not yet tried.

The holy grail is the following:
1. Have a program which checks definitively whether a given grid has a 16
2. Run this program on one representative from each class of the 5e9.

Just my 2 cent.
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Postby dukuso » Tue Oct 25, 2005 12:11 pm

Moschopulus wrote:The holy grail is the following:
1. Have a program which checks definitively whether a given grid has a 16
2. Run this program on one representative from each class of the 5e9.

Just my 2 cent.



no problem.If you are willing to run that program long enough...

1. could be done maybe in some months with some optimization.
I doubt it can be fast enough, so that 2. would finish in our lifetime.
dukuso
 
Posts: 479
Joined: 25 June 2005

Postby Wolfgang » Tue Oct 25, 2005 2:12 pm

dukuso wrote:you mean, in a few days you can definitely decide whether a grid
has a 17 or not ? Or just with high probability ?

I cant say how high the probability is. But i found the 17-clues in sfb twice from different 19-clues from different random starting sudokus.

Maybe both of your methods only find the "attractive" 17s ?


Thats true for my method, but it is unlikely that round a 17 clue there are less 18/19/20 clues than elsewhere. The program tends to converge to regions where it finds much of them.
The other side is that the space of a grid is very big, so the question is, after which time i can be rather sure that i searched all interesting regions deeper, maybe some days, maybe some months...

how many then ? Less than a million isomorphism-classes ?


no idea, but i had thought that any grid with good "properties" would at least have one 17-clue. Now it seems to me that a very special and rare structure is needed for a 17-clue whereas the rest of the grid (all the other regions) might be poor of low clue sudokus or not. This is my explanation that gfroyles method is the only successfull one, because it only cares about the neighbourhood of low clue sudokus, not the irrelevant rest of grids.

Moschopulus wrote:Could I suggest ... finding a way to *exhaustively* search a grid for a 16. That way, we either find a 16 or we *prove* that there is no 16 in that grid.

I fear it is boring, when we have a program run for months and then - as expected - it doesn't say more than "none". It would be another thing, if we could search all 17- and 18-clues also, then we really would know more after it, but this would take much more time.
Wolfgang
 
Posts: 208
Joined: 22 June 2005

Postby Wolfgang » Tue Oct 25, 2005 7:26 pm

What are the "poorest" grids containing a 17-clue? Maybe its easier to find good properties in them.
Wolfgang
 
Posts: 208
Joined: 22 June 2005

Postby kjellfp » Wed Oct 26, 2005 6:12 am

coloin wrote:Unless Guenter or Kjellfp come up with a magic program to minimize the unavoidables....soon please


If my plans are of any interrest, I can say that I'm currently working on a program to confirm the S- and T-classes of 3x3-sudoku, then I'm planning to crunch the 4x3-number, and so go for the 16-clue task.

I'm not doing fast progress, as this has to be done in my spare time, I'm a family father and currently have a guest in house that occupies the room of my PC. But I have almost finished debugging the S-/T-class program.

But for the 16-clue task: Those who now have experience, what is the apparently fastest algorithm for finding solutions to a clue-set? The setting I'm thinking of is when there are many solutions (thousands, millions), this might be different from the best algorithm for an assumed unique-solution sudoku.

I know there are many techniques when solving sudoku, but I guess most of them are time consuming, and give less than they cost when the solution space is big.
kjellfp
 
Posts: 140
Joined: 04 October 2005

PreviousNext

Return to General

cron