## Minimum number of clues

Everything about Sudoku that doesn't fit in one of the other sections
Coloin,
I played a little with your examples, but without success.
I didn't try too hard, though.
There's not so much motivation when you just got the
list with 7611 17-clue sudokus !
This did look more interesting/promising, ... until
I counted the solutions for the 16-puzzles,
generated some 15s and 14s with few solutions.
See my previous post.

Guenter.
dukuso

Posts: 479
Joined: 25 June 2005

dukuso wrote:I think I'll give up searching for a 16-clue-sudoku.
I examined Gordon's list and this looks rather
discouraging.
I.e. when you remove one of the 17 clues and count the number
of solutions, then you get 10-20 counts for each of the smaller
values, except 1,3 and 5 !
That looks as if there is some theoretical obstacle which hinders
him from finding these.

Guenter.

number of solutioncounts c for 16-clue-puzzles
generated by deleting a clue from
one of Gordon's 17-sudokus for c=1,2,..98 :

0 18 0 43 0 33 9 26 0 36
2 38 24 23 18 39 40 28 11 62
25 32 8 27 25 47 16 25 19 34
12 35 27 38 27 34 16 46 19 42
23 29 21 55 25 58 22 38 18 28
16 48 50 48 22 34 29 40 57 30
26 43 36 44 29 25 30 35 19 42
28 45 24 21 10 43 38 51 26 16
36 35 23 33 24 29 19 36 36 49
22 46 27 54 35 38 27 38

Suppose there exists a 16 clue Sudoku, the structure (the arrangement of the relative positions of the digits) of the Sudoku should be a bit special, and the clues alos need to be put in the right postion which will be crucial to find one. I wonder when you count the total number of Sudoku matrix, whether you come across a type of Sudoku which is special.

How about beginning with one kind of Sudoku (if there is) and find out different ways of putting down the clues and see the number of the clues needed for each method?

Posts: 62
Joined: 15 July 2005

Addlan wrote:Suppose there exists a 16 clue Sudoku, the structure (the arrangement of the relative positions of the digits) of the Sudoku should be a bit special, and the clues alos need to be put in the right postion which will be crucial to find one. I wonder when you count the total number of Sudoku matrix, whether you come across a type of Sudoku which is special.

How about beginning with one kind of Sudoku (if there is) and find out different ways of putting down the clues and see the number of the clues needed for each method?

The clues will be special and crucial - because they will be unique !

Dont Give up Guenter - we need you !

This is Gfroyles 2/29

with 17 clues - 1 solution

---24-7--
-8-----9-
-1-------
---8----6
7--------
4-----2--
3---7----
--------2
-----6-18

With 16 clues - 94763 solutions
---24-7--
-8-----9-
-1-------
---8----6
7--------
4-----2--
3--------
--------2
-----6-18

1 sol. 72 nodes 3 guesses 0/91sec
94763 sol. 1263314 nodes 96786 guesses 170/91sec

So it really all comes down to the last clue.

If every clue reduces the number of grids possible by at least 9 - and sometimes up to 12 - see my last posts - and sometimes if its a crucial clue by 94763.

multiplying
9^8*12^7*94763 = 146166416264118288384 [not quite Bertram's N]

maybe we do have difficulties!

But if there is 16 unique numbers - then there is a 16 - there are enough 17s around to make one 16 still possible.

I think Gfroyle has tried to remove one clue from every 17 and consequently - hmmmm

I am glad you thought there was mileage in my last post
Last edited by coloin on Tue Aug 16, 2005 11:00 am, edited 1 time in total.
coloin

Posts: 1702
Joined: 05 May 2005

dukuso wrote:7543 non-isomorphic graphs for Gordon's 7611 (solved)grids
7561 from 7941 when I add my 330 generated sudokus,
but some grids have more than 1 different sudoku.

I think we can test isomorphic sudokus the same way.
This gives

7611 non-isomorphic sudokus for Gordon's 7611 - list
220 non-isomorphic sudokus for my 330-list
7831 non-isomorphic sudokus for the joint 7831 - list.

So 220 of my 330 sudokus were new, but they were in grids
already known. Presumably some small changes of the clues
in a known 17-sudoku which gave "new" 17-sudokus.

Anyway, I don't think we will find a 16-sudoku.

Guenter.
dukuso

Posts: 479
Joined: 25 June 2005

coloin wrote:The clues will be special and crucial - because they will be unique !

Unique is not enough. Given a full Sudoku matrix, there are many ways to reduce it to only clues. I think first we need to understand how many ways there are, i.e. Given a full Sudoku matrix, how many different Hint Sudokus there are? (all these Hint sudoku will lead to the same full Sudoku)

And if it is not possible to find a 16 hint Sudoku, is there a way to prove it?

Posts: 62
Joined: 15 July 2005

coloin wrote:This is Gfroyles 2/29

with 17 clues - 1 solution

---24-7--
-8-----9-
-1-------
---8----6
7--------
4-----2--
3---7----
--------2
-----6-18

With 16 clues - 96786 solutions
---24-7--
-8-----9-
-1-------
---8----6
7--------
4-----2--
3--------
--------2
-----6-18

If you don't eliminate grid (7,5)=7, but (9,6), you only get 906 solutions. I think this is what I mean, different elimination schemes will have very different results.

Posts: 62
Joined: 15 July 2005

Unique is not enough. [/quote]

I think 16 unique clues is enough if we can find them - if they are there !

There is only one solution from 16 unique clues

Of course different elimination schemes will give different results - except in the end it will be the same. The point I was making was to illustrate that high solution numbers at 15 clues doesnt preclude a 16 - if it is there of course. It does make it less likely however.
coloin

Posts: 1702
Joined: 05 May 2005

coloin wrote:I think 16 unique clues is enough if we can find them - if they are there !

Right, but if they are not there, we need to proof it.

coloin wrote:Of course different elimination schemes will give different results - except in the end it will be the same.

In the end it won't be the same. If you first reduce a full Sudoku to 19 clues, and then next time with a different scheme (or randomly) reduce it to 20 clues. They are different. And even you reduce the same full Sudoku to same number of hint Sudoku, they should be different as well.

Posts: 62
Joined: 15 July 2005

Yes
What I meant was that it doesnt matter what order you enter the same clues.

But.....they wont be there if you are looking for them in grids which they are not in !

How has Gfroyle been choosing his grids to look in ?

I have been analysing solution rates for box various box combinations - and I have more to add.

The 4 box combination of B1245 has yielded interesting results on some grids - however it is difficult to manipulate/create these as the grids are predetermined after 2 or 3 of the 9 inherent combinations - so you are left with analysing random grids for possible "good" ones - if that is what they indeed are.

Another way might well be analysing the B123456 - described by me as "crude". [B123456 - [120,156,168,180,192,228,276,288 solutions and other totals - not big at all]
But we are able to generate the 6 double gangsters of the B123456 [3 horizontal doubles and 3 vertical doubles] in each sudoku grid. Guenter has documented all the possible 3 horizontal ganster possible grids -

"60713 out of the 85184=44^3 tupels of 3 members from the list can be joined to form a valid 9*9 sudoku. "

I dont know what happens to the vertical 3 gangsters in these cases.

If we know what two combinations of Gangsters give low solution rate of 120 [is this the minimum] or 156 or 168 then perhaps we could eek out a sudoku grid with 6 of these numbers in it.

This grid might well have a very low all round solution rate.

Gfroyle 16 - this looks the best
228 sol.
168 sol.
120 sol.
168 sol.
168 sol.
228 sol.

Logic grid
228 sol.
120 sol.
156 sol.
228 sol.
180 sol.
228 sol.

Random grid
228 sol.
168 sol.
288 sol.
276 sol.
288 sol.
192 sol.

Canonical grid
1728 sol.
1728 sol.
1728 sol.
1728 sol.
1728 sol.
1728 sol.

Is there a simple way of finding out which gansters are in any one grid ?

Edit - thanks for the r47

Unfortunately for this method it doesnt work. B123456 with the same double gangster elements gives different solution totals.

Colin
coloin

Posts: 1702
Joined: 05 May 2005

dukuso wrote:So 220 of my 330 sudokus were new, but they were in grids
already known. Presumably some small changes of the clues
in a known 17-sudoku which gave "new" 17-sudokus.

Gordon
gfroyle

Posts: 214
Joined: 21 June 2005

Hi, I'm new to the forum (and relatively new to sudoku!) but I have some ideas how to go about searching for 16's.

I think that a random search is probably not going to find it, as the large list of 17's show. I would guess that this indicates that 16's have special attributes (e.g. clues going in certain squares) that a random search is unlikely to run across.

So, I would propose using a computer program to gradually build the layout, rather than randomly trying grids and then attempting to solve them. Add the clues one at a time, at each step determining the clue that most reduces the solution space.

But, counting the number of boards containing a set of clues is computationally expensive! Second, it seems to not be a great indicator of how close the clues are to forming a complete puzzle, as eliminating one clue could lead to anywhere from 2 to 200000 solutions! So, I would suggest simply counting the possibilities for each square: there are only at most 729 possibilities that need to be considered. At each step, add a clue that eliminates the most possibilities. If the solver is fast enough, hopefully you can look ahead to determine which clues are best in the long run. You can also experiment with different ways of comparing numbers of possibilities, e.g. whether a square with 8 possibilities is worse, better, or equal to two squares with 4 possibilities each.

I threw together a quick program to do this, scoring each board by the sum of the number of possibilities for each square. At each stage I choose the clue that reduces the possibilities the most. However my solver is not fast enough. Although it will solve any board with a unique solution instantly, it seems to be very slow on boards with many possiblities (or no possibilities!). (All I programmed was basic elimination--it checks each row, column, block, and square for unique possibilities. Then it resorts to look-ahead.) Boards with few clues take about 1-5 seconds to process all possibilities, so it takes a long time to determine which clues lead to the fewest possibilities. But I think it shouldn't take more than a millisecond to determine whether a board is solvable, so even one second seems slow to check all possibilities for each square... Maybe someone with a fast solver would like to consider working on this method? Or has this method already been fruitless? After making a few boards I tend to get around 20 clues, with no 17's yet.

Anyway, I am hopeful that a 16 is found. Sudoku has so many variables that mathematically, I think a proof of nonexistence is unlikely for more than 12-14 clues, so we can only hope that 16's in fact exist
originalLuke

Posts: 1
Joined: 16 August 2005

Hi, I'm new to the forum (and relatively new to sudoku!) but I have some ideas how to go about searching for 16's.

I think that a random search is probably not going to find it, as the large list of 17's show. I would guess that this indicates that 16's have special attributes (e.g. clues going in certain squares) that a random search is unlikely to run across.

So, I would propose using a computer program to gradually build the layout, rather than randomly trying grids and then attempting to solve them. Add the clues one at a time, at each step determining the clue that most reduces the solution space.

But, counting the number of boards containing a set of clues is computationally expensive! Second, it seems to not be a great indicator of how close the clues are to forming a complete puzzle, as eliminating one clue could lead to anywhere from 2 to 200000 solutions! So, I would suggest simply counting the possibilities for each square: there are only at most 729 possibilities that need to be considered. At each step, add a clue that eliminates the most possibilities. If the solver is fast enough, hopefully you can look ahead to determine which clues are best in the long run. You can also experiment with different ways of comparing numbers of possibilities, e.g. whether a square with 8 possibilities is worse, better, or equal to two squares with 4 possibilities each.

I threw together a quick program to do this, scoring each board by the sum of the number of possibilities for each square. At each stage I choose the clue that reduces the possibilities the most. However my solver is not fast enough. Although it will solve any board with a unique solution instantly, it seems to be very slow on boards with many possiblities (or no possibilities!). (All I programmed was basic elimination--it checks each row, column, block, and square for unique possibilities. Then it resorts to look-ahead.) Boards with few clues take about 1-5 seconds to process all possibilities, so it takes a long time to determine which clues lead to the fewest possibilities. But I think it shouldn't take more than a millisecond to determine whether a board is solvable, so even one second seems slow to check all possibilities for each square... Maybe someone with a fast solver would like to consider working on this method? Or has this method already been fruitless? After making a few boards I tend to get around 20 clues, with no 17's yet.

Anyway, I am hopeful that a 16 is found. Sudoku has so many variables that mathematically, I think a proof of nonexistence is unlikely for more than 12-14 clues, so we can only hope that 16's in fact exist

Hi Luke.

I'm also a newbie contributor, although I have been lurking for a while reading the collected wisdom here. I agree with your basic idea, and I've been working on something along those lines myself. A few comments:

Your method is a greedy algorithm (see Gordon's post). I think this would require a good deal of luck to succeed, although I agree that there are much better heuristics than the number of solutions, and yours looks promising. I don't think there is any "best" way to combine the scores from different cells; the most you can do is try different ways and use the one that experiment shows is best. My idea is to keep a collection of the best configurations (say those that score more than x% of the best score, where x is close to 100).

To do it my way, it's essential to prune the symmetric equivalents right from the word go. There are 9! * 6^8 * 2 = 1,218,998,108,160 ways of permuting a complete Sudoku grid without changing its underlying structure (reference: Gordon's web page), and many of these will be non-trivial permutations even for an incomplete grid. So you run the risk of generating the same problem over and over disguised in thousands or millions of permutations unless you do something to weed out the symmetries. Some recent posts here discuss how to recognise symmetric equivalence. Avoiding this is certainly one of the attractions of the 100% greedy approach.

A solver like you have described, that resorts to exhaustive recursive search after applying only single inferences (what you call basic elimination), will really struggle to perform on the more difficult problems. I wrote such a solver as my first attempt, choosing the clue with fewest alternatives for my guesses at each recursion, as the book says you should. The worst case performance was 50 times slower than I'm getting with my new solver. (My new solver can handle any inferences that lie in a single plane of the 3-D solution space, including box-row/column intersection, tuples, "fishy cycles": X-wings, swordfish, etc, and nishio, before resorting to recursion).

You can improve performance on some zero-solution problems by checking for inconsistencies as part of your singular inference processing. Any cell with no possibilities left, or any attempt to wipe out a possibility that is actually the setting for a clue cell, is a sure sign of inconsistency (or a solver bug!). But there are some inconsistencies that will only become apparent at a later stage; this also is where sharper inference techniques are useful, but I conjecture that some inconsistencies may defeat these too and require recursion to detect them.

In general, the multiple solution problems should take less time than the really tough unique solutions. This is particularly true for small numbers of clues, and for a solver like yours that just gets on with it instead of intellectualising about the problem . Are you aborting the search as soon as two solutions are found?

My solver, running in Python on a laptop, can solve really easy problems in about 25msec. The hardest ones take a couple of seconds, but there aren't many like that in Gordon's 17 clue collection of 450. A fast solver is desirable, but a good heuristic or three for pruning the search is even more important.

Jeff
Trainspotter

Posts: 2
Joined: 10 July 2005

I like Luke's idea to just count the possibilities for each square.
I haven't yet tested it,
let's see, how well it correlates to the number of solutions...
I can send my solver by email (sterten(at)aol.com) to anyone who is interested
(except those from the running programming contest ;-) )
2sec. for Gordon's 7611 puzzles to solve, 9 sec. to count
the 230400 solutions in Coloins puzzle above.
This will probably be improved when that contest is over
in 9days and the programs are made public.

But, as I said I'm pessimistic for a 16-clue sudoku
dukuso

Posts: 479
Joined: 25 June 2005

I again messed up that calculation of new 17s using Gordon's list.
Now it's only 14 new ones - I forgot to remove my filled in clues.
And then I only have the graphs ATM

OK, here is a list of 17, 3 of which are not new,
but I don't know which

000010020340000000500400000000500600020100000070000000800000500000076000000020009
010000002034000000000500600070000810800600000000900000600000400700010000000030000
000120030045000000000000000600000780100405000000900000010000200200080000000000004
010000002034000000000500600070000430200800000000600000600000500800030000000010000
100203000000000405600000000700000610080040000000000000000600030040700000052000000
120300000000400560000000070000100300006000000050000000800069000300000200000070000
010000002034000000000500600070000830800600000000900000600000400700030000000010000
100000203040500000000000000000610050207000800900000000060000094000027000000000000
000010020300400000500000000620700000000300100080000000000000503000080400060020000
010203000000000405060000000000600020500700000408000000090000070000050600100000000
100000002000030040000005000000100006000700100030000000040000350700280000000000800
100000023030400000000000050000670400500000800002000000060090700000005010000000000
100203000000000405600000000700000610080040000000000000000600030040700000052000000
000010020340000000500400000000500600020100000070000000800000500000026000000070003
000010020300400000500000000620700000000300100080000000000000503000080400060020000
010203000000000405060000000000600030500700000408000000090000070000050600100000000
100000002000030040000005000000100006000700100030000000040000350700280000000000800
dukuso

Posts: 479
Joined: 25 June 2005

Good to welcome more entheusiasts.

I too like the idea of removing all the symetrical and permutted equivalent grids and others effectivly similar from the solution total. What total do we think this is ?

We are searching for one grid which has a unique combination of 16 clues.

As we have said before
1 Find the grid - not easy
2 Find the 16 - even less easy

Note I think it is not the effect of one clue which is important - it is the "synergistic" effect of all 16 clues which ultimatly identifies the grid.

Recently I posted a grid with pairs/chains and with 3 more "important" clues I was able to define the grid in question.
This is the solutions before the final clue - reducing it from eleven to one solution[the top one] the 9 in B1 [11th number along] identifies it as unique from the others.

832591674496387251571264983185746392267953418943812765714638529329175846658429137
832591674946387251571264983185746392267953418493812765714638529329175846658429137
832591674946387251571264983185746392763952418294813765417638529329175846658429137
892531674436987251571264389185746932267359418943812765714698523329175846658423197
892531674436987251571264389185746932769352418243819765914678523327195846658423197
892531674436987251571264389185746932967352418243819765714698523329175846658423197
932581674846379251571264389185746932763952418294813765419638527327195846658427193
932581674846379251571264983185746392763952418294813765419638527327195846658427139
932581674846397251571264983185746392267953418493812765714638529329175846658429137
932581674846397251571264983185746392763952418294813765417638529329175846658429137
932581674846937251571264389185746932267359418493812765714698523329175846658423197

If only we could get a program to do this on a multiple clue basis on a grand scale !!

Im starting to get pessimistic too

Colin
coloin

Posts: 1702
Joined: 05 May 2005

PreviousNext