## Minimum number of clues

Everything about Sudoku that doesn't fit in one of the other sections
gfroyle wrote:But of course to most people "symmetric" means that it should have a NICE symmetry, like a rotation or maybe a reflection, or maybe both... In general, I think that a group element has to have very few fixed points to qualify as a "nice" symmetry.

In this case we are (currently) out of luck

OK. It's been my cursory observation that puzzles which have "nice" symmetry rarely, if ever have low (sub 20) starting counts too.

gfroyle wrote:Now, just for MY curiousity - if I send you the file of 1000+ puzzles, can your program quickly determine which of them are interesting games-wise?

It would need some minor modifications, but then it would quite easily generate a report on the degree of difficulty of each puzzle. However, please don't send them just yet .
angusj

Posts: 306
Joined: 12 June 2005

Nick70 wrote:
gfroyle wrote:308 x (1^3, 2^3, 3^3)
301 x (1^2, 2^6, 3^2)

Those two would imply 18 and 20 clues respectively. The last one 10 numbers even instead of 9.

BTW the notation is really confusing from a mathematical point of view, something like this would be better:

79 x (1x0, 2x1, 3x2, 3x3)

Blah... just shows that I should NEVER try to transcribe something by hand. Here is the output from my program which should be clear in all ways.. the numbers have changed slightly because more new 17s have been found in the interim...

Code: Select all
`     80 0 1 1 2 2 2 3 3 3     303 0 1 2 2 2 2 2 3 3      58 0 2 2 2 2 2 2 2 3       1 1 1 1 1 1 3 3 3 3      28 1 1 1 1 2 2 3 3 3     308 1 1 1 2 2 2 2 3 3     303 1 1 2 2 2 2 2 2 3 `

The notation that I was attempting to use (2^3) is actually the normal mathematical notation for a multiset - a "set" which can contain some objects more than once. In a printed maths paper, the 2^3 appears as a 2 with a superscript 3, thus indicating "2 occurs 3 times".

But I agree that it doesn't translate well to a more general forum.

Thanks for spotting the error..

gordon
gfroyle

Posts: 214
Joined: 21 June 2005

gfroyle wrote:Things that are "different" from all the others are often interesting mathematically, so here is the single example of type (1^5, 3^4) - notice that there is one each of {1,3,4,8,9} and three each of {2,5,6,7}

Code: Select all
`+ + +  + 6 +  + 3 ++ + +  8 + +  2 + +2 + +  5 + +  + + ++ + +  2 9 +  5 + ++ 7 6  + + +  + + ++ + +  + + +  1 + ++ + +  + + 7  + 6 45 + +  + + +  + + ++ + +  + + +  + 7 +`

How does this rate as an actual puzzle? (I have not yet had time to code up all the "rules" and figure out how to assign a rating to a puzzle...). Probably it cannot be done just using the rules without some backtracking...

Actually, you're wrong - it's a nice, easy little puzzle. Did it using a selection of the more simple logical techniques, no backtracking in sight. :)

EDIT: Briefly starting the puzzle over, I recognise that this is essentially because the first steps - increasing the numbers of 7s, 5s and 2s, of which there already are a few - are pretty easy. Once we have a few of those, it gets easier and easier.
Hammerite

Posts: 44
Joined: 20 June 2005

### 17 going on 16 !!!!!!!!!!!!!!!!!!!

gfroyle wrote:By now, I have collected 1075 different (ie genuinely different, not just relabellings or rearrangements) uniquely completable 17-hint grids, and about 78000 uniquely completable 18-hint grids.

Thanks - that is interesting work.

Supplementary Qs

Are some of these 17 - hint grids not doable if you row swap/column swop ? [this wouldnt make sense to me]

Has a grid only one way to do a 17 or can a grid have several 17s in it ?

If there are several 17s in any one grid then this increases the liklihood of a 16 - perhaps ?

Regards
coloin

Posts: 1714
Joined: 05 May 2005

I believe I can answer the first two...

The first one: No, if a starting grid has a unique solution then on swapping rows or columns without moving them between boxes, or on swapping entire rows or columns of boxes, the resultant starting grid will still have a unique solution. Perhaps you can see why just by imagining in your mind's eye what happens when a single one of these things, say interchange of two columns, happens - the logical processes valid for the rows are unchanged, and certainly those valid for the columns are too. The puzzle remains essentially the same.

The second one: Yes, gfroyle has mentioned (on page one or two, I forget which) that three of his original 12 17-hint starting grids, which are mathematically distinct, each have the same one final grid for their respective unique solutions.

(The third one: Well, maybe, but I don't really feel confident to anwer that.)
Hammerite

Posts: 44
Joined: 20 June 2005

Hammerite wrote:Actually, you're wrong - it's a nice, easy little puzzle. Did it using a selection of the more simple logical techniques, no backtracking in sight.

I hadn't even tried to actually solve it manually, but was just assuming - obviously incorrectly - that "fewer clues make harder puzzles". It is interesting that this is not the case..

In fact, maybe for these 17s, it might be the case that MOST of them are easy, because maybe there needs to be a lot of stuff forced quickly in order for them to avoid multiple completions...

Must code up some of the simpler known techniques and see how they perform on these puzzles..

Cheers

gordon
gfroyle

Posts: 214
Joined: 21 June 2005

### 17 going on 16

Can someone write a program to sort this question out ?

take a valid grid [81 numbers]
randomly get rid of 65 [81-16] numbers [perhaps always using at least 8 different numbers in at least 7 different boxes]
check if it gives a unique solution - guesses not allowed - unless it invalidates.

Why not start with the 81 grid which has the 3 different 17 hint grids

Might this work if you set it running all weekend ! ?
coloin

Posts: 1714
Joined: 05 May 2005

### Re: 17 going on 16

coloin wrote:Can someone write a program to sort this question out ?

Might this work if you set it running all weekend ! ?

You need some idea of the scale of the undertaking... given the 81 numbers in a particular grid, there are 33594090947249085 different ways of selecting 16 of them (with no restrictions). Obviously for the resulting puzzle to have a chance of having a unique solution, we can avoid all the subsets of size 16 that have (a) fewer than 8 distinct numbers, or (b) two empty rows or columns in the same band. This would reduce the number dramatically, but my guess is that it would still be way too high to explore more than a minuscule fraction of them in a weekend.

Then of course, we have the problem that the grid chosen may simply not have ANY 16-hint uniquely completable subgrids...

As an aside, I now have just under 2000 uniquely completable different 17-hint puzzles and just over 200000 (ie 2 x 10^5) uniquely completable 18-hint puzzles, none of which contain a uniquely completable 16-hint puzzle...

Still working on it!

Gordon
gfroyle

Posts: 214
Joined: 21 June 2005

### Greedy algorithm

coloin wrote:Can someone write a program to sort this question out ?

Might this work if you set it running all weekend ! ?

I'd already tried this a while back for symmetrically-placed clues (from a particular grid) and it was hopelessly slow. Nice idea though.

How about an alternative greedy strategy. Suppose you have a measure, which will have to be quite naive, of how "open" a partially-clued grid is: the measure should be small when the clues admit a unique solution and large when they admit many (e.g. when there are no clues!). Now pick a candidate grid to work on. Repeatedly make the move that "closes up" the grid as much as possible until you hit a unique solution. Report the number of clues used, then try another grid.

In practice, the repeatedly-make-a-move stage might want to look one or two moves ahead, especially around the 15-clue point. It should probably also make an effort to try all equally-constraining moves at each level of the search, making this a backtracking algorithm, but ... we can worry about the details another time.

So, what grids should we pick? It's already been pointed out that we shouldn't waste too much time on any single grid in case it simply doesn't have the sparse puzzle we're looking for. Also, we'd like to try as many different "types" of grid as possible, whatever that means, to help us identify the most promising ones. So I'd suggest we try transformation-invariant grids from the 27 or so interesting classes in Frazer's list at http://www.sheffield.ac.uk/~pm1afj/sudoku/sudgroup.html.

This could be quite good fun. If someone codes up the framework to do the search algorithm then we could try all sorts of different suggestions for the "openness" measure and see which ones perform best. One such measure might be the product of the sizes of the nine per-digit template lists (to use an example from my own code); or maybe something based on Knuth's insights (http://en.wikipedia.org/wiki/Talk:Sudoku#Don_Knuth.27s_Insights) or ... well, why don't you suggest something!
Red Ed

Posts: 633
Joined: 06 June 2005

### Re: Greedy algorithm

Red Ed wrote:One such measure might be the product of the sizes of the nine per-digit template lists (to use an example from my own code);

What are your "per-digit template lists" ?

Cheers

Gordon

PS I have more to say about this idea, but am in the middle of an experiment which is not yet finished..
gfroyle

Posts: 214
Joined: 21 June 2005

### Re: Greedy algorithm

gfroyle wrote:What are your "per-digit template lists" ?

A template for the digit 1, say, is any valid layout of the 1s in the grid; so 46656 templates in a list for an empty grid, or 5184 if you happened to have fixed the position of one of the digits.

I'm starting to wonder if this isn't completely the wrong approach though, as Guenter is doing some pretty amazing things over on the grid enumeration thread ...
Red Ed

Posts: 633
Joined: 06 June 2005

### Re: Greedy algorithm

Red Ed wrote:I'm starting to wonder if this isn't completely the wrong approach though, as Guenter is doing some pretty amazing things over on the grid enumeration thread ...

I think that the two problems (enumerating complete grids, and finding the minimum number of clues) are quite different in nature, so that as far as I can tell progress in the former problem will not help the latter.

Of course, it is entirely possible that Guenter, or someone else, will come up with some new ideas for the minimum clue problem, but it will require something pretty spectacular for a proof.

I firmly believe that there are 16s out there somewhere, but they are incredibly sparse... I now have over 350000 distinct 18-clue uniquely completable puzzles and about 2500 17-clue ones. A ratio of 100:1 from 18 to 17, but of course this is likely to be an artefact of my construction procedures and not a legitimate value.

Here is a question: take a valid grid G, and consider the proportion of m-hint subgrids that are uniquely completable.... Let p_G(m) be this function (so in other words p_G(0) = 0, and p_G(81) = 1). Can we say anything at all about this function? Is it significantly different for different grids?

By randomly selecting a largish number, say 10^7, random 27-hint subgrids and checking how many are uniquely completable (usually very few), we can get an estimate for p_G(27). If we have two grids G and H where p_G(27) > p_H(27), then can we conclude that G is a "more fertile" source of uniquely completable subgrids of other sizes, and thus focus more attention on that grid?

Cheers

Gordon
gfroyle

Posts: 214
Joined: 21 June 2005

### 17 going on 16

Gfroyle - You are triying hard for the 16 - keep going.

However the minimum hint problem - if it were going to be relevant to the calculation of the number of grids - has one "slight" flaw.

We know that one particular grid can be shown to have at least 3 distinct 17-hint puzzles. These 17 of course each have 64*[17 + 1 superflurous] 18 - hint grids and [64*63] 19 hint grids ..... and so on . So there isnt a great deal of point in trying to calculate these numbers - for one grid !.

However , if every grid has ONE lowest-hint puzzle then each minimum hint puzzle represents uniquely one grid. I think this must be extremely unlikely however.

Trying to find a 16 - Your work and Red Eds suggestions will be revealing.

Number crunching by filling in the first 16 numbers of, for example, the 17-hint final grid will not work even if you limit it by
-number frequency allocation of the 16 numbers :1x2,2x4,2x3.
-box allocation - 111222223 numbers in each box.
it doesnt reduce it significantly,I thought it did for a while but unfortunately it doesnt.

I await
coloin

Posts: 1714
Joined: 05 May 2005

> gfroyle wrote:
> Red Ed wrote:
>
> I'm starting to wonder if this isn't completely the
> wrong approach though, as Guenter is doing some pretty
> amazing things over on the grid enumeration thread ...
>
>
> I think that the two problems (enumerating complete grids, and
> finding the minimum number of clues) are quite different in
> nature, so that as far as I can tell progress in the former
> problem will not help the latter.

a low number of classes might not help in your hunting for a
16-clue-sudoku. But it could be useful when we are trying
to prove that no such thing exists.

> Of course, it is entirely possible that Guenter, or someone
> else, will come up with some new ideas for the minimum clue
> problem, but it will require something pretty spectacular for
> a proof.
>
> I firmly believe that there are 16s out there somewhere, but
> they are incredibly sparse... I now have over 350000 distinct
> 18-clue uniquely completable puzzles and about 2500 17-clue

I tried it I only got about one 18-clue per hour.
How do you do it ?

> ones. A ratio of 100:1 from 18 to 17, but of course this is
> likely to be an artefact of my construction procedures and not
> a legitimate value.

the 17 to 16 ratio seems to be much smaller than that,
doesn't that indicate that there is no 16 ?

> Here is a question: take a valid grid G, and consider the
> proportion of m-hint subgrids that are uniquely
> completable.... Let p_G(m) be this function (so in other words
> p_G(0) = 0, and p_G(81) = 1). Can we say anything at all about
> this function? Is it significantly different for different
> grids?

I would assume this, since the 44 chute-classes don't appear
with same frequency in your 17s

> By randomly selecting a largish number, say 10^7, random
> 27-hint subgrids and checking how many are uniquely
> completable (usually very few), we can get an estimate for
> p_G(27). If we have two grids G and H where p_G(27) > p_H(27),
> then can we conclude that G is a "more fertile" source of
> uniquely completable subgrids of other sizes, and thus focus
> more attention on that grid?

probably. But I don't expect that it will produce 1000 times
more 18s, probably only 2 times more or 10 times more.
So this would not be too useful.

But assume you fix the grid G, can you calculate the minimum
number of clues required for this grid ?
If not exactly, then with good probability ?
Or a good lower bound ?
Or proof that no 16-clue exists for this grid ?

How long does it take ?

OK, here is the idea how to further reduce the 5e9 classes
to be examined :

define 2 grids as clue-equivalent, when they are RedEd-equivalent
or when one can be obtained from the other by permuting
symbols in a proper subsudoku or size 2,3,..,8.

A subsudoku S of size k of a sudoku G is a set of k*k cells from G,
k or zero cells from each row,column,block,symbol.

How many classes do we get ?

When we have such a subsudoku S then
minclue(G)=minclue(S)+minclue(G-S)

e.g.:

146 928 375
593 617 842
872 435 916

721 359 684
968 274 531
435 186 297

257 891 463
384 762 159
619 543 728

the 4 cells marked "*"

146 928 375
593 617 842
872 435 916

*21 359 68*
968 274 531
*35 186 29*

257 891 463
384 762 159
619 543 728

form a subsudoku of size 2:
74
47

One clue must be be in one of the "*" cells to guarantee
a unique solution. Else we could swap the symbols in the subsudoku
for a 2nd solution.

I would concentrate on sudokus with only few subsudokus
when I wanted to find a 16-clue-sudoku !
We could walk through the list of the 5e9 classes
and pick those without subsudokus...

Guenter.
dukuso

Posts: 479
Joined: 25 June 2005

### a fertile grid??

Here is an interesting collection of 29 17-hint puzzles... what do they have in common?

http://www.csse.uwa.edu.au/~gordon/sudokupat.php?cn=3

Cheers

Gordon

Guenter - I intend to reply in depth to your last comments, but it is late now (here in Australia)
gfroyle

Posts: 214
Joined: 21 June 2005

PreviousNext