Grid containing a 20 and no 19

Everything about Sudoku that doesn't fit in one of the other sections

Postby gsf » Thu Jan 19, 2006 5:14 pm

Wolfgang wrote:Unfortunately i dont have any (time or hardware) resources for testing something time consuming. This is one example of an invalid sudoku, for which my solver needed 1000 times longer than for an average sudoku:
Code: Select all
050000610090000400020500000000006700000000000000100040004000070003000000000025090

The clues are that "lose" that my solver needs an immense amount of backtracking to verify that it has no solution.
If your solver also takes a long time, you maybe should count the valid and invalid sudokus you have to check and look at the average times needed.
I have searched regions with hundreds of such time consuming sudokus.

using only the 2 simplest constraints requires a lot of backtracking (465,061 guesses in my solver)
if you add in tuple constraints it is much better (13 guesses in mine)
with a 1 level forward check and the 2 simplest constraints there is no guessing (for this puzzle) and solving speed is about 1400/s/Ghz

I'll be posting source by week's end, maybe that can contribute to the search
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Postby Wolfgang » Thu Jan 19, 2006 6:29 pm

Yes, right, for those "exotic" sudokus more constraints really speed up the calculation enormously. But i think, in average you are faster with 2 or 3 constraints (my 3rd is to check if there is still a cell for each number in each unit) - without having verified it. And i dont know in advance, of which type a sudoku is. Maybe one should switch to more constraints, if a certain backtracking level is reached.
BTW, has someone implemented Donald Knuth's Dancing Links Algorithm for sudokus ?
Wolfgang
 
Posts: 208
Joined: 22 June 2005

Postby gsf » Thu Jan 19, 2006 7:18 pm

Wolfgang wrote:Yes, right, for those "exotic" sudokus more constraints really speed up the calculation enormously. But i think, in average you are faster with 2 or 3 constraints (my 3rd is to check if there is still a cell for each number in each unit) - without having verified it. And i dont know in advance, of which type a sudoku is. Maybe one should switch to more constraints, if a certain backtracking level is reached.
BTW, has someone implemented Donald Knuth's Dancing Links Algorithm for sudokus ?

2 constraints with 1 level forward check nails allmost all of the 9x9 sudoku I've scraped from the forums without backtracking
even then the ones that require backtracking generate backtrack node counts in the single digits
so I expect ~1000 to ~8000 /sec/Ghz solution speed for all inputs (stopping at first solution or no solution)

there are many refs to DLX (dancing links) w.r.t sudoku
the knee jerk reply suggests it is the best way to solve sudoku
but Knuth or not, I think it may be overkill for 9x9 sudoku, and possibly for higher order sudoku and QWH too
a grid/backtrack state with no links is so small that wholesale copies for new levels may nullify all that pointer fiddling

I guess this is a challenge
anyone with a fast DLX solver?
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Postby coloin » Sat Jan 21, 2006 7:04 pm

There is still much that is not understood regarding sudoku grids.

The invalid sudukus whoch halt solver program is an interesting phenomenon...... perhaps a valid sudoku is lurking near ?

However it cant be the cause of the disparity in Checker - as it is checking through clues from a single valid grid.

It is a shame that checker will take 137 days to get an 18 from grid 3 ! - and that is with one fixed clue.

In Checker is it possible to stategicly fix 5 or so clues ???? this would bring the time down to generate a positive, althought certainty would be lost for a negative.

Any one progressing with the set cover program ?
C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby gsf » Sat Jan 21, 2006 7:13 pm

coloin wrote:It is a shame that checker will take 137 days to get an 18 from grid 3 ! - and that is with one fixed clue.
C

I haven't looked at the checker innards
is it a solver in the innermost loop?
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Postby coloin » Sun Jan 22, 2006 6:16 pm

I think so.

Here are the clue completions for the random grids.
Thanks to Ocean for the 19 in No. 5 and trying in vain No. 3.
No 1
Code: Select all
123749568456183927789625431238567149561498273947231685874952316615374892392816754
2 19s
...7..5..4....39...8...5...2...6..4.....9.2....7.........9.....6.5......3...1...4
...7..56.4.....9...8..25...2......4.....9......7...6.....9.......5.....23...1...4

No 2
Code: Select all
123849567456217839789356124642598713971463285538172946397681452865724391214935678
1 19
..3.4.....5..1..39.......2....5..7.39..........8.........6..4.....7.....21...56..

No 3
Code: Select all
146328975785196342329457861493561728218973654657284139564839217832715496971642583
2008 + 19s
3 18 clue pseudopuzzles
+---+---+---+
|1..|..8|...|
|...|.9.|...|
|.2.|...|.6.|
+---+---+---+
|.9.|...|.2.|
|...|.7.|...|
|...|.8.|1..|
+---+---+---+
|.6.|...|..7|
|8..|..5|4..|
|..1|6.2|...|
+---+---+---+ 2 grid solutions and 19 clue completions

+---+---+---+
|...|..8|9..|
|7..|...|3.2|
|.2.|.5.|...|
+---+---+---+
|...|.6.|...|
|..8|...|.54|
|..7|2..|...|
+---+---+---+
|.6.|...|2..|
|...|7.5|.9.|
|...|...|...|
+---+---+---+2 grid solutions and 10 grid completions

+---+---+---+
|.4.|3..|...|
|...|...|...|
|..9|.5.|86.|
+---+---+---+
|...|...|7..|
|.1.|...|..4|
|6..|.8.|...|
+---+---+---+
|...|...|.1.|
|8.2|7.5|...|
|.7.|...|..3|
+---+---+---+2 grid solutions and 6 grid completions

Confirmed by checker320 not to have an 18 clue puzzle !

No 4
Code: Select all
347981256582476193169523874896245317754318962213769485925137648478692531631854729
1 18
3...8.2.........9....5...748.6...3..............7.9..5....3.6...7.........1..4...

No 5
Code: Select all
325681974641379258789425613134968725962537841857142396416293587273854169598716432
2 19s
...6...7.....7.....8...5..3..4.68...........1..7.4..9..1...3..7...8...6.......4..
32...........7.....8...5..3..4.6.7..........1..7.4..9..1.2.3..........69......4..

No 6
Code: Select all
416235987275689431983147625391728546847516392562394718159873264624951873738462159
3 19s
..6.3...7.....94.....1....5...7..........639.5......1...9..3.6..........7.84.....

No 7
Code: Select all
856923714731456298492817356348291675925764183167385429273548961684179532519632847
3 19s
...9.......1.....84...1.3.....2..6..92.7......6....4....3.4.....8......2....3...7
...9.......1......4..81.3.....2..6..92.7......6....4....3.4.....8......2....3...7
...9.......1......4...1.3.....2..6..92.7......6....4....3.48....8......2....3...7


C
Last edited by coloin on Tue Jan 30, 2007 5:37 pm, edited 4 times in total.
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby Ocean » Sun Jan 22, 2006 7:06 pm

Just some supplementary stats for grid 3:
146328975785196342329457861493561728218973654657284139564839217832715496971642583

Code: Select all
Frequency count (clue occurences in 1058 different 19s):
*-----------------------------------------------------------------*
| 322 1  252 4  439 6 |  69 3  238 2  203 8 |  15 9  116 7  421 5 |
| 542 7   50 8  220 5 |  53 1  858 9   67 6 | 133 3  109 4   42 2 |
|  48 3  188 2   56 9 |   3 4  148 5   69 7 | 237 8  873 6  602 1 |
|---------------------+---------------------+---------------------|
|  17 4  154 9   48 3 | 185 5   69 6   45 1 | 600 7  739 2  104 8 |
|  10 2  820 1   85 8 | 157 9  438 7   31 3 | 112 6   21 5  335 4 |
| 128 6  568 5   83 7 |  97 2  774 8   48 4 | 247 1   76 3   99 9 |
|---------------------+---------------------+---------------------|
| 238 5  336 6  103 4 | 278 8   31 3  419 9 | 135 2  259 1  189 7 |
| 913 8  680 3  229 2 |  52 7  157 1  247 5 | 706 4  170 9   77 6 |
|  86 9   31 7  192 1 | 860 6  109 4  678 2 | 240 5   14 8  210 3 |
*-----------------------------------------------------------------*

Most frequent is the '8' in r8c1 (913 times), and least frequent is the '4' in r3c4 (3 times).

Three 'disjoint pairs' of 19s (from a total of 30 such pairs found in grid 3). No common clue, but identical solutions:
#1:
006000005700006000000050001000000720010000600050280000000009000830700400000000080
040020000000090000020000860003000000000070004600000000500800010002005090001600003
#2:
006020000000090040000000800000000000010970004000000039500800010802005000000600003 100000005700000000000050061000001720008000000050080000060009000030700400900002000
#3:
006020070080000000000057001000000000010070004000000009500809010002000000000600083 100008000000090000020000060000060020000000050000080130060000007800015400070002000

The nine unavoidable sets of size 4:
000000000000000000000000000000000000208000000000000000000000000802000000000000000
000000000000000000000000000000000000000000000650000000560000000000000000000000000
046000000000000000000000000000000000000000000000000000064000000000000000000000000
000000000005090000009050000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000060700000070600000000000000000000000000000000000000
000000000000000000000000000000000000008070000007080000000000000000000000000000000
000000000000000000000000000000000000000000000000000000004000200002000400000000000
000000000000000000000000000000000000000000604000000000000000000000000406000000000
000000000000000000000000000000000000000000000000000000000000000800000090900000080

[Edit: Some nonsense removed]
Last edited by Ocean on Sun Jan 22, 2006 11:49 pm, edited 2 times in total.
Ocean
 
Posts: 442
Joined: 29 August 2005

Postby Moschopulus » Sun Jan 22, 2006 11:35 pm

coloin wrote:In Checker is it possible to stategicly fix 5 or so clues ????


We will add this soon, when we find the time.

gsf wrote:I haven't looked at the checker innards
is it a solver in the innermost loop?


Yes. Checker enumerates all possible sets of n clues (if you search for an n clue puzzle) that hit all unavoidable sets in the collection, and runs each one through a solver counting 1 or >1 solutions. We use dukuso's solver.
A faster solver would speed it up, although dukuso's is pretty fast I think.

Actually some sets are enumerated twice. Couldn't think of a good way to do it. Another way to speed it up is improve this part.

Ocean wrote:What is the correlation between occurence of clues and the unavoidable sets, if any?

Other than the obvious one, there must be a clue from each unavoidable, I don't know. Not sure what you mean.
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Postby Ocean » Mon Jan 23, 2006 12:21 am

Moschopulus wrote:
coloin wrote:In Checker is it possible to stategicly fix 5 or so clues ????
We will add this soon, when we find the time.

This is a good idea! The advantage of fixing a number of clues, is that it allows to split a search between machines, between persons, or over time - and also make a priority search. Fixing N clues can be done in X = 81!/((81-N)!*N!) ways, so time saving will be substantial. For a complete search the program must be run X times though.
Ocean
 
Posts: 442
Joined: 29 August 2005

Postby Moschopulus » Mon Jan 23, 2006 11:44 am

Ocean wrote:This is a good idea! The advantage of fixing a number of clues, is that it allows to split a search between machines, between persons, or over time - and also make a priority search. Fixing N clues can be done in X = 81!/((81-N)!*N!) ways, so time saving will be substantial. For a complete search the program must be run X times though.


You can fix one clue already. We did that for precisely the reason you say, so that the search for a 16 in SF could be divided into four pieces. Each piece takes 1-2 days currently for SF.
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Postby coloin » Mon Jan 23, 2006 3:01 pm

Thanks Ocean...I was wondering what the clue distribution would be. Perhaps a way forward would be to find the best 10-12 clues which always occur together....perhaps you have already done this. There is no guarentee that an 18 [if there is one] will have these clues....but it makes sense if it does. Interestingly, the 18 clue "pseudosudoku" [described above] only has 8 of these "top" clues !

Fixing several clues will speed up checker....but I agree there is a problem deciding what the "best" 5 clues should be in this case. [In other grids it would be easier to guess correctly]

Your clue distribution for 19 clues is similar to the data generated when you use the program suexsf.
http://magictour.free.fr/suexsf.exe
However this analyses the clue distribution of randomly generated 22 clues so yours is much more specific.

All these clues here are above 500 in your frequency chart.
Code: Select all
+---+---+---+
|...|...|...|
|7..|.9.|...|
|...|...|.61|
+---+---+---+
|...|...|72.|
|.1.|...|...|
|.5.|.8.|...|
+---+---+---+
|...|...|...|
|83.|...|4..|
|...|6.2|...|
+---+---+---+ 14 clues =   248   clue completions with 5 extra clues



It is time limiting to go back furthur than 12 clues +6/7. with suexmult http://magictour.free.fr/suexmult.exe ....but I will run it if I get time.

Suexf could be improved .......fixing clues could be easier....at present you have to omit from the text line 3 clues out of a 4 set and this fixes the clue ! [But I cant do it any other way]

Usually a puzzle with less clues becomes evident when you trail through the iterations..but not here. In the sf if you use suexsf program the best 14 clues spring out immediatly. http://forum.enjoysudoku.com/viewtopic.php?t=2312&postdays=0&postorder=asc&start=15 You normally expect an 18 clue puzzle to have a few extra 19s around its backbone - and this manifests as extra clue completions.

Perhaps we shouldn't be so surprised by this, fascinating though it is but... A lot of our unanwered questions regarding the "why"s regarding the minimum number of clues could be answered by the answer "because thats just the way it is"

Code: Select all
There will be no grids with a 16 !

There will be grids with many 17s
There will be grids with a few 17s
There will be grids with only one 17

There will be grids with a few 18s etc

There will be a few grids with many many 19s...........grid 3 doesnt have an 18 ......  yet.
There will be many many grids with a few 19s !
There will be a few grids with only one 19

There are a few grids with only 20s .......the title of this thread !
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby Ocean » Wed Jan 25, 2006 9:32 pm

coloin wrote:Thanks Ocean...I was wondering what the clue distribution would be. Perhaps a way forward would be to find the best 10-12 clues which always occur together....perhaps you have already done this. There is no guarentee that an 18 [if there is one] will have these clues....but it makes sense if it does. Interestingly, the 18 clue "pseudosudoku" [described above] only has 8 of these "top" clues !

Fixing several clues will speed up checker....but I agree there is a problem deciding what the "best" 5 clues should be in this case. [In other grids it would be easier to guess correctly]

Your clue distribution for 19 clues is similar to the data generated when you use the program suexsf.
http://magictour.free.fr/suexsf.exe
However this analyses the clue distribution of randomly generated 22 clues so yours is much more specific.


Found an additional 18-clue pseudopuzzle in grid 3 (2 grid solutions, 10 clues completions). Here are the two shown together:

10: 000008900700000302020050000000060000008000054007200000060000200000705090000000000
19: 100008000000090000020000060090000020000070000000080100060000007800005400001602000

The two pseudos have only four clues in common.

The new frequency diagram for 19s shows some surprising results (compared to the previous one), indicating that new 'dense regions' were reached. (How is this compared to suexsf?)
Code: Select all
Frequency count (clue occurences in 2008 different 19s):
*---------------------+---------------------+---------------------*
| 339 1  284 4  486 6 | 101 3  292 2 1075 8 | 234 9  295 7  592 5 |
| 948 7   65 8  603 5 |  68 1 1213 9  337 6 | 344 3  513 4  206 2 |
| 103 3  777 2   83 9 |  11 4  480 5  305 7 | 294 8  916 6 1024 1 |
+---------------------+---------------------+---------------------+
|  59 4  175 9   78 3 | 601 5  652 6   76 1 | 962 7  835 2  307 8 |
|  48 2 1191 1  836 8 | 309 9  530 7  123 3 | 243 6  489 5  799 4 |
| 182 6  655 5  322 7 | 980 2  819 8   56 4 | 623 1  165 3  142 9 |
+---------------------+---------------------+---------------------+
| 456 5 1076 6  315 4 | 338 8  205 3  542 9 | 748 2  350 1  228 7 |
|1229 8 1264 3  253 2 | 667 7  310 1  547 5 | 974 4  412 9  201 6 |
| 215 9   41 7  288 1 | 943 6  168 4  722 2 | 721 5  358 8  336 3 |
*---------------------+---------------------+---------------------*

Counting the number of disjoint pairs (pairs of 19s having no clues in common) now seems meaningless (6087 such pairs were found among the 2 million pairs, or 0.3%. 883 of the 2008 19s were members of at least one such pair, the most 'solitary' being a member of 133 disjoint pairs.)
Last edited by Ocean on Thu Feb 02, 2006 7:44 pm, edited 1 time in total.
Ocean
 
Posts: 442
Joined: 29 August 2005

Postby coloin » Thu Jan 26, 2006 2:08 pm

Thanks for the hard work put in in this analysis.
Excellent to get another pseudo puzzle.

We have here a grid with two "highish" regions.....amongst many not so high........

I think the large number of 19s "clouds" the value of the clue frequency analysis. This clouding is even more pronounced in suexf program. [The reason is that suexsf looks at 22 clue sudokus - it is difficult to randomly generate minimal sudokus less than 22 clues - so the initial suexsf analysis is very flat - all the clue frequencies are the same !]

If there was an 18 the popularity of the clues does not get bourne out in the statistics because there are so many other 19s which bury any increased clue frequency.

I am not able to fix clues in suexsf [dukuso wrote it and has cleverly moved on to more worldly and worthwhile tasks !] without omiting other lesser clues in the same small unavoidable set - but I know it is possible.

I will try it. The 4 clues in both pseudos will be the start.

From your figures what is the clue which occurs most frequently with these 4 clues ?

Fixing these 4 clues in checker and waiting might be the easier answer.

No guarentee of course that these 4 clues are right !
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby Ocean » Fri Jan 27, 2006 3:49 pm

coloin wrote:I will try it. The 4 clues in both pseudos will be the start.

From your figures what is the clue which occurs most frequently with these 4 clues ?

Not sure if I understand your question right, but try this:

Among the set of 2008 19s, there are 258 19s containing all those four clues (in pos.6,20,56,69). The clue occuring most often together with those four, is the '2' in pos. 39 (=r5c3), occuring 215 times. (The next occur 210, 188, 180, 172, 172, 164 147, 109, 106. 103, 98, ... times.)

Code: Select all
Here is complete stat:
+------------------+------------------+------------------+
|  51 1   1 4   - 6|   3 3   8 2 258 8|  67 9  67 7 109 5|
| 103 7   - 8  41 5|   3 1 164 9  49 6|  71 3  98 4  42 2|
|  39 3 258 2  19 9|   2 4  49 5   1 7|   3 8  50 6  38 1|
|------------------+------------------+------------------|
|  17 4  51 9  15 3|  53 5 172 6  24 1|  28 7  53 2  56 8|
|   1 2  16 1 215 8|   7 9  34 7   6 3|  54 6 172 5 106 4|
|   2 6   4 5  45 7| 210 2  44 8   3 4| 147 1  50 3   4 9|
|------------------+------------------+------------------|
|   - 5 258 6  27 4|  11 8  50 3   5 9| 188 2   4 1  55 7|
|  84 8  79 3   - 2| 180 7  16 1 258 5|  83 4  93 9  20 6|
|  52 9  10 7  70 1|  39 6  11 4  44 2|  21 5  38 8  23 3|
+------------------+------------------+------------------+

PS. Almost found a disjoint sudoku-triple for grid3, but not quite. Found four triples where only one single clue occurs twice. Listing one of them:
#1
000000900000090340020000000000060000008000050057200000004800017000005006900002000
046000000000000000000057800000000700010900004000080100500030000802000090000640003
000028070005006000000000001000500000208003600000004000060000200030710000000000580
#
Ocean
 
Posts: 442
Joined: 29 August 2005

Postby coloin » Fri Jan 27, 2006 6:11 pm

Yes the 5th clue will be useful, presumably the clues which follow will be similar [but ideally one cant assume]

I will suexmult it with the 4 plus the 8 next best clues over the weekend......

I tried it with the 4 clues and the 14 from each pseudopuzzle - no 18, no new 19 either.

I apprececiate your efforts.......the fact that 19s are all over this grid makes you confident we will find an 18 ?

I will restart a general sweep with suexsf [which I havnt done] [usually you scan the one which has the lowest suexsfl [suexsfl] value

I do this by finding the 14 or 15 best clues by individually scanning the 9 text files of this grid where each file has one clue number removed - this forces the grids to be generated over one region I think. It seems to work.

[command line: suexsf [file.text] seed 5001]

[seed 5001 - it is divisable by 3 therefore it counts 22s]

[As an example try it with the SF grid with the 5s removed]
Code: Select all
63924178.28476.193.179836241238.79467964328.14.8619237342178.69861.9437297.326418

[suexsf sf.txt seed 5001 and see what happens !]

Regards
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

PreviousNext

Return to General