## Structures of the solution grid

Everything about Sudoku that doesn't fit in one of the other sections
daj95376 wrote:Is there a typo where RW wrote r3c4 instead of r2c4 ???

ha -- I glossed over the typo, then forgot about it, and then quoted it for posterity
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

RW wrote:Apparently it then might make a difference and this could provide an approach to finding grids with low/high clue puzzles. First construct grids with high/low numbers of fully entwined pairs, then scan those grids for puzzles. I'm also wondering if the entwined grid would make it more likely for extremely difficult puzzles to appear. I tried applying the diagonal pattern to the grid you posted and this is how close I got:

Code: Select all
`5 . 2|. . 9|. . .. . .|. 1 .|. 5 .1 . .|4 . .|8 . .-----+-----+-----. . 8|. . 4|. . 7. 7 .|. 6 .|. . .2 . .|8 . .|3 . .-----+-----+-----. . 1|. . 2|. . 8. . .|. 3 .|. 1 .4 . .|. . 1|5 6 .`

Seems quite hard, maybe ravel can tell me how it compares to the other hard ones. If this isn't hard enough, I'm quite sure you can find a harder diagonal puzzle from the grid as this was my first (manual) try.

RW

This grid has quite many puzzles with a low number of clues (#17s: 29; #18s: 748 minimals found; #19s: >60000; #20s: >570000 minimals).
Of the puzzles with few clues in this grid, a quite high persentage solves easily. My (not so advanced) logical solver could solve roughly 98% of the 18s, 97.5% of the 19s, and 96% of the 20s. Strangely enough the rate of 'unsolvables' increases as the number of clues goes up.

Anyway, still got >20000 puzzles that were not solved. It's impossible to know which puzzles that are really hard (if any). Picked two of the 20s that stopped early:
Code: Select all
` *-----------* |5..|...|4..| |.4.|71.|..3| |...|.2.|...| |---+---+---| |...|..4|62.| |97.|...|...| |.1.|...|3.9| |---+---+---| |6..|..2|...| |...|.3.|.14| |...|...|...| *-----------*`
Code: Select all
` *-----------* |...|.8.|4..| |..9|.1.|25.| |...|...|..6| |---+---+---| |...|..4|6..| |97.|...|.8.| |.1.|..7|...| |---+---+---| |6..|..2|...| |7..|.3.|.1.| |4..|...|...| *-----------*`

Code: Select all
` *-----------* |..2|3..|.7.| |..9|...|..3| |...|.25|...| |---+---+---| |.58|..4|..7| |.7.|...|1..| |21.|..7|...| |---+---+---| |...|.4.|...| |..5|.38|.14| |4..|9.1|.6.| *-----------*`

If this last should show signs of hardness, there is still room for improvement, as there are 20000 more starters to try...

Here are also two prospective minimal 20s in the same grid:
500000400000710000000020000000004600070200000010000300090000030720030010480000500
500300070000710200030400000000000600070003005010000040600002008000000010400000500
Ocean

Posts: 442
Joined: 29 August 2005

Ocean wrote:Anyway, still got >20000 puzzles that were not solved. It's impossible to know which puzzles that are really hard (if any).

try this with my solver
Code: Select all
`sudoku -e 'R>=95000' -f'"%r" %v' puzzles.dat`

the 90000-99999(max) rating range is exponential, larger => harder
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Code: Select all
` *-----------* |..2|3..|.7.| |..9|...|..3| |...|.25|...| |---+---+---| |.58|..4|..7| |.7.|...|1..| |21.|..7|...| |---+---+---| |...|.4.|...| |..5|.38|.14| |4..|9.1|.6.| *-----------*`

If this last should show signs of hardness, there is still room for improvement, as there are 20000 more starters to try...

I've just put it through my solver........ It is difficult. Actually it is VERY difficult. it was on the outer rim of logical algorithms my solver employs to solve....

I fthere is a room for improvemnt, then my solver will not be that happy

The hard puzzles that ravel collects are puzzles that forces my solver to guess. That is the level of difficulty you were so close of achieving.

tarek

tarek

Posts: 2908
Joined: 05 January 2006

Ocean,
none of the 5 puzzles needed 4 steps (for 4 of them the program stopped with 2 steps, for the 3rd one with 3). Can you use gsf's solver to extract the hardest 20 out of your 20000 ?
ravel

Posts: 998
Joined: 21 February 2006

Interesting analysis on the intertwining clues - This may be relevant to the maximum number of clues problem - in deciding which clues to leave out initially. [I leave out 17 of the 18 clues in the unavoidable set initially]

Code: Select all
`+---+---+---+|562|389|471||849|716|253||137|425|896|+---+---+---+|358|194|627||974|263|185||216|857|349|+---+---+---+|691|542|738||725|638|914||483|971|562|+---+---+---+ SF grid`

To clarify any confusion the SF grid that we are perhaps more familiar is this one [relabelled/boxrow swapped]
Code: Select all
`+---+---+---+|639|241|785||284|765|193||517|983|624|+---+---+---+|123|857|946||796|432|851||458|619|237|+---+---+---+|342|178|569||861|594|372||975|326|418|+---+---+---+ morphed SF grid`

I am glad that some are analysing a bit furthur.

It has 29 17-clue sudoku puzzles
It would appear to have more unavoidable sets than most grids
It would be logical to assume it had a higher total number of minimal puzzles in it - [This is difficult to prove as there are an incredible number of these in every grid] [It is difficult to generate a duplicate puzzle !]
Code: Select all
`from 1000000 generated grids from the canon grid puzzle size   puzzles in sample   duplicated puzzles in sample   28              50796                               0   29               6787                               0   30                455                               0   31                 16                               0Which means at the 28 clue level there may be at least 10^9 puzzles `

It also has several 34-clue minimals ! [which are difficult to solve]

It will be interesting to see those hard 20s from it

Condor looked at / enumerated templates for the number clues - which might be relevant in constructing a grid to specification ! - Certainly relevant to this thread -

The SFB grid [no 4 set unavoidables] may also have this intertwining feature - maybe more so ?
Code: Select all
`589732614621854379743916825835129746417568293296347158968471532372695481154283967`
coloin

Posts: 1749
Joined: 05 May 2005

gsf wrote:try this with my solver
Code: Select all
` sudoku -e 'R>=95000' -f'"%r" %v' puzzles.dat `

the 90000-99999(max) rating range is exponential, larger => harder

Thanks. I first tried this on the hard puzzles from ravel's list, to have a reference:

Two 7-steppers and one 10-stepper score highest on the gsf-rating (96641, 96327, 96088).
The the top of the list (10-stepper) has gsf-rating 95011.
About 70 with gsf-rating above 95000.
Rest of the list rates between 20000 and 95000, rather evenly distributed.

The rating is most stable for very high scores and very low scores (far below those on the hard list), while in between it can show peculiar behavior: Scrambled versions of the same puzzle may score very different. The implication of this is that in the intermediate range (say from 20000 to 90000) the rating is less reliable.

tarek wrote:I've just put it through my solver........ It is difficult. Actually it is VERY difficult. it was on the outer rim of logical algorithms my solver employs to solve....

If there is a room for improvemnt, then my solver will not be that happy

The hard puzzles that ravel collects are puzzles that forces my solver to guess. That is the level of difficulty you were so close of achieving.

Thanks for the evaluation! Not sure yet what comes out of the sf-grid in the end. But would be interesting if you tried to solve some of the puzzles with high gsf-rating at the end of this post.

ravel wrote:Ocean,
none of the 5 puzzles needed 4 steps (for 4 of them the program stopped with 2 steps, for the 3rd one with 3). Can you use gsf's solver to extract the hardest 20 out of your 20000 ?

Thanks again. I am not able to do the gsf-rating on all the 20000 unsolved puzzles from the sf-grid at the moment (cannot do effective file transfer during the weekend). Using cut/paste I could transfer 5% of them for a preliminary analysis.

From a small sample set (about 5%) these are the five top scores:
Code: Select all
`#85878, 56...94......1.2...3............46...7.....8....8.....6.15....8....3..1.4...7....85138, 5.23..4....9.1...3....2.........46...7.....8..1...7...6............3..1.48....5..76180, 56....4.....71...3....2..9......46...7.......21.......69...2....2..3..1.4........67188, 5.2.8.4.....71...3......8.......46...7.26.....1.......6...42..........1.4..9.....63612, 562......8..71...3...4..........46...7......5.1....3.96....2.......3..1.4........#`

The rating is not stable as scrambled puzzles score differently (typically lower scores for these who sort first).

These two puzzles are reworked from the top rated, and according to my solver they have to be more difficult. But they rate lower on the gsf-rating:
Code: Select all
`57488, 5....94...4..1.2...3....8.......46...74....8...68.....691.42.......3..1...3.7....51521, 56...9.......1.2...3............46...7.....8....8...4.6.15....8....3..144...7....`
Will look at the rest of the puzzles when I'm able to transfer the whole dataset.

Here are (maybe) some more prospective candidates for the 'hard list': The highest gsf-ratings of the full-symmetric 20s. Their ratings compare as #3, #22, #32, #35, #35, etc. among the 'hard ones' (ravel's list).
Code: Select all
`#96128, ..1...2...3..4..5.2.......6...1.5....7.....8....9.3...6.......1.5..3..7...2...9..95259, ..1...2...2..3..1.4.......5...4.6....7.....8....9.5...5.......9.8..1..7...3...6..95177, ..........1..2..3...34.56....4...5...7.....8...6...4....56.89...8..7..1..........95157, ..1...2...2..3..4.5.......6...5.6....1.....7....8.9...8.......5.7..4..1...4...3..95153, ..1...2...2..3..4.5.......6...2.6....7.....8....3.5...6.......1.3..9..7...9...5..95153, ..1...2...2..3..4.5.......6...2.5....7.....8....3.6...6.......1.3..9..7...9...5..95142, ..........1.....2...34.56....4.1.5.....7.8.....6.9.7....53.64...9.....8..........95136, ..1...2...2..3..4.5.......6...4.7....8.....7....6.5...6.......1.4..2..8...9...5..95136, ..1...2...2..3..4.5.......6...4.7....8.....7....5.6...6.......1.4..2..8...9...5..95083, ..1...2...3..4..5.6.......7...1.2....8.....9....3.8...9.......1.7..5..8...2...6..95077, ..1...2...2..3..4.5.......6...4.7....8.....7....6.5...9.......5.3..9..8...6...1..95077, ..1...2...2..3..4.5.......6...4.7....8.....7....5.6...9.......5.3..9..8...6...1..95060, ..1...2...3..4..5.6.......7...1.2....8.....9....5.9...2.......8.9..3..6...7...1..95054, ..1...2...3..4..5.6.......7...1.3....8.....9....5.7...7.......1.9..5..4...2...8..95051, ..1...2...2..3..4.5.......6...1.7....5.....7....6.2...6.......1.7..8..3...4...9..95030, ..1...2...2..3..4.5.......6...1.5....7.....8....2.6...6.......1.8..4..3...9...7..95022, ..1...2...2..3..4.5.......6...1.7....4.....8....6.2...6.......1.3..9..7...8...5..95022, ..1...2...2..3..4.5.......6...1.2....4.....7....6.8...6.......1.3..9..8...7...5..95009, ..1...2...3..4..5.6.......7...1.3....8.....3....5.6...7.......1.5..9..8...2...6..95008, ..1...2...2..3..4.5.......6...4.7....7.....8....6.5...6.......1.8..1..7...9...5..95008, ..1...2...2..3..4.5.......6...4.7....7.....8....5.6...6.......1.8..1..7...9...5..95006, ..1...2...2..3..1.4.......5...4.6....7.....8....9.5...3.......4.8..1..7...2...9..95001, ..1...2...2..3..4.5.......6...4.7....4.....8....6.5...6.......1.8..2..3...9...5..95001, ..1...2...2..3..4.5.......6...4.7....4.....8....5.6...6.......1.8..2..3...9...5..#`
Ocean

Posts: 442
Joined: 29 August 2005

Ocean wrote:The rating is most stable for very high scores and very low scores (far below those on the hard list), while in between it can show peculiar behavior: Scrambled versions of the same puzzle may score very different. The implication of this is that in the intermediate range (say from 20000 to 90000) the rating is less reliable.

in theory the -B option (batch moves) should wash the scrambling affect
I don't have a scrambler yet -- that would be a good sanity check
so try -B on a couple of the equivalent puzzles with the solver posted today
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

If you look at the top hardest 3 from Ravels list, the sudoku from colins and one trivial sudoku I selected you get the following statistics on how entwined:

Code: Select all
`Ravels rating:   top1  top2  top3 coloin trivial ran2fully entwinded   20    24    29    28     22     25 4-permuteable    13     7     6     8      9     10 8-permuteable     2     2     1     0      5      116-permuteable     1     3     0     0      0      0`

The entwinded values of top1 and coloin is extracted from RW's post. I have analyzed top2 and top3 manually, and they have the solution grids:

914378526658912734327564981289137465463295178571846392846753219792481653135629847
648297531379451682251368497562184973793526814814739256925613748436875129187942365

The trivial sudoku is just an easy sudoku I grapped with this solution:
638971524429835671751462893962743185573128946184659237345286719896317452217594368

Then I have analyzed the random solution "Ran2" from coloin.

Unfortunately, I have not seen any signs yet, that a high number of fully entwinded pairs should be a better start grid for producing hard sudokus.

It should be nice to know how some random sulution grids perform on this scale of entwinded pairs. If they are very different, this metod could be helpfull in a pre-evaluation of possible solution grids. Can anyone produce a few true random solution grids?

The top3 has an even highed count of fully entwinded. How does it perform on 17-clue puzzles?

/Viggo
Last edited by Viggo on Sun Jun 04, 2006 8:20 am, edited 2 times in total.
Viggo

Posts: 60
Joined: 21 April 2006

Ocean and I did a small study on up to 10 randomly constructed/chosen grids from here

Code: Select all
`Ran1 - 123749568456183927789625431238567149561498273947231685874952316615374892392816754Ran2 - 123849567456217839789356124642598713971463285538172946397681452865724391214935678Ran3 - 146328975785196342329457861493561728218973654657284139564839217832715496971642583Ran4 - 347981256582476193169523874896245317754318962213769485925137648478692531631854729Ran5 - 325681974641379258789425613134968725962537841857142396416293587273854169598716432Ran6 - 416235987275689431983147625391728546847516392562394718159873264624951873738462159Ran7 - 856923714731456298492817356348291675925764183167385429273548961684179532519632847Ran8 - 861953274395274618427681593249817356136425789578396421612749835753168942984532167Ran9 - 318924576596173824274586319785631942143259687962847153431768295659412738827395461Ran10- 241589763675143982839267514467832159593614278182975346718326495926451837354798621`

I have run clue counts on all of them
and tried to establish the minimum number of clues for most
coloin

Posts: 1749
Joined: 05 May 2005

Ocean, thanks for the interesting analysis. It also shows, how hard it is to define and evaluate super hard puzzles.
From the last list with 24 puzzles, numbers 2 (4 steps) and 19 (5 steps) are for my list.
ravel

Posts: 998
Joined: 21 February 2006

ravel wrote:Ocean, thanks for the interesting analysis. It also shows, how hard it is to define and evaluate super hard puzzles.
From the last list with 24 puzzles, numbers 2 (4 steps) and 19 (5 steps) are for my list.

these puzzles required guessing in my solver
the ratings for puzzles requiring guesses takes into account the distribution
of backdoors over the number of cells with minimum number of candidates
at the point where guessing was required, the idea being, most backtrack
solvers will pick the cells with least candidates to guess on

if there are few backdoors or if the backdoors are in cells with larger
#candidates then the rating is higher because the first guesses will miss
the backdoors -- this is a predictive measure on tree search behavior so its
not 100%, and it doesn't take into account lucky or smarter guesses

here's the puzzle# from above with rating, #backdoors and actual #guesses
with batching (-B) on -- this shows that batching doesn't help for puzzles
requiring guessing
Code: Select all
`# sudoku -B -f'%2n %6(rating)x %(magic)x %2(guess)x' r13.dat | sort --stable -n -k2,2r -k1,1 1  96128 1 21 2  95258 3 28 3  95165 1  6 4  95157 4 28 5  95150 1  6 6  95150 1  6 8  95076 3 11 9  95076 3 1110  95068 1  411  95062 1  712  95062 1  713  95057 1  414  95051 4 1415  95048 1  416  95016 1  519  95006 2 1120  95005 2  821  95005 2  822  95004 1  717  92351 1  318  92351 1  323  83124 2  724  83124 2  7 7  81052 2  1`

so it does a fair job on puzzles requiring many guesses but not so well
on those requiring few, although overall a low #backdoors is probably
a good first estimate on difficulty
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Thank you Viggo for analyzing the hardest puzzles, I was about to do it this morning but you had saved me the time!

Viggo wrote:Unfortunately, I have not seen any signs yet, that a high number of fully entwinded pairs should be a better start grid for producing hard sudokus.

Neither have I... and as the SF grid also has several minimal 34s, it doesn't add up with the theory that low numbers of fully entwined pairs would be more likely to give high clue minimals...

Your random picked puzzle has 22 fully entwined pairs, and none of the puzzles you mentioned have less than 20, It would be interesting to see some statistics over a larger sample that would tell us what is normal.

I did another manual try to construct a grid with few fully entwined pairs:

Code: Select all
` *-----------* |123|469|785| |456|187|932| |789|253|164| |---+---+---| |371|694|528| |698|512|347| |542|378|691| |---+---+---| |215|946|873| |864|735|219| |937|821|456| *-----------*2-permutable   94-permutable   178-permutable   1016-permutable  0`

Does this show any unusual characteristics in your analyzers? I think you could get the number of fully entwined pairs down to zero, but that might require some heavier computing.

Ocean wrote:This grid has quite many puzzles with a low number of clues (#17s: 29; #18s: 748 minimals found; #19s: >60000; #20s: >570000 minimals). Of the puzzles with few clues in this grid, a quite high persentage solves easily.

That's expected, examining ravel's list shows that most of the known extremely hard puzzles have 23-25 clues. I think the "magic ingredient" of the toughest puzzles is some kind of complexity that rarely can be described by few clues (so far no example of extremely hard 17s or 18s).

RW
RW
2010 Supporter

Posts: 1000
Joined: 16 March 2006

I have added the results on how entwinded one of the random solution grids are in my post above. How does the top no. 3 perform on 17-clue puzzles?

/Viggo
Viggo

Posts: 60
Joined: 21 April 2006

coloin wrote:The SFB grid [no 4 set unavoidables] may also have this intertwining feature - maybe more so ?

Code: Select all
`589732614621854379743916825835129746417568293296347158968471532372695481154283967 `

We have a winner! In this grid ALL pairs are fully entwined. Question still remains if it is possible to construct a grid with NO such pairs.

Another question that can be asked is 'What is the maximum number of 2-digit unavoidable sets in a grid'. The minimum is 36 (as in the SFB grid), the theoretical maximum would be 36*4=144 (all pairs 16-permutable), but that should be impossible. The puzzle I made with few entwined pairs has 9*1+17*2+10*3=73, which is by far the greatest number in any puzzle mentioned so far in this thread, just over half of the theoretical maximum.

RW
RW
2010 Supporter

Posts: 1000
Joined: 16 March 2006

PreviousNext