The New Sudoku Players' Forum

by **gsf** » Sat Jun 03, 2006 1:48 am

daj95376 wrote:Is there a typo where RW wrote r3c4 instead of r2c4 ???

ha -- I glossed over the typo, then forgot about it, and then quoted it for posterity

by **Ocean** » Sat Jun 03, 2006 11:40 am

RW wrote:Apparently it then might make a difference and this could provide an approach to finding grids with low/high clue puzzles. First construct grids with high/low numbers of fully entwined pairs, then scan those grids for puzzles. I'm also wondering if the entwined grid would make it more likely for extremely difficult puzzles to appear. I tried applying the diagonal pattern to the grid you posted and this is how close I got:

Code: Select all
5 . 2|. . 9|. . . . . .|. 1 .|. 5 . 1 . .|4 . .|8 . . -----+-----+----- . . 8|. . 4|. . 7 . 7 .|. 6 .|. . . 2 . .|8 . .|3 . . -----+-----+----- . . 1|. . 2|. . 8 . . .|. 3 .|. 1 . 4 . .|. . 1|5 6 .

Seems quite hard, maybe ravel can tell me how it compares to the other hard ones. If this isn't hard enough, I'm quite sure you can find a harder diagonal puzzle from the grid as this was my first (manual) try.

RW

This grid has quite many puzzles with a low number of clues (#17s: 29; #18s: 748 minimals found; #19s: >60000; #20s: >570000 minimals).
Of the puzzles with few clues in this grid, a quite high persentage solves easily. My (not so advanced) logical solver could solve roughly 98% of the 18s, 97.5% of the 19s, and 96% of the 20s. Strangely enough the rate of 'unsolvables' increases as the number of clues goes up.

Anyway, still got >20000 puzzles that were not solved. It's impossible to know which puzzles that are really hard (if any). Picked two of the 20s that stopped early:

Code: Select all: *-----------* |5..|...|4..| |.4.|71.|..3| |...|.2.|...| |---+---+---| |...|..4|62.| |97.|...|...| |.1.|...|3.9| |---+---+---| |6..|..2|...| |...|.3.|.14| |...|...|...| *-----------*

Code: Select all: *-----------* |...|.8.|4..| |..9|.1.|25.| |...|...|..6| |---+---+---| |...|..4|6..| |97.|...|.8.| |.1.|..7|...| |---+---+---| |6..|..2|...| |7..|.3.|.1.| |4..|...|...| *-----------*

I am also curious about this one, created by manually picking an unsolvable and gradually making it harder.

Code: Select all: *-----------* |..2|3..|.7.| |..9|...|..3| |...|.25|...| |---+---+---| |.58|..4|..7| |.7.|...|1..| |21.|..7|...| |---+---+---| |...|.4.|...| |..5|.38|.14| |4..|9.1|.6.| *-----------*

If this last should show signs of hardness, there is still room for improvement, as there are 20000 more starters to try...

Here are also two prospective minimal 20s in the same grid:
500000400000710000000020000000004600070200000010000300090000030720030010480000500
500300070000710200030400000000000600070003005010000040600002008000000010400000500

by **gsf** » Sat Jun 03, 2006 12:37 pm

Ocean wrote:Anyway, still got >20000 puzzles that were not solved. It's impossible to know which puzzles that are really hard (if any).

try this with my solver

Code: Select all: sudoku -e 'R>=95000' -f'"%r" %v' puzzles.dat

the 90000-99999(max) rating range is exponential, larger => harder

by **tarek** » Sat Jun 03, 2006 1:51 pm

Ocean wrote:I am also curious about this one, created by manually picking an unsolvable and gradually making it harder.
Code: Select all
*-----------* |..2|3..|.7.| |..9|...|..3| |...|.25|...| |---+---+---| |.58|..4|..7| |.7.|...|1..| |21.|..7|...| |---+---+---| |...|.4.|...| |..5|.38|.14| |4..|9.1|.6.| *-----------*

If this last should show signs of hardness, there is still room for improvement, as there are 20000 more starters to try...

I've just put it through my solver........ It is difficult. Actually it is VERY difficult. it was on the outer rim of logical algorithms my solver employs to solve....

I fthere is a room for improvemnt, then my solver will not be that happy

The hard puzzles that ravel collects are puzzles that forces my solver to guess. That is the level of difficulty you were so close of achieving.

tarek

by **ravel** » Sat Jun 03, 2006 1:51 pm

Ocean,
none of the 5 puzzles needed 4 steps (for 4 of them the program stopped with 2 steps, for the 3rd one with 3). Can you use gsf's solver to extract the hardest 20 out of your 20000 ?

by **coloin** » Sat Jun 03, 2006 2:02 pm

Interesting analysis on the intertwining clues - This may be relevant to the maximum number of clues problem - in deciding which clues to leave out initially. [I leave out 17 of the 18 clues in the unavoidable set initially]

The grid clipped from the pseudopuzzles thread was this one

Code: Select all: +---+---+---+ |562|389|471| |849|716|253| |137|425|896| +---+---+---+ |358|194|627| |974|263|185| |216|857|349| +---+---+---+ |691|542|738| |725|638|914| |483|971|562| +---+---+---+ SF grid

To clarify any confusion the SF grid that we are perhaps more familiar is this one [relabelled/boxrow swapped]

Code: Select all: +---+---+---+ |639|241|785| |284|765|193| |517|983|624| +---+---+---+ |123|857|946| |796|432|851| |458|619|237| +---+---+---+ |342|178|569| |861|594|372| |975|326|418| +---+---+---+ morphed SF grid

I am glad that some are analysing a bit furthur.

It has 29 17-clue sudoku puzzles
It would appear to have more unavoidable sets than most grids
It would be logical to assume it had a higher total number of minimal puzzles in it - [This is difficult to prove as there are an incredible number of these in every grid] [It is difficult to generate a duplicate puzzle !]

Code: Select all: from 1000000 generated grids from the canon grid puzzle size puzzles in sample duplicated puzzles in sample 28 50796 0 29 6787 0 30 455 0 31 16 0 Which means at the 28 clue level there may be at least 10^9 puzzles

It also has several 34-clue minimals ! [which are difficult to solve]

It will be interesting to see those hard 20s from it

Condor looked at / enumerated templates for the number clues - which might be relevant in constructing a grid to specification ! - Certainly relevant to this thread -

The SFB grid [no 4 set unavoidables] may also have this intertwining feature - maybe more so ?

Code: Select all: 589732614621854379743916825835129746417568293296347158968471532372695481154283967

by **Ocean** » Sat Jun 03, 2006 10:28 pm

gsf wrote:try this with my solver
Code: Select all
sudoku -e 'R>=95000' -f'"%r" %v' puzzles.dat

the 90000-99999(max) rating range is exponential, larger => harder

Thanks. I first tried this on the hard puzzles from ravel's list, to have a reference:

Two 7-steppers and one 10-stepper score highest on the gsf-rating (96641, 96327, 96088).
The the top of the list (10-stepper) has gsf-rating 95011.
About 70 with gsf-rating above 95000.
Rest of the list rates between 20000 and 95000, rather evenly distributed.

The rating is most stable for very high scores and very low scores (far below those on the hard list), while in between it can show peculiar behavior: Scrambled versions of the same puzzle may score very different. The implication of this is that in the intermediate range (say from 20000 to 90000) the rating is less reliable.

tarek wrote:I've just put it through my solver........ It is difficult. Actually it is VERY difficult. it was on the outer rim of logical algorithms my solver employs to solve....

If there is a room for improvemnt, then my solver will not be that happy

The hard puzzles that ravel collects are puzzles that forces my solver to guess. That is the level of difficulty you were so close of achieving.

Thanks for the evaluation! Not sure yet what comes out of the sf-grid in the end. But would be interesting if you tried to solve some of the puzzles with high gsf-rating at the end of this post.

ravel wrote:Ocean,
none of the 5 puzzles needed 4 steps (for 4 of them the program stopped with 2 steps, for the 3rd one with 3). Can you use gsf's solver to extract the hardest 20 out of your 20000 ?

Thanks again. I am not able to do the gsf-rating on all the 20000 unsolved puzzles from the sf-grid at the moment (cannot do effective file transfer during the weekend). Using cut/paste I could transfer 5% of them for a preliminary analysis.

From a small sample set (about 5%) these are the five top scores:

Code: Select all: # 85878, 56...94......1.2...3............46...7.....8....8.....6.15....8....3..1.4...7.... 85138, 5.23..4....9.1...3....2.........46...7.....8..1...7...6............3..1.48....5.. 76180, 56....4.....71...3....2..9......46...7.......21.......69...2....2..3..1.4........ 67188, 5.2.8.4.....71...3......8.......46...7.26.....1.......6...42..........1.4..9..... 63612, 562......8..71...3...4..........46...7......5.1....3.96....2.......3..1.4........ #

The rating is not stable as scrambled puzzles score differently (typically lower scores for these who sort first).

These two puzzles are reworked from the top rated, and according to my solver they have to be more difficult. But they rate lower on the gsf-rating:

Code: Select all: 57488, 5....94...4..1.2...3....8.......46...74....8...68.....691.42.......3..1...3.7.... 51521, 56...9.......1.2...3............46...7.....8....8...4.6.15....8....3..144...7....

Will look at the rest of the puzzles when I'm able to transfer the whole dataset.

Here are (maybe) some more prospective candidates for the 'hard list': The highest gsf-ratings of the full-symmetric 20s. Their ratings compare as #3, #22, #32, #35, #35, etc. among the 'hard ones' (ravel's list).

Code: Select all: # 96128, ..1...2...3..4..5.2.......6...1.5....7.....8....9.3...6.......1.5..3..7...2...9.. 95259, ..1...2...2..3..1.4.......5...4.6....7.....8....9.5...5.......9.8..1..7...3...6.. 95177, ..........1..2..3...34.56....4...5...7.....8...6...4....56.89...8..7..1.......... 95157, ..1...2...2..3..4.5.......6...5.6....1.....7....8.9...8.......5.7..4..1...4...3.. 95153, ..1...2...2..3..4.5.......6...2.6....7.....8....3.5...6.......1.3..9..7...9...5.. 95153, ..1...2...2..3..4.5.......6...2.5....7.....8....3.6...6.......1.3..9..7...9...5.. 95142, ..........1.....2...34.56....4.1.5.....7.8.....6.9.7....53.64...9.....8.......... 95136, ..1...2...2..3..4.5.......6...4.7....8.....7....6.5...6.......1.4..2..8...9...5.. 95136, ..1...2...2..3..4.5.......6...4.7....8.....7....5.6...6.......1.4..2..8...9...5.. 95083, ..1...2...3..4..5.6.......7...1.2....8.....9....3.8...9.......1.7..5..8...2...6.. 95077, ..1...2...2..3..4.5.......6...4.7....8.....7....6.5...9.......5.3..9..8...6...1.. 95077, ..1...2...2..3..4.5.......6...4.7....8.....7....5.6...9.......5.3..9..8...6...1.. 95060, ..1...2...3..4..5.6.......7...1.2....8.....9....5.9...2.......8.9..3..6...7...1.. 95054, ..1...2...3..4..5.6.......7...1.3....8.....9....5.7...7.......1.9..5..4...2...8.. 95051, ..1...2...2..3..4.5.......6...1.7....5.....7....6.2...6.......1.7..8..3...4...9.. 95030, ..1...2...2..3..4.5.......6...1.5....7.....8....2.6...6.......1.8..4..3...9...7.. 95022, ..1...2...2..3..4.5.......6...1.7....4.....8....6.2...6.......1.3..9..7...8...5.. 95022, ..1...2...2..3..4.5.......6...1.2....4.....7....6.8...6.......1.3..9..8...7...5.. 95009, ..1...2...3..4..5.6.......7...1.3....8.....3....5.6...7.......1.5..9..8...2...6.. 95008, ..1...2...2..3..4.5.......6...4.7....7.....8....6.5...6.......1.8..1..7...9...5.. 95008, ..1...2...2..3..4.5.......6...4.7....7.....8....5.6...6.......1.8..1..7...9...5.. 95006, ..1...2...2..3..1.4.......5...4.6....7.....8....9.5...3.......4.8..1..7...2...9.. 95001, ..1...2...2..3..4.5.......6...4.7....4.....8....6.5...6.......1.8..2..3...9...5.. 95001, ..1...2...2..3..4.5.......6...4.7....4.....8....5.6...6.......1.8..2..3...9...5.. #

by **gsf** » Sun Jun 04, 2006 12:54 am

Ocean wrote:The rating is most stable for very high scores and very low scores (far below those on the hard list), while in between it can show peculiar behavior: Scrambled versions of the same puzzle may score very different. The implication of this is that in the intermediate range (say from 20000 to 90000) the rating is less reliable.

in theory the -B option (batch moves) should wash the scrambling affect
I don't have a scrambler yet -- that would be a good sanity check
so try -B on a couple of the equivalent puzzles with the solver posted today

by **Viggo** » Sun Jun 04, 2006 6:03 am

If you look at the top hardest 3 from Ravels list, the sudoku from colins and one trivial sudoku I selected you get the following statistics on how entwined:

Code: Select all: Ravels rating: top1 top2 top3 coloin trivial ran2 fully entwinded 20 24 29 28 22 25 4-permuteable 13 7 6 8 9 10 8-permuteable 2 2 1 0 5 1 16-permuteable 1 3 0 0 0 0

The entwinded values of top1 and coloin is extracted from RW's post. I have analyzed top2 and top3 manually, and they have the solution grids:

914378526658912734327564981289137465463295178571846392846753219792481653135629847
648297531379451682251368497562184973793526814814739256925613748436875129187942365

The trivial sudoku is just an easy sudoku I grapped with this solution:
638971524429835671751462893962743185573128946184659237345286719896317452217594368

Then I have analyzed the random solution "Ran2" from coloin.

Unfortunately, I have not seen any signs yet, that a high number of fully entwinded pairs should be a better start grid for producing hard sudokus.

It should be nice to know how some random sulution grids perform on this scale of entwinded pairs. If they are very different, this metod could be helpfull in a pre-evaluation of possible solution grids. Can anyone produce a few true random solution grids?

The top3 has an even highed count of fully entwinded. How does it perform on 17-clue puzzles?

/Viggo

by **coloin** » Sun Jun 04, 2006 7:48 am

Ocean and I did a small study on up to 10 randomly constructed/chosen grids from here

Code: Select all: Ran1 - 123749568456183927789625431238567149561498273947231685874952316615374892392816754 Ran2 - 123849567456217839789356124642598713971463285538172946397681452865724391214935678 Ran3 - 146328975785196342329457861493561728218973654657284139564839217832715496971642583 Ran4 - 347981256582476193169523874896245317754318962213769485925137648478692531631854729 Ran5 - 325681974641379258789425613134968725962537841857142396416293587273854169598716432 Ran6 - 416235987275689431983147625391728546847516392562394718159873264624951873738462159 Ran7 - 856923714731456298492817356348291675925764183167385429273548961684179532519632847 Ran8 - 861953274395274618427681593249817356136425789578396421612749835753168942984532167 Ran9 - 318924576596173824274586319785631942143259687962847153431768295659412738827395461 Ran10- 241589763675143982839267514467832159593614278182975346718326495926451837354798621

I have run clue counts on all of them
and tried to establish the minimum number of clues for most

by **ravel** » Sun Jun 04, 2006 10:40 am

Ocean, thanks for the interesting analysis. It also shows, how hard it is to define and evaluate super hard puzzles.
From the last list with 24 puzzles, numbers 2 (4 steps) and 19 (5 steps) are for my list.

by **gsf** » Sun Jun 04, 2006 11:34 am

ravel wrote:Ocean, thanks for the interesting analysis. It also shows, how hard it is to define and evaluate super hard puzzles.
From the last list with 24 puzzles, numbers 2 (4 steps) and 19 (5 steps) are for my list.

these puzzles required guessing in my solver
the ratings for puzzles requiring guesses takes into account the distribution
of backdoors over the number of cells with minimum number of candidates
at the point where guessing was required, the idea being, most backtrack
solvers will pick the cells with least candidates to guess on

if there are few backdoors or if the backdoors are in cells with larger
#candidates then the rating is higher because the first guesses will miss
the backdoors -- this is a predictive measure on tree search behavior so its
not 100%, and it doesn't take into account lucky or smarter guesses

here's the puzzle# from above with rating, #backdoors and actual #guesses
with batching (-B) on -- this shows that batching doesn't help for puzzles
requiring guessing

Code: Select all: # sudoku -B -f'%2n %6(rating)x %(magic)x %2(guess)x' r13.dat | sort --stable -n -k2,2r -k1,1 1 96128 1 21 2 95258 3 28 3 95165 1 6 4 95157 4 28 5 95150 1 6 6 95150 1 6 8 95076 3 11 9 95076 3 11 10 95068 1 4 11 95062 1 7 12 95062 1 7 13 95057 1 4 14 95051 4 14 15 95048 1 4 16 95016 1 5 19 95006 2 11 20 95005 2 8 21 95005 2 8 22 95004 1 7 17 92351 1 3 18 92351 1 3 23 83124 2 7 24 83124 2 7 7 81052 2 1

so it does a fair job on puzzles requiring many guesses but not so well
on those requiring few, although overall a low #backdoors is probably
a good first estimate on difficulty

by RW » Sun Jun 04, 2006 12:17 pm

Thank you Viggo for analyzing the hardest puzzles, I was about to do it this morning but you had saved me the time!

Viggo wrote:Unfortunately, I have not seen any signs yet, that a high number of fully entwinded pairs should be a better start grid for producing hard sudokus.

Neither have I... and as the SF grid also has several minimal 34s, it doesn't add up with the theory that low numbers of fully entwined pairs would be more likely to give high clue minimals...

Your random picked puzzle has 22 fully entwined pairs, and none of the puzzles you mentioned have less than 20, It would be interesting to see some statistics over a larger sample that would tell us what is normal.

I did another manual try to construct a grid with few fully entwined pairs:

Code: Select all: *-----------* |123|469|785| |456|187|932| |789|253|164| |---+---+---| |371|694|528| |698|512|347| |542|378|691| |---+---+---| |215|946|873| |864|735|219| |937|821|456| *-----------* 2-permutable 9 4-permutable 17 8-permutable 10 16-permutable 0

Does this show any unusual characteristics in your analyzers? I think you could get the number of fully entwined pairs down to zero, but that might require some heavier computing.

Ocean wrote:This grid has quite many puzzles with a low number of clues (#17s: 29; #18s: 748 minimals found; #19s: >60000; #20s: >570000 minimals). Of the puzzles with few clues in this grid, a quite high persentage solves easily.

That's expected, examining ravel's list shows that most of the known extremely hard puzzles have 23-25 clues. I think the "magic ingredient" of the toughest puzzles is some kind of complexity that rarely can be described by few clues (so far no example of extremely hard 17s or 18s).

RW

by **Viggo** » Sun Jun 04, 2006 12:33 pm

I have added the results on how entwinded one of the random solution grids are in my post above. How does the top no. 3 perform on 17-clue puzzles?

/Viggo

by RW » Sun Jun 04, 2006 1:51 pm

coloin wrote:The SFB grid [no 4 set unavoidables] may also have this intertwining feature - maybe more so ?

Code: Select all
589732614621854379743916825835129746417568293296347158968471532372695481154283967

We have a winner! In this grid ALL pairs are fully entwined. Question still remains if it is possible to construct a grid with NO such pairs.

Another question that can be asked is 'What is the maximum number of 2-digit unavoidable sets in a grid'. The minimum is 36 (as in the SFB grid), the theoretical maximum would be 36*4=144 (all pairs 16-permutable), but that should be impossible. The puzzle I made with few entwined pairs has 9*1+17*2+10*3=73, which is by far the greatest number in any puzzle mentioned so far in this thread, just over half of the theoretical maximum.

RW

The New Sudoku Players' Forum

Structures of the solution grid