The hardest sudokus (new thread)

Everything about Sudoku that doesn't fit in one of the other sections

Re: The hardest sudokus (new thread)

Postby mith » Mon Aug 21, 2023 11:35 pm

This is the current list of 898 puzzles which have singles expanded forums with T&E depth 3 and SER 11.8+:

google doc
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 22, 2023 12:04 am

Out of curiosity, I started rating these in the gsf minlex form. Already three with SER 11.6 in this form (vs. 11.8 for the gsf minlex expanded form). I wouldn't expect any to rate below 11.6 (which would confirm it as a good choice for cutoff) but we'll see.

The small deviation in the rating of different morphs in the high ratings belongs to the second category. This is something I already often had to write. In the low ratings, the search of a contradiction is exhaustive. In the very high ratings, it is not. As a consequence, if the order of the candidates in the search is changed, the process can be stopped with a different rating. This is what happens with different morphs. Which morph will give the highest rating is unpredictable;


Given improvements in computing power, I would suggest this voluntary limit category *could* be removed and puzzles above some threshold be re-rated accordingly. (I haven't looked at the serate code enough myself to have an idea how much of a task that would be, never mind whether it's worth the effort in the first place.)
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby champagne » Tue Aug 22, 2023 7:13 am

Hi mith,

I don't know what is the average rating time for these puzzles with so many clues, but in the 19-22 area, a puzzle rated around 11.7 could require hours of computation with serate. This has been the reason to create skfr having a quick pre rating for the interactive game.

Some members of the forum having a good practice of JAVA fixed some bugs in serate, but nobody, AFAIK, tried to reach a rating morph neutral although this oddity has been source of trouble in many case in the game.

I see how attractive is the fact to reach the "highest" possible serate rating, but for me, it remains a way to select puzzles with properties making them very hard to solve with a classical set of rules.

I am highly interested in the TH property breaking this recent family, less by the back doors. I have never seen a manual solver using a back door to solve a puzzle. And so far it seems that the T&E 3 property is limited to puzzles having the TH pattern,
champagne
2017 Supporter
 
Posts: 7453
Joined: 02 August 2007
Location: France Brittany

Re: The hardest sudokus (new thread)

Postby denis_berthier » Tue Aug 22, 2023 7:46 am

mith wrote:
The small deviation in the rating of different morphs in the high ratings belongs to the second category. This is something I already often had to write. In the low ratings, the search of a contradiction is exhaustive. In the very high ratings, it is not. As a consequence, if the order of the candidates in the search is changed, the process can be stopped with a different rating. This is what happens with different morphs. Which morph will give the highest rating is unpredictable;

Given improvements in computing power, I would suggest this voluntary limit category *could* be removed and puzzles above some threshold be re-rated accordingly. (I haven't looked at the serate code enough myself to have an idea how much of a task that would be, never mind whether it's worth the effort in the first place.)


In order to avoid misleading newbies into hopeless coding efforts, let it be clear that dealing with this problem on non-exhaustive search in SER or with any bug encountered in it, will not free SER of its possible variations with morphs. I've explained the reasons in a previous post: the absence of the confluence property for the set of rules defining a fixed SER rating.
denis_berthier
2010 Supporter
 
Posts: 4190
Joined: 19 June 2007
Location: Paris

Re: The hardest sudokus (new thread)

Postby m_b_metcalf » Tue Aug 22, 2023 7:54 am

champagne wrote:I am highly interested in the TH property breaking this recent family, less by the back doors. I have never seen a manual solver using a back door to solve a puzzle.


champagne,
Good to hear from you.

You're right. I tried to submit some backdoor puzzles to the Puzzles thread and the idea flopped here and here.

Regards,

Mike
User avatar
m_b_metcalf
2017 Supporter
 
Posts: 13618
Joined: 15 May 2006
Location: Berlin

Re: The hardest sudokus (new thread)

Postby Paquita » Tue Aug 22, 2023 10:06 am

A new batch of files, rated 11.7 or 11.6, it can be viewed here https://docs.google.com/document/d/e/2PACX-1vSqkqtbXbIWzRuHtnpz3W7L9cH96v3MKfk6s0LVekBDk3WwZ0EtmzB9m6O0oXrPUg57RbcGnSC_HwsJ/pub.

And another batch with two 11.8

Code: Select all
987.........96.8......5....74...32...93.4278..........47.....923.....4...28...37. ED=11.8/1.2/1.2
987.........96.8......5....8.........94.3278..3...42..4.....3..37.....92.28...47. ED=11.8/1.2/1.2
987......6...........85....4.32..1..2.1.453......1.......4.128.....8.4.3....23.15 ED=11.7/1.2/1.2
987......6...........85....4.32..1..2.1.453.....3........4.128.....8.4.3....23.15 ED=11.7/1.2/1.2
987......6...........85....4.32.81..2.1.453......1.......4.128.....23.15......4.3 ED=11.7/1.2/1.2
987......6...........85....4.32.81..2.1.453.....3........4.128.....23.15......4.3 ED=11.7/1.2/1.2
987......6...........85....4.32..1..2.1.453......1.....4..23.51....8.43......12.8 ED=11.7/1.2/1.2
987......6...........85....4.32..1..2.1.453.....3......4..23.51....8.43......12.8 ED=11.7/1.2/1.2
987......6...........85....4.32.81..2.1.453......1.....4..23.51.....12.8......43. ED=11.7/1.2/1.2
987......6...........85....4.32.81..2.1.453.....3......4..23.51.....12.8......43. ED=11.7/1.2/1.2
98.76.5..75..498.....8.....32.......1.....6...9.6.4......4..98.....87.45....5.7.6 ED=11.7/1.2/1.2
987......6..8.......5...4..5.32...1.1.238.5.4.......3....53.84....1....2....421.3 ED=11.7/1.2/1.2
987......6..8.......5...4..5.32...1.1.238.5.4......2.....53.84....1....2....421.3 ED=11.7/1.2/1.2
987......6..8.......5...4..5.32...811.23..54.........3...53...4...1.8.2.....4213. ED=11.7/1.2/1.2
987......6..8.......5...4..5.32...811.23..54.......2.....53...4...1.8.2.....4213. ED=11.7/1.2/1.2
987......6..85......4......3.62..1..1.2.653......1.......58.63.....23.51.....12.8 ED=11.7/1.2/1.2
987......6..85......4......3.62..1..1.2.653.....3........58.63.....23.51.....12.8 ED=11.7/1.2/1.2
987......6..85......4......3.62.81..1.2.653......1.......5..63.....23.51.....12.8 ED=11.7/1.2/1.2
987......6..85......4......3.62.81..1.2.653.....3........5..63.....23.51.....12.8 ED=11.7/1.2/1.2
9876.........958.......4...76........392..78...23..6..6.....29.2.8...3.7.93....68 ED=11.7/1.2/1.2
9876.........958.......4...76........932..78...23..6..6.....29.2.8...3.7.3.....68 ED=11.7/1.2/1.2
9876.........958.......4...76........932..78...23..6..6.....39.3.8...2.7.2.....68 ED=11.7/1.2/1.2
98.76.5..7.49..6....5.84...67..9....4.8.57.9..5...........487.........6........32 ED=11.7/1.2/1.2
98.76.5..7.59.4....64.8.9..69.8.....5.........475.6.8....4.76.........9........32 ED=11.7/1.2/1.2
98.76.5..7.49..6....5.48...67..9....4.8.57.9..5...........847.........6........32 ED=11.7/1.2/1.2
987......6..8...........5..4.32...811..3..45.......2.....43.......1.8.24....5213. ED=11.7/1.2/1.2
987......6..8...........5..4.32...1.1..38.4.5......2.....43.8.....1...42....521.3 ED=11.7/1.2/1.2
987......6..8...........5..4.32...811..3..45.......2...4..5213....43.......1.8.2. ED=11.7/1.2/1.2
987......6..8...........5..4.32...1.1..38.4.5......2...4..521.3...43.8.....1....2 ED=11.7/1.2/1.2
98.76.5..7.5.......468.......84.6.75..795...6....8......4.....3....94..........21 ED=11.7/1.2/1.2
98.76.5..7.5.4......68.....4.795..6...8..6.57....8......4....3.....94..........21 ED=11.7/1.2/1.2
98.76.5..75...98.......5...43.......2.....1.....6.1......95..86...1..7.5....8791. ED=11.6/1.2/1.2
98.76.5..75...98.....8.....43.......2.....1.....6.1......95..86...1..7.5....8791. ED=11.6/1.2/1.2
98.76.5..75...98.......5...43.......2.....1.....6.1.....795..86...1..7.5....8.91. ED=11.6/1.2/1.2
98.76.5..75...98.....8.....43.......2.....1.....6.1.....795..86...1..7.5....8.91. ED=11.6/1.2/1.2
98.76.5..75...98.......5...43.......2.....1...7.6.1......95..86...1.67.5....8.91. ED=11.6/1.2/1.2
98.76.5..75...98.....8.....43.......2.....1...7.6.1......95..86...1.67.5....8.91. ED=11.6/1.2/1.2
98.76.5..75..498.....8.....32.......1.....4...7.6.4......95..86....8.94......67.5 ED=11.6/1.2/1.2
98.76.54.75.9....8........75.....4.632........1.6........8...9....57.6.4....4987. ED=11.6/1.2/1.2
98.76.54.75.9....8......9..5.....4.632........1.6........8...9....57.6.4....4987. ED=11.6/1.2/1.2
98.76.5..7.4.......56.......9..8..5..6745......89.6.4....89.3.2....7...1......... ED=11.6/1.2/1.2
98.76.5..7.45.......6......5.89.6.4..9..8..5..674........89.3.2....7...1......... ED=11.6/1.2/1.2


Denis I am using your program to rate my previous 11.8 puzzles from august 13th. It takes a long time, I think I have too many print options enabled, have not figured that out yet. I am now rating them with the T&E(3) option, so far the ones I did are all T&E(BRT,3). But I wonder, if a puzzle is in T&E(2) or T&E(1), can it be rated by T&E(3)? In other words, should one try to solve the puzzle in T&E(1) first, if not solved, in T&E(2), if not solved still, in T&E(3)? I am not sure yet my ratings of T&E(BRT,3) are right. And is there an option to print to file? (if possible, to print just the rating, not all the intermediate steps).
Paquita
 
Posts: 132
Joined: 11 November 2018

Re: The hardest sudokus (new thread)

Postby denis_berthier » Tue Aug 22, 2023 10:32 am

Paquita wrote:Denis I am using your program to rate my previous 11.8 puzzles from august 13th. It takes a long time, I think I have too many print options enabled, have not figured that out yet. I am now rating them with the T&E(3) option, so far the ones I did are all T&E(BRT,3). But I wonder, if a puzzle is in T&E(2) or T&E(1), can it be rated by T&E(3)? In other words, should one try to solve the puzzle in T&E(1) first, if not solved, in T&E(2), if not solved still, in T&E(3)? I am not sure yet my ratings of T&E(BRT,3) are right. And is there an option to print to file? (if possible, to print just the rating, not all the intermediate steps).


Hi Paquita,
When you use T&E, the following options will avoid useless printings (and they will marginally spare some computation time):
Code: Select all
 (bind ?*print-actions* FALSE)
 (bind ?*print-levels* FALSE)
 (bind ?*print-main-levels* FALSE)
 (bind ?*print-solution* FALSE)
 (bind ?*print-hypothesis* FALSE)
 (bind ?*print-RS-after-Singles* FALSE)
 (bind ?*print-RS-after-whips[1]* FALSE)
 (bind ?*print-final-RS* FALSE)
 (bind ?*print-hypothesis* FALSE)


As with any other choice of "rules", T&E - be it depth 1, 2 or 3 - will give a solution if it is enough to solve the puzzle.
Which means that a puzzle solved with T&E(n) will also appear to be solved with T&E(n+1).

When you are solving series of puzzles, you can use functions like "solve-n-grids-from-text-file".
Those that are solved will appear in variable ?*solved-list*
It doesn't make the computations shorter, but it relieves you from the pain of monitoring them.

As I said before, my T&E procedure (and anything procedural in CSP-Rules) is slow, basically because procedural code in CLIPS is interpreted instead of compiled (this is not technically quite exact, but it's the simplest way of understanding what happens).

I'll soon (in a few days) release an improved version of:
- the T&E(n, n=1,2,3) that allows some gain in resolution time (30 to 50%) by allowing to use a precomputed solution (obtained e.g. by DFS) in order to avoid testing some candidates in it ;
- new "solve-file-knowing-solutions" functions that will allow to do this for full files of puzzles.


From a practical POV, if you know the SER of your puzzles is 11.6+, the highest likelihood is T&E(2) - unless you have some reason to think they may be in T&E(3). T&E(1) is totally unlikely.
Note also that it takes in general much more time to not-solve a puzzle than to solve it. In case of a doubt, it's safer to try the high value first.
.
denis_berthier
2010 Supporter
 
Posts: 4190
Joined: 19 June 2007
Location: Paris

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 22, 2023 2:32 pm

Here are the results of rating the 898 minimals linked previously: google sheet

denis_berthier wrote:
mith wrote:
The small deviation in the rating of different morphs in the high ratings belongs to the second category. This is something I already often had to write. In the low ratings, the search of a contradiction is exhaustive. In the very high ratings, it is not. As a consequence, if the order of the candidates in the search is changed, the process can be stopped with a different rating. This is what happens with different morphs. Which morph will give the highest rating is unpredictable;

Given improvements in computing power, I would suggest this voluntary limit category *could* be removed and puzzles above some threshold be re-rated accordingly. (I haven't looked at the serate code enough myself to have an idea how much of a task that would be, never mind whether it's worth the effort in the first place.)


In order to avoid misleading newbies into hopeless coding efforts, let it be clear that dealing with this problem on non-exhaustive search in SER or with any bug encountered in it, will not free SER of its possible variations with morphs. I've explained the reasons in a previous post: the absence of the confluence property for the set of rules defining a fixed SER rating.


Sure, I absolutely agree that it will not remove all morph dependencies for all puzzles. It will however improve the consistency in rating these +DFC puzzles specifically. (And again, debatable whether that's worth any effort at all.)

champagne wrote:Hi mith,

I don't know what is the average rating time for these puzzles with so many clues, but in the 19-22 area, a puzzle rated around 11.7 could require hours of computation with serate. This has been the reason to create skfr having a quick pre rating for the interactive game.


I should clarify that I am using the multithreading added in SukakuExplainer. Running this single threaded would of course take much longer for an individual puzzle (though I could rate a batch in parallel).

That said, my PC tends to rate these high clue count puzzles in under a minute on average (I have a Ryzen 9 16-core). I've started running it on the 70 11.8+ in ph2010 for comparison, here are all the 11.9s:

Code: Select all
[98.7.....7.....6....6.5.....4...5.3...79..5......2...1..85..9......1...4.....3.2.]     11.9    11.9    11.8    DCFC+DFC
time 121.67701625823975
[98.7.....6.....87...7.....5.4..3.5....65...9......2..1..86...5.....1.3.......4..2]     11.9    11.9    11.6    DCFC+DFC
time 63.65245199203491
[12..3....4....1.2...52..1..5..4..2......6..7......3..8.5....9....9.7..3......8..6]     11.9    11.9    11.3    DCFC+DFC
time 54.75340390205383
[.......39.....1..5..3.5.8....8.9...6.7...2...1..4.......9.8..5..2....6..4..7.....]     11.9    11.9    11.3    DCFC+DFC
time 141.90189957618713
[.2.4...8.....8...68....71..2..5...9..95.......4..3.........1..7..28...4.....6.3..]     11.9    11.9    9.9     DCFC+DFC
time 116.47096490859985
[........1....23.45..51..2....25...1..6...27..8...9......42....7.3...6...9...8....]     11.9    11.9    9.9     DLFC+DFC
time 70.52767658233643
[12.3.....4.5...6...7.....2.6..1..3....453.........8..9...45.1.........8......2..7]     11.9    11.9    2.6     DCFC+DFC
time 160.4379918575287
[5.6...7...1.3.....8...5.9.....1...2.....8.6.7.....2.4.7...9...6.3...42....5......]     11.9    1.2     1.2     DCFC+DFC
time 81.70554542541504
[..3..6.8....1..2......7...4..9..8.6..3..4...1.7.2.....3....5.....5...6..98.....5.]     11.9    1.2     1.2     DCFC+DFC
time 57.40283441543579
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby dobrichev » Tue Aug 22, 2023 2:45 pm

champagne wrote:Some members of the forum having a good practice of JAVA fixed some bugs in serate, but nobody, AFAIK, tried to reach a rating morph neutral although this oddity has been source of trouble in many case in the game.

At least I tried.

Essentially, the problem is when at some point several elimination techniques have exactly the same minimal rounded rating, calculated by the non-linear expressions spread in the code.
The first elimination step wins and determines later resolution path.
"First" is determined by some hash value and here the morphs play their role.

This could be fixed, when different possible techniques are internally rated differently, so that the techniques with the exactly same final rounded rating can be ordered by this hidden property.
Finally, if several techniques still can't be distinguished by their combination of rounded and hidden rating, then all they must be applied at once, before restarting the process in finding the next cheaper elimination.

Of course, there is no consensus on rating changes, and no volunteers to code the new rating rules.
dobrichev
2016 Supporter
 
Posts: 1862
Joined: 24 May 2010

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 22, 2023 3:43 pm

dobrichev wrote:
champagne wrote:Some members of the forum having a good practice of JAVA fixed some bugs in serate, but nobody, AFAIK, tried to reach a rating morph neutral although this oddity has been source of trouble in many case in the game.

At least I tried.

Essentially, the problem is when at some point several elimination techniques have exactly the same minimal rounded rating, calculated by the non-linear expressions spread in the code.
The first elimination step wins and determines later resolution path.
"First" is determined by some hash value and here the morphs play their role.

This could be fixed, when different possible techniques are internally rated differently, so that the techniques with the exactly same final rounded rating can be ordered by this hidden property.
Finally, if several techniques still can't be distinguished by their combination of rounded and hidden rating, then all they must be applied at once, before restarting the process in finding the next cheaper elimination.

Of course, there is no consensus on rating changes, and no volunteers to code the new rating rules.


This sounds like a different issue from the one champagne was talking about previously (of the contradiction search not being exhaustive). Or is this part of what is causing the search to not be exhaustive?

I'd be curious if you have an example of a puzzle where this hash "tiebreaker" is causing the puzzle to rate differently afterward. I would think it could never cause the rating to go up on its own - that is, if this tie due to rounding occurs at a 11.6 step, whichever step is applied should leave the alternative(s) at 11.6 or less - other than considerations like uniqueness techniques or somehow affecting the non-exhaustive nature of the search. But I definitely don't have a clear picture of what goes into the "length" calculation for chains.
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby champagne » Tue Aug 22, 2023 3:45 pm

dobrichev wrote:Of course, there is no consensus on rating changes, and no volunteers to code the new rating rules.


Hi mladen,

And this makes sense.
No player is interested in puzzles with high ratings if there is no "exotic" / quick way to solve it.
The only interest for an agreed rating of hard puzzles would be to define the hardest ones.
An I see no chance to have such a rating.
champagne
2017 Supporter
 
Posts: 7453
Joined: 02 August 2007
Location: France Brittany

Re: The hardest sudokus (new thread)

Postby champagne » Tue Aug 22, 2023 3:52 pm

mith wrote:
This sounds like a different issue from the one champagne was talking about previously (of the contradiction search not being exhaustive). Or is this part of what is causing the search to not be exhaustive?

I'd be curious if you have an example of a puzzle where this hash "tiebreaker" is causing the puzzle to rate differently afterward. I would think it could never cause the rating to go up on its own - that is, if this tie due to rounding occurs at a 11.6 step, whichever step is applied should leave the alternative(s) at 11.6 or less - other than considerations like uniqueness techniques or somehow affecting the non-exhaustive nature of the search. But I definitely don't have a clear picture of what goes into the "length" calculation for chains.


The truth is likely somewhere in the middle.
I am quite sure that the search is not exhaustive when you reach the nested levels, but I can have missed other sources of deviations in the morphs (and yes, the UR/UL priority can be another source of deviation, much bigger, but this is not so common in the very high ratings)

And I started the rating of one of your puzzles (likely golden nugget) to see what is the run time here
champagne
2017 Supporter
 
Posts: 7453
Joined: 02 August 2007
Location: France Brittany

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 22, 2023 4:28 pm

For the 70 puzzles, had an average of 91.56, max 235.05, min 35.08

The one that took the longest was
Code: Select all
6.......2.9.4...5...1...7...5..84.......2.......3.5.4.2.....6...3...9.8...7.....1;11.80;11.80;11.10;tax;coloin-04-10;13;21;


I'll try rating some later with the pattern game serate to compare.
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby champagne » Tue Aug 22, 2023 4:40 pm

mith wrote:For the 70 puzzles, had an average of 91.56, max 235.05, min 35.08

The one that took the longest was
Code: Select all
6.......2.9.4...5...1...7...5..84.......2.......3.5.4.2.....6...3...9.8...7.....1;11.80;11.80;11.10;tax;coloin-04-10;13;21;


I'll try rating some later with the pattern game serate to compare.



on my desk computer, nearly inactive out of this run, this one
.......39.....1..5..3.5.8....8.9...6.7...2...1..4.......9.8..5..2....6..4..7.....
took 30 mn using the pattern game version of Sudoku Explainer
champagne
2017 Supporter
 
Posts: 7453
Joined: 02 August 2007
Location: France Brittany

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 22, 2023 4:49 pm

That seems pretty reasonable; my CPU can go up to 32 threads but it probably averages around 1200-1500% CPU (peaking around 2600%) running the multithreaded version for the high SER puzzles. If the single-threaded runtime for Golden Nugget were exactly 30 minutes and your CPU's speed is comparable, it averaged 1268% here; not a perfect comparison obviously but it should give an idea how much the multithreading matters.

I'd guess it taken about two-three months of my PC's time to rate all of the 11.6+ expanded forms in the depth 3 database (232529). Hard to say how making the contradiction search exhaustive would affect that. Regardless, it's would be a couple orders of magnitude more effort to re-rate all of them now compared to when I started the depth search.
mith
 
Posts: 996
Joined: 14 July 2020

PreviousNext

Return to General