The hardest sudokus (new thread)

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 29, 2023 3:05 pm

eleven, I think we are in agreement that it would be a good thing to have a clean list of puzzles which do not solve/simplify with human-approachable exotic patterns, so that those who are interested can try to find new approaches.

It's also true that there has been an explosion of high SER puzzles found due to neighborhood searching the T&E(3) space. This is not surprising, given that these puzzles tend to be at higher clue counts and have a fairly isolated pattern at the core which makes it more likely generated neighbors are equally "hard" (for SER or T&E).

I disagree that the discovery of this pattern is what changed things. It may have been the case that only experts could apply things like MSLS and Exocets in the past; the development of SET has changed that considerably. Perhaps not on this specific forum, but elsewhere.

For myself, I have not been searching for "new extremely hard puzzles" at all; I have been searching for T&E(3) puzzles, some of which may be extremely hard. That's why I have been posting those results in the other thread. (The only reason I have been posting here at all lately is to help clarify what has already been done in this field so those who are using my results as seeds aren't repeating work.) But I certainly don't fault anyone for posting their high SER puzzles based on those patterns here, given that the most recently published version of the ph database clearly uses SER as its standard with no mention of throwing out puzzles that simplify. And there should be a place for that (I think we can all agree that it would be noteworthy for someone to find an SER 12.0, even if that puzzle were relatively easy to solve with TH, SET, or something else). Whether that's this thread or a new one, I don't much care; threads on the internet rarely stay on topic, and the people currently contributing puzzles do not necessarily have the same goals or views on the purpose of this thread as those who are no longer contributing puzzles or even on the forum.
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby denis_berthier » Tue Aug 29, 2023 3:36 pm

.
There's one more point nobody has evoked about tridagons: the degenerate versions. As soon as a candidate is missing, the pattern:
- is in T&E(2) (as I've proven in the tridagon thread); the puzzle will therefore not be in the T&E(3) database;
- is much more difficult to find - the more so as degeneracy increases (and also depending on the number of guardians).
(However, the same ORk resolution rules apply to it.)

Until now, we have a good knowledge of the T&E(3) domain, but not a good knowledge of the T&E(2) part of the tridagon domain, let alone of the degenerate tridagon one.

Mith, what do you do when you find a puzzle in T&E(2) with a tridagon? What about the degenerate ones? Can you find them easily?
denis_berthier
2010 Supporter
 
Posts: 4236
Joined: 19 June 2007
Location: Paris

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 29, 2023 5:08 pm

My current script will find any potential trivalue oddagon, degenerate or not, useful or not (some of them have 30+ guardian candidates); it additionally highlights the "special" case (always nondegenerate by definition), which is the case that is easily spottable by a human solver (for some, there is more than one choice for the full pattern because of multiple empty diagonals in a single box, but these are nevertheless easy to find). I believe this special case always has the lowest guardian count in the puzzles I've checked; I may tweak this for the upcoming update to explicitly find patterns with the best case guardian cell/candidate counts and ties, to confirm whether this is always true.

I've never run it on the T&E(2) puzzles, but it would certainly be feasible to do so at the highest SER and filter out puzzles with low guardian counts, degenerate or not, that sort of thing.
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 29, 2023 5:24 pm

(I've been sick the past few days btw, won't be making any progress on any of this coding until I'm feeling better.)
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby Paquita » Tue Aug 29, 2023 5:53 pm

Denis

Quite right. I thought it might help mith if I gave my file with non-T&E(3) puzzles, but there were 2 errors in it. I should have contributed some puzzles to JPF that I did not; and there were some non-minimal puzzles in there (this file is the result of scraping this forum plus the ph2010 database : 11.6 plus.)

Other than that, I think it is up to date, anyway I have a new version without these two errors, but if it is not useful anyway I do not need to publish it.
Paquita
 
Posts: 132
Joined: 11 November 2018

Re: The hardest sudokus (new thread)

Postby eleven » Tue Aug 29, 2023 7:05 pm

Mith,

you repeated, that SET can be easily spotted and is a mighty technique, but i never saw a paper or thread, where this would be defined and described. It seems, that it is just a general base/cover method (Set Equivalence Theory) with a lot of special applications, presented in long youtube videos.
The main tool of SET users seems to be XSudo to check the correctness of complex moves.
Is it implemented in any solver ? Are there any statistics, how many of the old hardest puzzles could be essentially simplified ? More than a small part ?

So i don't think, you can compare it to other well defined techniques and by far not with the simple TH pattern. Hopefully someone will add TH to the Explainer techniques (second step including the remote triple, if available) and check, which part of the millions of puzzles will remain above 10.0. My guess is not more than 1 %.
eleven
 
Posts: 3173
Joined: 10 February 2008

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 29, 2023 7:48 pm

I'll reference one of my "Potential Hardest" series in the Puzzles forum (note the quotes, recognizing that high SER does not = hard in all cases):

"Potential Hardest" 12

Viewing this as MSLS/Multifish involves looking at a lot of pencilmarked cells, finding truths/links. Here's how SET would approach it:

f-puzzles

SET at its core is simply the observation that if you "add" and "subtract" houses, the resulting "positive" and "negative" partitions are equivalent modulo complete sets of 1-9. For the example above, you could make r15678 pink (5 sets), c3489 (4 sets) blue, and erase the overlap. The resulting partition gives pink as 5 sets - overlap, and blue as 4 sets - overlap, so:

pink = blue + 1 set of 1-9

There are 8 digits from 3456 given in blue, so we need 12 (8 + 4 from the extra set) in pink, and there are 12 empty cells. Therefore all the empty pink cells are from 3456 (and all the empty blue cells are from 12789), bte
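The equivalence itself can be checked mechanically. The sketch below is only an illustration: the puzzle behind the link is not reproduced in this post, so it uses a made-up solved grid (a shifted-row pattern, `example_solution`, purely hypothetical) and the same choice of r15678 as pink and c3489 as blue. The function name `set_partition` is likewise an assumption, not anything from an existing solver.

```python
from collections import Counter


def example_solution():
    """Hypothetical solved grid built from a shifted-row pattern.

    Not the puzzle discussed above; any valid solution grid would do.
    """
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
            for r in range(9)]


def set_partition(grid, pink_rows, blue_cols):
    """Count digits in the pink and blue regions after erasing the overlap.

    pink = cells in the chosen rows but not in the chosen columns
    blue = cells in the chosen columns but not in the chosen rows
    """
    pink, blue = Counter(), Counter()
    for r in range(9):
        for c in range(9):
            in_pink, in_blue = r in pink_rows, c in blue_cols
            if in_pink == in_blue:
                continue  # overlap cell, or a cell in neither family
            (pink if in_pink else blue)[grid[r][c]] += 1
    return pink, blue


grid = example_solution()
# r15678 and c3489, written as 0-based indices
pink, blue = set_partition(grid, {0, 4, 5, 6, 7}, {2, 3, 7, 8})
surplus = {d: pink[d] - blue[d] for d in range(1, 10)}
print(surplus)  # each digit appears once more in pink than in blue
```

Only the row and column constraints are used here, which is why the same count works for any latin square; adding boxes to either side would need the overlap cells to be weighted rather than simply erased.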

The deduction is the same as the equivalent MSLS, just much easier to see because there is no need for pencilmarking anything else in the grid. The row/column variety is usually quite easy to spot when it exists - you're just looking for digits that appear together frequently. I'm intending to write code at some point to follow this heuristic for finding them (much like the TH special case).

SET is at this point common in puzzles on the Cracking the Cryptic channel as well as other solving channels, particularly in variant sudoku (where often you don't necessarily need to know the specific digits and can instead use the sums), though it was originally formulated based on classic sudoku (and applies to all latin square puzzles in row/column form; with sudoku you can of course add boxes as well).

I've done some stats on ph puzzles solvable with rank 0 techniques before, but it's been a while. (Unfortunately, the code used for that is quite slow, because the rank 0 thing is not the focus, only a side result.) At the time I checked, there were four 11.7s in this category:

....5.7.....9.1.6.........52....8.16.......2..3....4.7.7..4......82.9...9.68.....;........1.....1.2....34.5........6....6.7.4.5.8.....92.2...9...5.37.....7..63....;11.7;11.7;9.4;col 12_12
........1.......2...3..45......5.....6..374..8..1...9...4.7.6...76..5...9.......2;........1.......2...3..45......5.....6..374..8..1...9...4.7.6...76..5...9.......2;11.7;11.7;8;  dob 12_12_03
........1.....2.3....45.6....1....72..6...8...9..8.4....7..3...5..89....96.5.....;........1.....2.3....45.6....1....72..6...8...9..8.4....7..3...5..89....96.5.....;11.7;11.7;2.6;dob 12_12_03
........1.....2.3....45.6....1....27..6...8...9..8.4....7..3...5..89....96.5.....;........1.....2.3....45.6....1....27..6...8...9..8.4....7..3...5..89....96.5.....;11.7;11.7;2.6;dob 12_12_03


(Another 20 11.6s. And I think I've found some 11.8s since.)

This only accounts for fully rank-0 solvable puzzles, though. I would need to write a new script to combine SER and SET, just as I would for SER and TH.

What is less explored in this field is ASET (Almost SET), which yields a set of candidates exactly one of which is true (much like the guardians of a trivalue oddagon providing a set of candidates at least one of which is true). It is possible to do OR-branching or XOR-branching on this, but I don't really have an example of a puzzle which solves nicely in this way. That said, this would be an interesting thing to check for in the ph database as well.
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby mith » Tue Aug 29, 2023 8:29 pm

Here's an example of how you might find a JExocet using SET:

Golden Nugget

We might first notice that only 1247 appear in c1246. We'll take these as pink and r347 as blue (containing only 35689). This is a valid SET partition, and again pink = blue + 1 set of 1-9. However, unlike in the previous example, this one has two degrees of freedom (we need to place 14 digits from 35689 - the 9 in blue + 5 from the extra set - but we have 16 empty pink cells).
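The counting in this step can be reproduced with a few lines. This is a rough sketch under two assumptions: the grid string is Golden Nugget as it is commonly circulated (it is not quoted in this thread), and the pink columns are taken as columns 1, 2, 4 and 6, which is the choice for which the quoted counts (9 blue givens, 16 empty pink cells) come out. `degrees_of_freedom` is a made-up helper name.

```python
# Golden Nugget as commonly circulated (an assumption; not quoted in this thread)
GOLDEN_NUGGET = (
    ".......39"
    ".....1..5"
    "..3.5.8.."
    "..8.9...6"
    ".7...2..."
    "1..4....."
    "..9.8..5."
    ".2....6.."
    "4..7....."
)


def degrees_of_freedom(puzzle, pink_cols, blue_rows, blue_digits):
    """Return (blue givens, digits needed in pink, empty pink cells, slack).

    pink = chosen columns minus the overlap, blue = chosen rows minus the overlap.
    'needed' is the minimum number of blue digits that must land in pink:
    the blue givens plus one copy of each blue digit per extra pink set.
    """
    blue_given = 0
    pink_empty = 0
    for r in range(9):
        for c in range(9):
            in_blue, in_pink = r in blue_rows, c in pink_cols
            if in_blue == in_pink:
                continue  # overlap cell, or a cell in neither family
            ch = puzzle[9 * r + c]
            if in_blue and ch in blue_digits:
                blue_given += 1
            elif in_pink and ch == ".":
                pink_empty += 1
    extra_sets = len(pink_cols) - len(blue_rows)
    needed = blue_given + extra_sets * len(blue_digits)
    return blue_given, needed, pink_empty, pink_empty - needed


# columns 1, 2, 4, 6 and rows 3, 4, 7, written as 0-based indices
result = degrees_of_freedom(GOLDEN_NUGGET, {0, 1, 3, 5}, {2, 3, 6}, "35689")
print(result)
```

A slack of zero would give the direct "all empty pink cells are blue digits" deduction of the previous example; a positive slack is what makes a further idea, such as the exocet base cells, necessary.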

Instead, we see that box 3 also has a lot of digits from the "blue" set, so we can try adding that in. Now pink = blue (with the dark blue cells counted twice). This now has three degrees of freedom, so it's worse in that sense, but now we identify the exocet base cells and make use of them.

The orange cells are from 1247, all "pink" digits. 1247 appear twice each in pink, so we need them to appear twice each in blue (remembering that dark blue is doubled, so once there suffices). However, the orange cells see all but two of the empty blue cells (purple, the target cells), so whichever digits those are they must appear in purple or we won't have two copies in blue. Therefore -3r7c9. (I haven't looked at applying additional exocet rules using this, but they should translate.)
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby marek stefanik » Tue Aug 29, 2023 10:25 pm

To throw my two cents in:

I generally agree with eleven in that the search for hard puzzles should account for all known techniques and not just a selected few.
To better predict which puzzles get broken apart by exotic patterns, this might help:
Alongside the T&E depth, we can consider another simple heuristic (as far as describing it goes, it might not be simple to calculate): the size (let's say measured as the number of truths) of a pattern, i.e. in TH 11 cells are required to prove eliminations elsewhere.
While it doesn't say much for puzzles requiring long but simple chains versus puzzles solvable with relatively short but complex nets, it seems weird to accept puzzles solved by an 11-truth pattern as 'hardest sudokus'.

I would argue that SET isn't the all-simplifying POV people consider it, and to show it, here is how I would spot the patterns in the above two puzzles:
All ways start with identifying 3456 in c3489 and the remaining digits in r1(5)678.
Let me first describe the SET again: You count the 3456s in the cells included only in the columns and how many of them you can fit in the cells only in the rows, and get an equality.
Now, how is it better than the multifish way: you check how many 3456s you could fit into the rows with only cell, column, and box restrictions (I usually ignore the boxes at first and do a recount if I get close). You get 16 (20), which is the number of them you need.
Or the MSLS way: you look at the intersection of the rows and columns and digit by digit count (even looking at the givens, no pencilmarks needed) how many times can each of them appear in the intersection. Again, you get exactly as many as you need (I used to spot MSLSs this way when I joined the forum, but have since switched to multifish. Though i have seen a puzzle where you had to check the remaining digits and use one of them, usually you only need the 4 or 5 digits, which is faster to count).

As far as JE go, with the most stereotypical setups it could hardly get easier: you see r347c35, check the digits in between them, see that 1247 can only appear twice each in r347c123456, you check the companion cells, they give you r12c7, r56c9, r78c8 as possible bases, you check those, done. I'm sure you could teach a monkey to find them if you got bored and wanted the monkey to get bored as well. It's also useful to check for cases with exactly one digit missing (and/or with only 3 possible base digits), there are various setups like that, but more than that is almost never useful.

Marek
marek stefanik
 
Posts: 360
Joined: 05 May 2021

Re: The hardest sudokus (new thread)

Postby eleven » Tue Aug 29, 2023 11:17 pm

Thanks for the insights, i am really interested, but in this thread i want to insist on my point:
Mith is talking of 1 of 9 11.9's, 4 of 70 11.7's and 20 of 372 11.6's (if i counted right) in the old hardest database, which can be solved by experts in a way, which is manually reproducible.
I am talking about millions of puzzles in the TH database (the big majority) with same ratings, which can be solved manually by any experienced manual solver, who knows the TH pattern.
eleven
 
Posts: 3173
Joined: 10 February 2008

Re: The hardest sudokus (new thread)

Postby mith » Wed Aug 30, 2023 12:31 am

marek, of course there are equivalent ways of looking at them. I can't speak for everyone as far as which form is easier - IMO the SET formulation has advantages (one of which is that its basis is so foundational: the equivalence property is always true of the solved grid, even for choices of partition that are completely useless to a solver; just as every trivalue oddagon cell pattern is not 3-colorable regardless of whether it's a good choice of cells). Regardless, these are easy to spot if you know about them, and certainly no longer the realm of only "experts".

eleven, I would rephrase that as:

I am giving as an example of an alternate approach 1 of 9 11.9s with an exocet (IIRC 5 or 6 have some exocet?), and I am stating that 4 11.7s and 20 11.6s are solvable with SET + basics only. I am clearly not claiming that this is an exhaustive study of which puzzles in the old database have some SET (or exocets) which simplify them, nor by what degree they are simplified. (I don't think Golden Nugget is simplified much at all by that one elimination, for example; on the other hand, there are definitely examples of puzzles with high SER which are solvable by exocets + basics, and at least a few have been posted in the Puzzles forum.)

You are talking about millions of puzzles in the depth 3 database, which makes no claim to be a database of hardest puzzles by any measure other than depth. You may well be right that most of them are solvable with some "basic" TH strategy (only about a third have TH-1, but I've never done a check on the exact properties of those with 2 guardians, say, never mind finding the SER after applying TH) - I think you are likely overestimating, but we'll eventually see. You're welcome to check out the other thread, where there is a bunch of data on guardian counts for all the min-expands in the last update.

And I think everyone in this thread agrees that a puzzle solvable by TH-1 + basics is not a candidate for being considered the "hardest" anything. All I am saying is the effect TH has on some individual puzzle needs to be determined before throwing it out, and the same is true of any other exotic technique.
mith
 
Posts: 996
Joined: 14 July 2020

Re: The hardest sudokus (new thread)

Postby denis_berthier » Wed Aug 30, 2023 1:58 am

.
Sometimes, I have the impression that people live in an alternate reality.
Many months ago, I provided detailed results about the T&E(3) database, at a time when it had only 63137 min-expands (see CSP-Rules User Manual).
Even with a very broad interpretation of the word trivial (in which chains of length 5 would be trivial), there are only 25728 puzzles that can be "trivially" solved after applying the basic tridagon rule with only one guardian. From what I learnt at elementary school, the "vast majority" thus reached is of only 40.7%.
Even if you extend the meaning of trivial to chains of length 8 (which is far beyond the capabilities of normal human solvers), only 30528 puzzles with 1 guardian are "trivially solved" - which makes a "vast majority" of 48.3%.
I apologise in advance - at elementary schools here, they teach only European maths, maybe you in the US have different ways of computing ratios.
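For anyone wishing to redo the arithmetic, the two ratios follow directly from the three counts quoted above; a minimal check, assuming nothing beyond those numbers (the variable names are made up for the illustration):

```python
MIN_EXPANDS = 63137    # T&E(3) min-expands at the time of the study
SOLVED_LEN_5 = 25728   # 1-guardian puzzles solved with chains of length <= 5
SOLVED_LEN_8 = 30528   # 1-guardian puzzles solved with chains of length <= 8


def share(part, whole):
    """Percentage of 'whole' represented by 'part'."""
    return 100.0 * part / whole


share_len_5 = share(SOLVED_LEN_5, MIN_EXPANDS)
share_len_8 = share(SOLVED_LEN_8, MIN_EXPANDS)
print(f"{share_len_5:.2f}%  {share_len_8:.2f}%")
```

Both shares stay below one half; the second one sits between 48.3% and 48.4%, so the last digit depends on whether one truncates or rounds.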

[Edit]: the same reference gives the numbers for 2 guardians, for 3, ..., for 5.
denis_berthier
2010 Supporter
 
Posts: 4236
Joined: 19 June 2007
Location: Paris

Re: The hardest sudokus (new thread)

Postby denis_berthier » Wed Aug 30, 2023 2:06 am

Paquita wrote: I thought it might help mith if I gave my file with non-T&E(3) puzzles, but there were 2 errors in it. I should have contributed some puzzles to JPF that I did not; and there were some non-minimal puzzles in there (this file is the result of scraping this forum plus the ph2010 database : 11.6 plus.)
Other than that, I think it is up to date, anyway I have a new version without these two errors, but if it is not useful anyway I do not need to publish it.

Hi Paquita,
As you've already done the work, it's better to update what you've published.
I can't speak for anybody else, but I think it's useful to have it. It seems mith will need some time before publishing his extensive version.

BTW, how did you check that puzzles are in T&E(2)?

[Edit]: it may also be useful to reorder the list, so that the puzzles appear according to the order of their SER, as usual. Maybe the problem is different: when I download the version available today, I get 53xxx puzzles. You probably copied the same list several times.
denis_berthier
2010 Supporter
 
Posts: 4236
Joined: 19 June 2007
Location: Paris

Re: The hardest sudokus (new thread)

Postby champagne » Wed Aug 30, 2023 9:07 am

eleven wrote: Thanks for the insights, i am really interested, but in this thread i want to insist on my point:
Mith is talking of 1 of 9 11.9's, 4 of 70 11.7's and 20 of 372 11.6's (if i counted right) in the old hardest database, which can be solved by experts in a way, which is manually reproducible.
I am talking about millions of puzzles in the TH database (the big majority) with same ratings, which can be solved manually by any experienced manual solver, who knows the TH pattern.

Hi eleven,

Just something I wrote earlier, closer to the "hardest sudokus" hunt.

if the TH pattern has many guardians, the "primary hardness" can be seen as the hardest elimination leading to the last guardian. With so many clues, I have doubts that this will be a very high SER rating.
Having applied the TH rule, the SER will go down sharply as we have always seen with other exotic patterns. It does not make always the puzzle trivial, but it discards it from the list of potential hardest.

With these limits, I think that (the big majority) is ok for me.

EDIT: and for a manual solver, after the main effect of an exotic pattern, several ways to crack the puzzle not known by SER are usually available.
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: The hardest sudokus (new thread)

Postby eleven » Wed Aug 30, 2023 9:42 am

Denis,

thanks for your stats, i had missed them. Since i don't have any cpu resources, i can't check it myself. Whenever i tried randomly a puzzle from a posted list, it was easy to solve, and i thought, those you had posted in the puzzles thread, were the hardest (all were manually solvable).
So "only" 40% of 3 mio (?) puzzles with extremely high ratings are easy to solve and, as champagne noted, a big part of the rest can be (easily) essentially simplified to a lower rating.

@mith: I don't want to criticise your great work or your database. Obviously you are not responsible for the fact, that TH puzzles are hopelessly overrated.
Concerning exocets, they are well known on this forum, because they come basically from the work of champagne and David Bird. If SET offers a new way to find them easier manually, it's great. But it does not change the fact, that only a small part of the old hardest list could be essentially simplified with this technique.

Ok, i have tried it again. It seems, that there is little interest, that anyone would do the work to "repair" the misleading ratings in order to get a new hardest list, which earns its name.
eleven
 
Posts: 3173
Joined: 10 February 2008
