Rating rules / Puzzles. Ordering the rules

Advanced methods and approaches for solving Sudoku puzzles

Postby Allan Barker » Tue Aug 26, 2008 10:08 pm

On top of it, nothing has been done to account the fact that any puzzle having the SK loop is “relatively” easy to solve.

Hi, I would like to inject my experience with such puzzles, which has left me with a different 'feeling' about SK loops. A careful read should show my experience does not necessarily contradict the above, I am just looking for the source of this difference.

At least for EM and Strmckr's puzzle, the SK loops and a few related eliminations are relatively easy first steps after which, both puzzles become much more difficult. In the case of Strmckr's puzzle, the difficulty becomes similar to the more difficult eliminations in GN, making this puzzle much tougher than EM.

It might be that the presence of SK loops (not the loops themselves) indicates the possibility of other eliminations and assignments based on symmetry, which is not reflected in other methods (and in various rating methods).

Or should the ratings of these puzzles go down after the SK loop eliminations and assignements?

Just interested because I shiver when I see an SK loop because I know what's next.
Allan Barker
 
Posts: 266
Joined: 20 February 2008

Postby denis_berthier » Wed Aug 27, 2008 1:18 am

We can observe that no universal rating has yet been devised and I think this will remain the case for a long time.

I think we should ask a few basic questions. I'll leave most of them open.

1) What general purpose do we assign to a rating?
Should it be able to rate all the puzzles? Why?
Should it be able to rate all the puzzles that can currently be solved by a "normal" human player?
Or do we want only a rating for the most common puzzles, from easy to diabolical - for commercial purposes?
On the contrary, do we want a specific rating for exceptionally hard puzzles - for research purposes?
(In this open list, where do we put SER? Commercial? I mean: its declared validity is universal, but isn't its real validity limited to commercial puzzles?)


2) Shall we be happy if the rating satisfies its general purpose or, in addition, should it be based on well defined general principles?
This isn't only rethorical. A rating may be useful even if it isn't based on clearly defined principles but on experience with resolution. Isn't this the case for SER? (I'm really asking; I really don't know.)


3) Inthe second case, what constraints do we impose on the rating? E.g.:
- do we want the rating to be invariant under permutations of rows, columns, floors and towers and under row/column symmetry (SER doesn't even satisfy this very minimal requirement);
- do we want it to be invariant under supersymmetry (so that e.g. fish=naked set of the same size);
- on the contrary, do we want it to be closer to human perception (which would make it a very hard research topic in our current state of knowledge)?


4) Should the rating be based on a well defined backbone?
I mean: as any rating has to be based on a particular set of rules, can we choose a basic set of rules (e.g. some fixed type of chains of increasing length) that will serve as a backbone and define a scale for the rating, all the other rules being then integrated within this hierarchy.
That's how I've proceeded for the NRCZT-rating and that's how Allan could provide another rating with broader scope if he chose some ordering of his second order patterns.
Notice that we have no result on a potential rating based on AICs with ALSs. Such AICs could have been the backbone of SER but, instead, they are given arbitrary values and are arbitrarily mixed with lots of other unrelated patterns.


5) Given two ratings as above, with different backbones, how do they compare?
We've seen that, for non exceptionally hard puzzles, the most common ratings are reasonably well (but not strongly) correlated.


6) How do we assess the validity of a rating?
As there is no universal rating, this is impossible in an absolute sense. One may compare a rating with one's intuitions, but the result is very likely to be different for different persons.
However, given several formal ratings based on different approaches, if they are strongly correlated, it is likely that they all capture some important aspects of the puzzles. This is some statistical cross validity check.
Of course, statistical cross validity itself has limited validity. But the clearer the basic principles of each rating the better we can analyse the discrepancies between the ratings on particular puzzles.


7) When a new resolution rule appears (e.g. Steve's), how can it be taken into account in the existing ratings?
It can easily be observed that the addition of a really new resolution rule radically modifies any rating - in the sense that it reverses the relative places of many puzzles.
But a single resolution rule with no parameter can hardly be the basis for a new rating if it is not itself included in a broader parameterised set of rules. As of now, what such a set should be for Steve's pattern remains unclear as it is only inherent to the exceptionally symmetric EM family.
Last edited by denis_berthier on Tue Aug 26, 2008 10:20 pm, edited 1 time in total.
denis_berthier
2010 Supporter
 
Posts: 4234
Joined: 19 June 2007
Location: Paris

Postby champagne » Wed Aug 27, 2008 2:15 am

[quote="Allan Barker"]
At least for EM and Strmckr's puzzle, the SK loops and a few related eliminations are relatively easy first steps after which, both puzzles become much more difficult. In the case of Strmckr's puzzle, the difficulty becomes similar to the more difficult eliminations in GN, making this puzzle much tougher than EM.
.


I would have a different wording, but likely with the same background.

EM, after the SK loop has been cleared, is level 3 in my process. Strmckr’s requests additional steps at level four.

Nevertheless, in both cases, the overall path is much simpler than for SP,GN and we are to your second point :

[quote="Allan Barker"]

It might be that the presence of SK loops (not the loops themselves) indicates the possibility of other eliminations and assignments based on symmetry, which is not reflected in other methods (and in various rating methods).
.


I already gave my own feeling.

Existence of the SK loop provides a two belts structure opening doors for new clearings. It is enough to make it solvable in an easier way than expected.

Most of the puzzles deviating from the ratings have such specificities. The small example I gave has a specific pattern in column c3 that can explain why there is a lot of ways to crack it in once.

Players detect very fast now such patterns and produce very nice solutions.

[quote="Allan Barker"]

Or should the ratings of these puzzles go down after the SK loop eliminations and assignments?

Just interested because I shiver when I see an SK loop because I know what's next.



The problem with specificities is that there are a lot of them. Rating means running a program. Even if the specificity is well identified and easy to classify, it can be that it does not fit with the general scope of the program. My solver for example is not prepared, for the time being, to receive the specificity I mentioned in my example.

BTW, my solver is much faster to-day giving the boring basic solution than looking for speedy cracking.

Champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Postby denis_berthier » Wed Aug 27, 2008 2:46 am


Human vs computer solving and rating


One common feature of existing ratings is that they are produced by a computer, based on fixed set of rules.
A human solver can dynamically combine the resolution rules from a given fixed set (e.g. using my partial chains theorems, if the set is NRCZT). In the 7th post of this page http://forum.enjoysudoku.com/viewtopic.php?t=5600&start=135, I've given an example of a manual solution, with the insertion of a remote pairs chain within an nrczt-chain.
Unfortunately, allowing such dynamic combinations in a solver/rater would make it very inefficient - increasing exponentially the number of patterns to be analysed in a systematic way.

The problem is that human solvers may have shortcuts to detect very special cases of general patterns whereas, as long as we have no idea of what is so easily detected by a human solver, we have no choice but having our solvers check all the cases. This is more or less what Champagne says also in the previous post.

This add one more general question to my previous list: do we care about the computation times?
denis_berthier
2010 Supporter
 
Posts: 4234
Joined: 19 June 2007
Location: Paris

Postby denis_berthier » Wed Aug 27, 2008 2:56 am

coloin wrote:Solving methods - such as using double [or triple!] proposition clues - which strmcker does are to be respected.


Hi Coloin,
Can you give references to such methods?
Do you include Steve's pattern in this class?
denis_berthier
2010 Supporter
 
Posts: 4234
Joined: 19 June 2007
Location: Paris

Postby champagne » Wed Aug 27, 2008 3:58 am

denis_berthier wrote:[b]
This add one more general question to my previous list: do we care about the computation times?


I would say yes if you want to qualify a lot of puzzles.

Another point is that processing time is also one way among others to "rate" puzzles. In that case, you rate togother a set of rule and the process combining them.

Anyway, in that case, I am switching from an average 0.3 second to an average 2 seconds (for puzzles around 9/9.5 SE).

Nevertheless, you can not mix both options.

In my sample ranking within each option is very similar.

Rating assumed potential hardest puzzles thru my solver requests from 2 to 120 seconds per puzzle. It starts to be a significant parameter. I red if I am right that a rating program can last hours for hard puzzles.

champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Postby tarek » Wed Aug 27, 2008 9:26 am

The following quote came from here.
On 18.0407 I wrote:This is a line chart of "Move vs. ER" of this puzzle until the singles tail:
Code: Select all
..2...6...7.1...8.4.......3.5..9..7....71.........2...3.......4.8..7..5...6...2..

Image

IMO, this method would describe the puzzle's difficulty more accurately than just a simple "Hardest step"

Chmpagne quoted a single step solving a an SER 9.0-9.5 puzzle, SE would have given the same rating if it needed 100 similar level steps to solve.

Would my suggestion above be useful for breadth 1st batch solvers ?

tarek
User avatar
tarek
 
Posts: 3762
Joined: 05 January 2006

Postby coloin » Wed Aug 27, 2008 10:00 am

There is very little that I can disagree on here, and very good questions raised by denis

SE has given us a solid rating heirarchy for puzzles - puzzles which recreational puzzlers can solve on paper.

More recently I have been generating more "extreme" puzzles, and I hope to communincate with champagne with these in the near future. Specifically I have been generating 21 clue puzzles.......a rough estimate at the limits of this platform have been ~50 million 21-clue puzzles per grid. So I dont think these hard puzzles are that rare ! Also there are hard puzzles with 21 clues [plus a single] - effectivly 22 clues therefore.......and their are ? x50 more of these puzzles - so Ive been looking at a few [million] of these too.

I can only suspect that any solver program which "solves" a puzzle has an "assumptive" method....SE no exception.....therefore other rating methods like node count in suexrat9/suexratt which effectively reflect the average number of guesses reguired to get a unique solution have a good basis. champagnes solver / processing time value must be as good if not better than most.

This is even more relevant when we have solvers like strmckr who can effectivly pinpoint a few bivalue points in a puzzle - and go on to solve quite difficult puzzles in less than 2 minutes !

http://www.sudoku.org.uk/SudokuThread.asp?fid=4&sid=10289&p1=1&p2=11

http://www.sudoku.org.uk/SudokuThread.asp?fid=4&sid=10308&p1=1&p2=11

I am sure strmckr will comment soon !

C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

ER vs Move Number

Postby Glyn » Wed Aug 27, 2008 12:32 pm

Tarek In regard to the interesting graph of ER vs move number. On the plateaus where SE is systematically working through a batch of moves of nearly level complexity (SE measure) there may be critical moves that reach the next valley quicker. Tweaking the order might produce dramatically different plots.

I wonder if a human solved puzzle with the ER computed at each stage by SE would look similar. Of course SE can only deduce an elimination in a limited number of ways which sometimes leads to it exaggerating its complexity. Also the human solver may aim at breaking the puzzle by targeting the eliminations.
Glyn
 
Posts: 357
Joined: 26 April 2007

Postby coloin » Wed Aug 27, 2008 1:00 pm

More evidence of SEs limitations I fear.......

A while back on Eureka
coloin wrote:Seriously though - I believe plugging in two candidates at a time - using vision which strmckr has - might just be the reason why he can solve puzzles so quickly !! [and he can solve puzzles very quickly]

strmckr wrote:thanks colin:)

where did u see me solve fast?

paired propagation is only the start. try , triplicate:) for most. paired is usually the result of a triple which i start with. (insert 1 clue - leaves a pair = als)

i solve most puzzles with aals and messed up chains i still cant explain there just there.

coloin wrote:Sudoku Explainer doesn’t take this double proposition pair technique into account when rating puzzles - perhaps it should.

strmckr wrote:it should do more then that.:) it should use triples as well.

then re curve the complexity rating it gets bottlenecked at 11.4 as the highest rating.

re-name many of its technique to more accurate descriptions, incorporate many of the new techniques now known like the "SK loop " or hidden pairs.


C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby champagne » Wed Aug 27, 2008 1:24 pm

tarek wrote:Champagne quoted a single step solving a an SER 9.0-9.5 puzzle, SE would have given the same rating if it needed 100 similar level steps to solve.


Ok,

To follow a later comment from coloin (quoting strmckr), it should be necessary first that the solver find that "single shot".For the time being, my solver does not see it.:D (what he did was slightly more complex, a very skilled player "abi" found the final shortcut)

I fully agree in principle with your comment. Evil puzzles have been long a sequence of single moves with a small number of paths).

May be this is a reason why I use as one parameter describing the difficulty the length of my print. (Nothing scientific, just the underlying idea that a long print reflects repetition of difficult steps.

champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: ER vs Move Number

Postby DonM » Wed Aug 27, 2008 4:47 pm

Glyn wrote:Tarek In regard to the interesting graph of ER vs move number. On the plateaus where SE is systematically working through a batch of moves of nearly level complexity (SE measure) there may be critical moves that reach the next valley quicker. Tweaking the order might produce dramatically different plots.

I wonder if a human solved puzzle with the ER computed at each stage by SE would look similar. Of course SE can only deduce an elimination in a limited number of ways which sometimes leads to it exaggerating its complexity. Also the human solver may aim at breaking the puzzle by targeting the eliminations.


Coloin: SE has given us a solid rating heirarchy for puzzles - puzzles which recreational puzzlers can solve on paper.


I find SE very useful in giving me a general idea as to the difficulty of puzzles and at an SE level (ER if you like) of 8.3 I often find very interesting puzzles and at about 8.5, IMO, a puzzle is going to almost always be a real challenge. I also find the SE-related information in this thread interesting & somewhat useful. However, although it still has its uses in giving us an idea as to a puzzle's difficulty level, I get the impression that people sometimes lose sight of the fact that SE is a fairly old, out-dated, solver and are giving it far more credit/attention than it deserves both when it comes to its solving methods & the accuracy of its difficulty rating.

Note the following from the SE FAQ!:

The Sudoku generator uses the following levels:
........
Fiendish: difficulty 2.6 to 6.0 (can only be solved by writing down candidates, but does not require Forcing Chains)
Diabolical: difficulty 6.2 or more (can only be solved with Forcing Chains)
DonM
2013 Supporter
 
Posts: 487
Joined: 13 January 2008

Postby RW » Thu Aug 28, 2008 4:28 pm

champagne wrote:
tarek wrote:Champagne quoted a single step solving a an SER 9.0-9.5 puzzle, SE would have given the same rating if it needed 100 similar level steps to solve.


Ok,

To follow a later comment from coloin (quoting strmckr), it should be necessary first that the solver find that "single shot".For the time being, my solver does not see it.:D (what he did was slightly more complex, a very skilled player "abi" found the final shortcut)

Finding the shortcut is easy enough with SE, if you're willing to do some manual work. Just find the backdoor that can be solved by eliminating the fewest candidates, then use "Get all hints" in SE to find out what it takes to eliminate those candidates. In this paricular puzzle, SE lists that chain as ER 9.0. In fact, SE shows exactly the same chain as "abi" found, which makes me wonder if this isn't the technique he uses... The odds for not only finding the same chain, but also describing it in the exact same way are rather small. Describing this chain like SE as a "region forcing chain" feels a bit odd to me, as the elimination also can be described as a basic forcing chain:
Code: Select all
*-----------------------------------------------------------------------------------------------*
 | 567       1         2         | 5689      5689      56789     | 3         579       4         |
 | 4         35679     5789      | 23569     1         235679    | 6789      579       5678      |
 | 3567      35679     5789      | 3569      3569      4         | 16789     1579      2         |
 |-------------------------------+-------------------------------+-------------------------------|
 | 127       279       4         | 12389     2389      12389     | 5         6         1378      |
 | 1256      2569      159       | 12345689  7         1235689   | 189       1349      138       |
 | 1567      8         3         | 14569     569       1569      | 2         1479      17        |
 |-------------------------------+-------------------------------+-------------------------------|
 | 8         4         15        | 7         23569     123569    | 16        1235      1356      |
 | 12357     2357      157       | 123568    4         123568    | 167       12357     9         |
 | 9         2357      6         | 1235      235       1235      | 4         8         1357      |
 *-----------------------------------------------------------------------------------------------*

r6c8 = 7r46c9 - 7r9c9 = 7r9c2 = 23r8c12 - 23r8c8 = 23r57c8 = 4r6c8

=> r6c8<>19

The reason why SE doesn't see the elimination like this is probably that it uses pairs and thus would rate as ER>10. Not worth that much IMO.

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Postby denis_berthier » Fri Aug 29, 2008 1:39 am

tarek wrote:Only 2 puzzles to date have managed to have a backdoor size 2 for the large set of techniques in gsf's solver arsenal ... these 2 happen to be in top tier of difficult puzzles.

What are these 2 puzzles (with their backdoors)?
As EM has backdoor size 1 for NRCZT, I conjectured that any puzzle has backdoor size 0 or 1 wrt NRCZT. I'd like to check this for these 2 puzzles.

Backdoor size has the advantage of depending explicitly on a specific set of rules.
But, backdoor size (for a given set of rules) is not enough for characterising the complexity of a puzzle, because:
- even with the smallest set of rules (singles), it takes only 3 values (for all known puzzles); this is too crude;
- it is unrelated to the way the backdoor itself can be proven.
Wrt the second problem, I think any measure based on backdoors should refer to strong backdoors - as defined at the bottom of this page http://forum.enjoysudoku.com/viewtopic.php?t=5600&postdays=0&postorder=asc&start=75.
denis_berthier
2010 Supporter
 
Posts: 4234
Joined: 19 June 2007
Location: Paris

Postby champagne » Fri Aug 29, 2008 2:39 am

RW wrote:Finding the shortcut is easy enough with SE, if you're willing to do some manual work. Just find the backdoor that can be solved by eliminating the fewest candidates, then use "Get all hints" in SE to find out what it takes to eliminate those candidates. In this paricular puzzle, SE lists that chain as ER 9.0. In fact, SE shows exactly the same chain as "abi" found, which makes me wonder if this isn't the technique he uses...

To be more accurate, this was coming from a cooperation between 'abi' and my solver two weeks ago. The original solution of 'abi' is very similar to your chain and I give it just after.

Meantime, my solver, whch is not looking specfically for 'back doors" recognized the clearing of 9r8c6 as having strong potential and gave the same solution. I just posted a brief description of the process used here

http://forum.enjoysudoku.com/viewtopic.php?t=5624



RW wrote:
Code: Select all
*-----------------------------------------------------------------------------------------------*
 | 567       1         2         | 5689      5689      56789     | 3         579       4         |
 | 4         35679     5789      | 23569     1         235679    | 6789      579       5678      |
 | 3567      35679     5789      | 3569      3569      4         | 16789     1579      2         |
 |-------------------------------+-------------------------------+-------------------------------|
 | 127       279       4         | 12389     2389      12389     | 5         6         1378      |
 | 1256      2569      159       | 12345689  7         1235689   | 189       1349      138       |
 | 1567      8         3         | 14569     569       1569      | 2         1479      17        |
 |-------------------------------+-------------------------------+-------------------------------|
 | 8         4         15        | 7         23569     123569    | 16        1235      1356      |
 | 12357     2357      157       | 123568    4         123568    | 167       12357     9         |
 | 9         2357      6         | 1235      235       1235      | 4         8         1357      |
 *-----------------------------------------------------------------------------------------------*

r6c8 = 7r46c9 - 7r9c9 = 7r9c2 = 23r8c12 - 23r8c8 = 23r57c8 = 4r6c8

=> r6c8<>19


'abi ' produced a chain close to yours, maybe still more difficult to include in a solver. Sorry, this is in chess mode, but I guess you can manage it.


Code: Select all
*7b9/23b9-23ab8/23h8-1579h1238/9h56-9g5/9c5-15c57/15c8-7c8/7b9 et fin.


I am sure that 'abi' is working without any computer, so I find very impressive what she can produce.

In case I would like to teach my solver to find such chains, I would have to learn it how to handle in an appropriate way the AHS/AC r8c2r9c2, what I can do easily I guess, but also what she did in column c8, and this is for the time being out of my field.

Champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

PreviousNext

Return to Advanced solving techniques