Rating rules / Puzzles. Ordering the rules

Advanced methods and approaches for solving Sudoku puzzles

Postby PIsaacson » Thu Apr 23, 2009 9:05 pm

Denis,

I became involved in studying ALSs and Allan's set/link-set logic to the point that I neglected to complete my promised statistics for the sudogen0 collection. Mea culpa. I need to re-trace my steps and re-read this thread to get back in sync...

Cheers,
Paul
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby Allan Barker » Fri Apr 24, 2009 6:47 am

StrmCkr wrote:nice to see you back at it again denis

Ditto. Hi Denis, welcome back.

Allan
Allan Barker
 
Posts: 266
Joined: 20 February 2008

Postby denis_berthier » Fri Apr 24, 2009 8:49 am

Hi Paul, Allan, StrmCkr

Glad to see you're still here.

As for me, I've been (and still am) completely involved in other topics and out of the sudoku world for several months - except for the brief incursion in the above posts.

StrmCkr, as I still have no time for Sudoku, I think you should open a thread to discuss your new ideas instead of sending them to me alone ; you'd certainly get much more interesting comments.

Paul, if you complete your computations for whips/braids with ALS inserts on sudogen0, I'm still interested. Given my above results, it doesn't matter much if you do them with braids or whips. I'm not sure : does your solver have "pure" ALS-chains (with no z- or t- candidates) ? Does it have their supersymmetric counterpart (i.e. with AHS and super-hidden-ALS or "A-fish" - whatever you call them)?

Allan, as I'ven't been reading what has been published in the last few months, you may have already answered these old questions of mine : have you defined any measure of complexity for your patterns? If so, can your algorithm be tuned to provide resolution paths that grant the "simplest" solution according to this measure?

Indeed, I'm still interested in anything that deals with principled measures of complexity for well defined families of patterns and in any comparison between such measures - the topic of this thread. I'm aware of nothing of this sort for pure ALS-chains.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Postby ttt » Fri Apr 24, 2009 3:25 pm

Me too, nice to see you back after around half year:D

ttt
ttt
 
Posts: 185
Joined: 20 October 2006
Location: vietnam

Postby denis_berthier » Sun Apr 26, 2009 8:01 am

ttt wrote:Me too, nice to see you back after around half year:D

ttt


Hi, ttt, nice to see you.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Postby PIsaacson » Mon Apr 27, 2009 3:07 am

Denis,

I don't have braids with ALS/AHS group extensions, so I can only execute using nrczt chains/whips.

Your question on "pure" ALS chains raises a point that I've been considering lately from a programming perspective. Currently, I have a separate engine for ALS chaining and a separate engine for nrczt chaining. They both use a common ALS generator that builds the map of all possible ALS/AHS pairs within RC/RN/CN spaces, so: Yes, both engines have all the supersymmetrics available for use.

However, they build chains with ALSs using very different algorithms. In the nrczt engine, preference is (usually) given to linking 3d nrc candidates and attempting z/t promotion when possible. ALS/AHS nodes are (usually) only considered after exhausting all other possible nrc children of a given parent node. The ALS engine always attempts to chain ALS/AHS nodes directly via RCC peer linkage first, and then attempts to chain via a limited number of techniques such as using bi-value/local cells, ERs, GSLs, AURs and X-cycles to "bridge" RCCs from a parent ALS to a potential child ALS. At some point, I would like to have these two techniques "merge" into a super-engine.

I qualified the nrczt engine with the phrase "usually" because for these tests, I opted to have the nrczt engine select an ALS whenever possible instead of as a last resort. It made the results for short chains entirely different and came close to emulating the ALS engine restricted to linking no more than 2 or 3 ALSs.

So with that out of the way, here are the results from running nrczt-whips with ALS/AHS group extensions:

pNRCZT1_0: 4247
pNRCZT1: 1135
pNRCZT2: 3899
pNRCZT3: 525
pNRCZT4: 109
pNRCZT5: 45
pNRCZT6: 25
pNRCZT7: 6
pNRCZT8: 5
pNRCZT9: 3
pNRCZT10: 1

This demonstrates that ALSs lead to somewhat shorter solution paths. I believe this is simply because an ALS typically includes more weak link paths than a standard nrc candidate. Also, given the resolution power of braids and with the advent of whips, I am uncertain that ALS groups offer additional solutions.

Cheers,
Paul
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby denis_berthier » Mon Apr 27, 2009 7:41 am

PIsaacson wrote:This demonstrates that ALSs lead to somewhat shorter solution paths. I believe this is simply because an ALS typically includes more weak link paths than a standard nrc candidate.

Maybe, the length of a chain should be increased by 2/3/4 instead of 1 when a Pair/Triplet/Quad is inserted? That'd be consistent with the fact that Pairs are chains of length 2 and (most) Triplets/Quads are chains of length 3/4.

Anyway, I think it's worth giving a full example of a "zt promoted" naked, hidden or super-hidden subset within a whip.

PIsaacson wrote:Also, given the resolution power of braids and with the advent of whips, I am uncertain that ALS groups offer additional solutions.

It is interesting that nrczt-whips and supersymmetric ALS-chains have separately almost the same solving power (see my work with Mike, here http://forum.enjoysudoku.com/viewtopic.php?t=5591&postdays=0&postorder=asc&start=90 and next page) but their combination through the zt-ing principle (z/t promotion in your vocabulary) is much stronger, as shown by the tables at the end of this page : http://forum.enjoysudoku.com/viewtopic.php?t=6390&postdays=0&postorder=asc&start=30. And your super-engine would be able to solve very hard puzzles.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Rating rules / Puzzles. Ordering the rules

Postby denis_berthier » Tue Nov 30, 2010 3:47 pm

The missing pages of this thread, lost due to the May 2009 crash, are available on my website:
http://www.carva.org/denis.berthier/HLS/SPF/RRP/index.html

Note 1: at the time I made this backup, it was for my private use only (I couldn't imagine the previous managers of the forum didn't make backups) and I didn't care about compatibility with browsers other than Safari on my Mac. It is therefore in the webarchive format. But it can be read most easily on Windows by first downloading Safari for Windows. For other ways of reading them, see the "real distribution of minimal puzzles" thread in the "general" section.

Note 2: I put these old Forum files online, as a historical reference, but a synthesis of all this has been available for a long time on my usual web pages for those who are not interested in details: http://www.carva.org/denis.berthier/HLS/index.html
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Rating rules / Puzzles. Ordering the rules

Postby JasonLion » Sat Mar 26, 2011 12:43 pm

Thanks to denis_berthier we now have PDF files of posts made to this topic between June 2009 and December 2009.

Page 10
Page 11
Page 12
Page 13
Page 14
Page 15
Page 16
Page 17
Page 18
Page 19
Page 20
Page 21
Page 22
Page 23
Page 24
Page 25
Page 26
Page 27
Page 28
Page 29
Page 30
User avatar
JasonLion
2017 Supporter
 
Posts: 624
Joined: 25 October 2007
Location: Silver Spring, MD, USA

The real distribution of the W and SER ratings (reminder)

Postby denis_berthier » Mon Aug 12, 2013 5:00 am




The real distribution of the W and SER ratings (reminder)


In preparation of my next post, I recall here a few old results.

On page 29 of the pdf here (posts lost in the 2010 crash): http://forum.enjoysudoku.com/rating-rules-puzzles-ordering-the-rules-t5995-143.html (Oct. 2009), I gave various results about the SER and the W classifications of minimal puzzles, based on the collection of 5,926,343 puzzles generated by the controlled-bias generator (for this generator, see the "real distribution" thread). (It used 279 full scans of the whole collection of complete grids.)
Part of these results is also available in:
- "Unbiased Statistics of a CSP - A Controlled-Bias Generator", International Joint Conferences on Computer, Information, Systems Sciences and Engineering (CISSE 09), December 4-12, 2009, Springer. Published as a chapter of the book Innovations in Computing Sciences and Software Engineering, Khaled Elleithy Editor, pp. 11-17, Springer, 2010, ISBN 97890481911133; pdf available on my website;
- chapter 6 of my book "Constraint Resolution Theories", Lulu.com Publishers (October 2011), ISBN 978-1-4478-6888-0;
- chapter 6 of my book "Pattern-Based Constraint Satisfaction and Logic Puzzles", Lulu.com Publishers (Novembre 2012), ISBN 978-1-291-20339-4 (pdf available here: http://arxiv.org/abs/1304.1628).

Remember that, for any fixed number of clues, the controlled-bias generator, when it uses an integer number of full scans of the whole set of complete grids (as obtained by gsf's compressed collection), is unbiased. As a result, each row of the the tables in sections 1 and 2 below gives both the controlled-bias and the real values for the n-clue SER or W rating. Only the global mean values and standard deviations have to be computed differently (without or with the correction coefficients associated with the controlled-bias generator).

In this post, I use my updated notations: W replaces the old NRCZT name.
As whips (or any of the other chain patterns I introduced) are meaningful in any Constraint Satisfaction Problem (CSP), with or with out any grid structure, NRCZT was too Sudoku-specific (nrc refers to number, row, column).
The W rating of a puzzle P is the smallest n such that P can be solved by whips of maximum length n.

In the next post, I'll show how the dependency of the SER or W mean for a fixed number of clues wrt the number of clues can be extended, using puzzles generated by Afmob for extreme (low or high) numbers of clues.
Notice that these extensions cannot have any impact on the global W or SER distributions recalled in section 3 (all numbers of clues confused), as they concern only an extremely low proportion of puzzles.


____________________________________________________________________

1) HOW THE SER REALLY DEPENDS ON THE NUMBER OF CLUES
____________________________________________________________________

In this section, I'm using only 1,380,962 puzzles (corresponding to 65 full scans of gsf's collection) of the total collection, because computing the SER is too long.

Code: Select all
#clues  #instances       mean(SER)       standard-deviation(SER)
19      0                                   
20      0                                   
21      41               3.56 (*)        2.01 (*)
22      1,526            3.15            2.16
23      25,884           3.35            2.24
24      163,694          3.61            2.36
25      422,451          3.96            2.47
26      467,047          4.40            2.54
27      234,963          4.93            2.53
28      57,615           5.47            2.44
29      7,243            6.07            2.19
30      481              6.76            1.71
31      16               5.79 (*)        2.34 (*)
32      1                7.3 (*)         (*)
all     1,380,962                     

(*) values based on small samples are not meaningful.


Which gives:

Code: Select all
controlled-bias mean(SER) = 4.29    controlled-bias standard-deviation(SER) = 2.48

real mean(SER) = 4.73               real standard-deviation(SER) = 2.49

correlation coefficient #clues vs SER = 0.19


The real correlation coefficient #clues vs SER (= 0.19) is a little higher than for the top-down generator (0.12) but it remains too small to allow any predictions of (SER) complexity given the number of clues.



____________________________________________________________________

2) HOW THE W RATING REALLY DEPENDS ON THE NUMBER OF CLUES
____________________________________________________________________

Code: Select all
#clues  #instances       mean(W)         standard-deviation(W)
19      0                                   
20      2                1.95 (*)        1.05 (*)         
21      164              1.52 (*)        0.93 (*)
22      6,651            1.64            1.12
23      110,103          1.73            1.18
24      704,089          1.86            1.25
25      1,814,413        2.04            1.32
26      2,002,349        2.27            1.38
27      1,007,700        2.55            1.42
28      247,259          2.85            1.42
29      31,449           3.16            1.37
30      2088             3.47            1.29
31      74               3.57            1.23 (*)
32      2                4.5 (*)         0.5 (*)
all     5,926,343                     

(*) values based on small samples are not meaningful.


Which gives:
Code: Select all
controlled-bias mean(W) = 2.217     controlled-bias standard-deviation(W) = 1.35

real mean(W) = 2.449                real standard-deviation(W) = 1.39

correlation coefficient #clues vs W = 0.19

The real correlation coefficient #clues vs W (= 0.19) is a little higher than for the top-down generator (0.115) but it remains too small to allow any predictions of (W) complexity given the number of clues.



____________________________________________________________________

3) THE REAL W COMPLEXITY DISTRIBUTION OF MINIMAL PUZZLES
____________________________________________________________________


The 5,926,343 minimal puzzles produced by the controlled-bias generator give the following statistics for the real W complexity distribution of minimal puzzles.

It is also interesting to compare them with the complexity distribution of minimal puzzles produced by different types of generators.
Code: Select all

                   W1_0     W1      W2      W3      W4      W5     W6     W7     W8      W9      W10      W11      W12       >W12
bottom-up          46.27    13.32   12.36   15.17   10.18   1.98   0.49   0.19   0.020   0.010   0 *      0.01 *   0 *       0 *
top-down           41.76    12.06   13.84   16.86   12.29   2.42   0.55   0.15   0.047   0.013   3.8e-3   1.5e-3   9.0e-4    2.0e-04 *
controlled-bias    35.08    9.82    13.05   20.03   17.37   3.56   0.79   0.21   0.055   0.015   4.4e-3   1.2e-3   3.2e-4    2.0e-04 *
real               29.17    8.44    12.61   22.26   21.39   4.67   1.07   0.29   0.072   0.020   5.5e-3   1.5e-3   3.4e-4    2.0e-04 *
_______________________________________________________________________________________________________________________________________
* values based on a small sub-sample are not reliable.



These distributions show very clearly the complexity bias of the three kinds of generators.
All these distributions have the same two modes, at W1_0 and at W3, as the real distribution.
It can be seen that when we move from bottom-up to top-down to controlled-bias to real, the mass of the distribution moves progressively to the right.
This displacement towards higher complexity occurs mainly at the first W-levels, after which it is only very slight.

In any cases:
- more than 98.5% of the puzzles can be solved with whips of maximal length 5;
- more than 99% of the puzzles can be solved with whips of maximal length 7;
- more than 99.9% of the puzzles can be solved with whips of maximal length 9;
- all the puzzles can be solved with whips.


A better presentation (with html tables) of these results (and much more) has been available on my website since that time http://carva.org/denis.berthier/Classification/index.html (see section 3).
There, I also wrote, in substance, that there seems to be a (non absolute) barrier of complexity such that, when the number of clues (n) increases:
- the n-clue mean (SER or W) rating increases;
- the proportion of puzzles with (SER or W) rating away from the n-clue mean (SER or W) rating increases;
but
- the proportion of puzzles with (SER or W) rating far below the n-clue mean (SER or W) rating increases;
- the proportion of puzzles with (SER or W) rating far above the n-clue mean (SER or W) rating decreases.
Graphically, the n-clue (SER or W) rating distribution looks like a wave; when n increases, the wave moves to the right, with a longer left tail and a steeper right front.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

How the mean W and SER ratings vary for extreme numbers of c

Postby denis_berthier » Mon Aug 12, 2013 3:01 pm




How the mean W and SER ratings vary for extreme numbers of clues


As announced in the previous post, the above results can now be completed using puzzles generated by Afmob's fast implementation of Red Ed's subset/superset method *.
The status of the following results is different from the above: being computed from samples that are:
- correlated (due to the method),
- possibly biased (due to the number of complete grids used),
- relatively small.

In particular, the standard deviation is unknown.

Anyway, it is interesting to see how the trend that could be observed in the previous post seems to extend to the extreme values.

(*) For two sets of puzzles, the origin is different:
- the 17s are the whole collection of 49,712 17s (bias 0);
- the 38s are the whole collection of known 38s (unknown bias).


____________________________________________________________________

1) HOW THE MEAN SER VARIES WITH EXTREME NUMBERS OF CLUES
____________________________________________________________________


Code: Select all
#clues  #instances      mean(SER)       
17      49,152          2.60                   
18      814             2.64                 
19      59              2.59  (*)                 
20      1000            2.81                   
21      1001            2.95


31      1000            6.77     
32      1002            7.17     
33      1000            7.40     
34      91              7.44 (*)     

38      121038          7.98 (**)     
39      336             7.95       

(*) small sample, result not reliable
(**) ad hoc sample, result not reliable




____________________________________________________________________

2) HOW THE MEAN W RATING VARIES WITH EXTREME NUMBERS OF CLUES
____________________________________________________________________

Code: Select all
#clues  #instances      mean(W)       
17      49,152          1.30                   
18      814             1.41                 
19      59              1.45 (*)                   
20      1000            1.45                   
21      1001            1.50


31      1000            3.65     
32      1002            3.89     
33      1000            4.02     
34      91              4.49 (*)     

(*) small sample, result not reliable
Last edited by denis_berthier on Mon Aug 12, 2013 4:29 pm, edited 1 time in total.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Rating rules / Puzzles. Ordering the rules

Postby denis_berthier » Mon Aug 12, 2013 3:02 pm

[Reserved]
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Rating rules / Puzzles. Ordering the rules

Postby enxio27 » Mon Aug 12, 2013 3:43 pm

Denis, it's nice to see you back!

I'm following this thread with some interest, although on a much lower level than the rest of you. My main concern is getting a consistent rating of the puzzles in my (now vast) collection, with a view to working my way up the difficulty scale. I'm a pencil-and-paper solver, although for puzzles that (for me) require pencil marks, I let the software do the pencil marks for me (otherwise I make too many fun-killing mistakes before I ever start the puzzle itself). I am hoping to use y'all's findings in this thread to help me order the rules in my puzzle-rating software (currently Ruud's Sudocue) to give me a fairly accurate ranking of my puzzle collection.

I know this is something of a can of worms, but I'm curious as to the general consensus on which techniques would fall into the category of "guessing" vs. pure (human) logic.
User avatar
enxio27
 
Posts: 351
Joined: 13 November 2007

Re: Rating rules / Puzzles. Ordering the rules

Postby denis_berthier » Mon Aug 12, 2013 4:40 pm

Hi Enxio27,

enxio27 wrote:I am hoping to use y'all's findings in this thread to help me order the rules in my puzzle-rating software (currently Ruud's Sudocue) to give me a fairly accurate ranking of my puzzle collection.

I'm not sure this thread can be of any use to rate puzzles according to Sudocue ranking. But I've never used Sudocue (it runs only on Windows).

enxio27 wrote:I know this is something of a can of worms, but I'm curious as to the general consensus on which techniques would fall into the category of "guessing" vs. pure (human) logic.

A can of worms and quite OOT here. In 3.5 words: there's no consensus.

In my view, rating can't be a goal in and of itself and there can't be any unique rating system.
You rate puzzles with some view of how you should solve them.
If this view is pattern-based, then you rate them according to some set of resolution rules.
Even in this case, there are many possible ratings, all logically grounded (e.g. based on whips, g-whips, Subsets, g-Subets, ...).
In this thread, I've shown how some of them are (statistically) related.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Previous

Return to Advanced solving techniques