The real distribution of the W and SER ratings (reminder)In preparation of my next post, I recall here a few old results.
On page 29 of the pdf here (posts lost in the 2010 crash):
http://forum.enjoysudoku.com/rating-rules-puzzles-ordering-the-rules-t5995-143.html (Oct. 2009), I gave various results about the SER and the W classifications of minimal puzzles, based on the collection of 5,926,343 puzzles generated by the controlled-bias generator (for this generator, see the "real distribution" thread). (It used 279 full scans of the whole collection of complete grids.)
Part of these results is also available in:
- "Unbiased Statistics of a CSP - A Controlled-Bias Generator", International Joint Conferences on Computer, Information, Systems Sciences and Engineering (CISSE 09), December 4-12, 2009, Springer. Published as a chapter of the book Innovations in Computing Sciences and Software Engineering, Khaled Elleithy Editor, pp. 11-17, Springer, 2010, ISBN 97890481911133; pdf available on my website;
- chapter 6 of my book "Constraint Resolution Theories", Lulu.com Publishers (October 2011), ISBN 978-1-4478-6888-0;
- chapter 6 of my book "Pattern-Based Constraint Satisfaction and Logic Puzzles", Lulu.com Publishers (Novembre 2012), ISBN 978-1-291-20339-4 (pdf available here:
http://arxiv.org/abs/1304.1628).
Remember that, for any fixed number of clues, the controlled-bias generator, when it uses an integer number of full scans of the whole set of complete grids (as obtained by gsf's compressed collection), is unbiased. As a result, each row of the the tables in sections 1 and 2 below gives both the controlled-bias and the real values for the n-clue SER or W rating. Only the global mean values and standard deviations have to be computed differently (without or with the correction coefficients associated with the controlled-bias generator).
In this post, I use my updated notations: W replaces the old NRCZT name.
As whips (or any of the other chain patterns I introduced) are meaningful in any Constraint Satisfaction Problem (CSP), with or with out any grid structure, NRCZT was too Sudoku-specific (nrc refers to number, row, column).
The W rating of a puzzle P is the smallest n such that P can be solved by whips of maximum length n.
In the next post, I'll show how the dependency of the SER or W mean for a fixed number of clues wrt the number of clues can be extended, using puzzles generated by
Afmob for extreme (low or high) numbers of clues.
Notice that these extensions cannot have any impact on the global W or SER distributions recalled in section 3 (all numbers of clues confused), as they concern only an extremely low proportion of puzzles.
____________________________________________________________________
1) HOW THE SER REALLY DEPENDS ON THE NUMBER OF CLUES
____________________________________________________________________
In this section, I'm using only 1,380,962 puzzles (corresponding to 65 full scans of gsf's collection) of the total collection, because computing the SER is too long.
- Code: Select all
#clues #instances mean(SER) standard-deviation(SER)
19 0
20 0
21 41 3.56 (*) 2.01 (*)
22 1,526 3.15 2.16
23 25,884 3.35 2.24
24 163,694 3.61 2.36
25 422,451 3.96 2.47
26 467,047 4.40 2.54
27 234,963 4.93 2.53
28 57,615 5.47 2.44
29 7,243 6.07 2.19
30 481 6.76 1.71
31 16 5.79 (*) 2.34 (*)
32 1 7.3 (*) (*)
all 1,380,962
(*) values based on small samples are not meaningful.
Which gives:
- Code: Select all
controlled-bias mean(SER) = 4.29 controlled-bias standard-deviation(SER) = 2.48
real mean(SER) = 4.73 real standard-deviation(SER) = 2.49
correlation coefficient #clues vs SER = 0.19
The real correlation coefficient #clues vs SER (= 0.19) is a little higher than for the top-down generator (0.12) but it remains too small to allow any predictions of (SER) complexity given the number of clues.
____________________________________________________________________
2) HOW THE W RATING REALLY DEPENDS ON THE NUMBER OF CLUES
____________________________________________________________________
- Code: Select all
#clues #instances mean(W) standard-deviation(W)
19 0
20 2 1.95 (*) 1.05 (*)
21 164 1.52 (*) 0.93 (*)
22 6,651 1.64 1.12
23 110,103 1.73 1.18
24 704,089 1.86 1.25
25 1,814,413 2.04 1.32
26 2,002,349 2.27 1.38
27 1,007,700 2.55 1.42
28 247,259 2.85 1.42
29 31,449 3.16 1.37
30 2088 3.47 1.29
31 74 3.57 1.23 (*)
32 2 4.5 (*) 0.5 (*)
all 5,926,343
(*) values based on small samples are not meaningful.
Which gives:
- Code: Select all
controlled-bias mean(W) = 2.217 controlled-bias standard-deviation(W) = 1.35
real mean(W) = 2.449 real standard-deviation(W) = 1.39
correlation coefficient #clues vs W = 0.19
The real correlation coefficient #clues vs W (= 0.19) is a little higher than for the top-down generator (0.115) but it remains too small to allow any predictions of (W) complexity given the number of clues.
____________________________________________________________________
3) THE REAL W COMPLEXITY DISTRIBUTION OF MINIMAL PUZZLES
____________________________________________________________________
The 5,926,343 minimal puzzles produced by the controlled-bias generator give the following statistics for the real W complexity distribution of minimal puzzles.
It is also interesting to compare them with the complexity distribution of minimal puzzles produced by different types of generators.
- Code: Select all
W1_0 W1 W2 W3 W4 W5 W6 W7 W8 W9 W10 W11 W12 >W12
bottom-up 46.27 13.32 12.36 15.17 10.18 1.98 0.49 0.19 0.020 0.010 0 * 0.01 * 0 * 0 *
top-down 41.76 12.06 13.84 16.86 12.29 2.42 0.55 0.15 0.047 0.013 3.8e-3 1.5e-3 9.0e-4 2.0e-04 *
controlled-bias 35.08 9.82 13.05 20.03 17.37 3.56 0.79 0.21 0.055 0.015 4.4e-3 1.2e-3 3.2e-4 2.0e-04 *
real 29.17 8.44 12.61 22.26 21.39 4.67 1.07 0.29 0.072 0.020 5.5e-3 1.5e-3 3.4e-4 2.0e-04 *
_______________________________________________________________________________________________________________________________________
* values based on a small sub-sample are not reliable.
These distributions show very clearly the complexity bias of the three kinds of generators.
All these distributions have the same two modes, at W1_0 and at W3, as the real distribution.
It can be seen that when we move from bottom-up to top-down to controlled-bias to real, the mass of the distribution moves progressively to the right.
This displacement towards higher complexity occurs mainly at the first W-levels, after which it is only very slight.
In any cases:
- more than 98.5% of the puzzles can be solved with whips of maximal length 5;
- more than 99% of the puzzles can be solved with whips of maximal length 7;
- more than 99.9% of the puzzles can be solved with whips of maximal length 9;
- all the puzzles can be solved with whips.
A better presentation (with html tables) of these results (and much more) has been available on my website since that time
http://carva.org/denis.berthier/Classification/index.html (see section 3).
There, I also wrote, in substance, that there seems to be a (non absolute) barrier of complexity such that, when the number of clues (n) increases:
- the n-clue mean (SER or W) rating increases;
- the proportion of puzzles with (SER or W) rating away from the n-clue mean (SER or W) rating increases;
but
- the proportion of puzzles with (SER or W) rating far below the n-clue mean (SER or W) rating increases;
- the proportion of puzzles with (SER or W) rating far above the n-clue mean (SER or W) rating decreases.
Graphically, the n-clue (SER or W) rating distribution looks like a wave; when n increases, the wave moves to the right, with a longer left tail and a steeper right front.