Distribution of clues in the grey zone
This question arose in another thread.
I haven't defined the grey zone very precisely. Depending on how I make it more precise (SER >= 9.0 or W >= 9), different calculations can be done but, as far as the number of clues is concerned, they don't lead to significantly different results.
If I consider the whole collection of 5,926,343 puzzles generated by the controlled-bias generator, 1258 have their W rating >= 9.
The raw distribution of clues for them is as follows:
nb-clues nb-instances %
19 0
20 0
21 0
22 0
23 22 1.7
24 106 8.4
25 306 24.3
26 415 33.0
27 288 22.9
28 102 8.1
29 17 1.4
30 2 0.2
31 0
32 0
33 0
34 0
35 0
mean= 25.97
standard-deviation= 1.20
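As a cross-check, the mean and standard deviation can be recomputed directly from the histogram. This is only a minimal sketch: the helper name `histogram_stats` is mine, and I am assuming the population form of the standard deviation (dividing by N), which at this sample size rounds to the same two decimals as the N-1 form.

```python
from math import sqrt

# Clue-count histogram for the 1258 puzzles with W >= 9, copied from the
# table above (clue counts with zero instances are omitted).
w9_hist = {23: 22, 24: 106, 25: 306, 26: 415, 27: 288, 28: 102, 29: 17, 30: 2}

def histogram_stats(hist):
    """Return (total, mean, population standard deviation) of a {value: count} histogram."""
    total = sum(hist.values())
    mean = sum(k * n for k, n in hist.items()) / total
    # Assumption: population variance (divide by N, not N - 1).
    variance = sum(n * (k - mean) ** 2 for k, n in hist.items()) / total
    return total, mean, sqrt(variance)

total, mean, sd = histogram_stats(w9_hist)
print(f"n={total} mean={mean:.2f} standard-deviation={sd:.2f}")
# prints: n=1258 mean=25.97 standard-deviation=1.20
```

The same function can be applied to the other histograms in this post by swapping in their counts.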
If I consider only the first 3,037,717 puzzles, for which I had computed the SER, 5615 have their SER >= 9.0. The raw distribution of clues for them is:
nb-clues nb-instances %
19 0
20 0
21 0
22 2 0.04
23 46 0.8
24 416 7.4
25 1319 23.5
26 1915 34.1
27 1380 24.6
28 440 7.83
29 90 1.6
30 7 0.1
31 0
32 0
33 0
34 0
35 0
mean= 26.05
standard-deviation= 1.15
For comparison, here again is the data for the whole cb-sample (see p.43 of the pdf in the "real distribution" thread):
nb-clues nb-instances %
20 2 3.37e-05
21 164 0.0027
22 6,651 0.1124
23 110,103 1.858
24 704,089 11.88
25 1,814,413 30.62
26 2,002,349 33.79
27 1,007,700 17.00
28 247,259 4.172
29 31,449 0.531
30 2,088 0.0352
31 74 0.00125
32 2 3.37e-05
mean= 25.67
standard-deviation= 1.12
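To make the comparison concrete, one can put the share of each clue count in the W >= 9 subset next to its share in the whole cb-sample. The following is only an illustrative sketch: the helper `shares` and the "ratio" column are my own choices, not something taken from the pdf, and the counts are simply copied from the tables above.

```python
# Counts copied from the tables above; clue counts with zero instances are omitted.
w9_hist = {23: 22, 24: 106, 25: 306, 26: 415, 27: 288, 28: 102, 29: 17, 30: 2}
cb_hist = {20: 2, 21: 164, 22: 6651, 23: 110103, 24: 704089, 25: 1814413,
           26: 2002349, 27: 1007700, 28: 247259, 29: 31449, 30: 2088,
           31: 74, 32: 2}

def shares(hist):
    """Convert a {clue count: instances} histogram into {clue count: fraction of total}."""
    total = sum(hist.values())
    return {k: n / total for k, n in hist.items()}

w9_share = shares(w9_hist)
cb_share = shares(cb_hist)

# A ratio above 1 means that clue count is over-represented among the
# W >= 9 puzzles relative to the whole cb-sample.
print("nb-clues  W>=9 %  whole %  ratio")
for k in sorted(w9_hist):
    ratio = w9_share[k] / cb_share[k]
    print(f"{k:8d}  {100 * w9_share[k]:6.2f}  {100 * cb_share[k]:7.2f}  {ratio:5.2f}")
```

Both histograms are raw controlled-bias counts, so the ratio only compares the subset with the sample it was drawn from.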