Bands and low-clue puzzles

Everything about Sudoku that doesn't fit in one of the other sections

Re: Bands and low-clue puzzles

champagne wrote:PS the number of UAs of any size with no subset is very big in a solution grid. I suspect several thousands in average, but may be somebody have a better approach of the count.

Here are some "average # per grid" estimates (with error estimates), from long ago.
The "(known) grids with 17s" list, was slightly smaller, at the time.

Code: Select all
`UA size |     all grids      |   grids with 17s--------+--------------------+-------------------    4   | 1.1580e+1 ( 0.04%) | 7.1160e+0 ( 0.03%)    6   | 1.5338e+1 ( 0.06%) | 1.4746e+1 ( 0.03%)    8   | 2.2038e+1 ( 0.07%) | 2.2646e+1 ( 0.04%)    9   | 6.0455e+0 ( 0.16%) | 7.2594e+0 ( 0.08%)   10   | 5.8380e+1 ( 0.08%) | 6.7124e+1 ( 0.04%)   11   | 2.9499e+1 ( 0.12%) | 3.8499e+1 ( 0.06%)   12   | 1.3667e+2 ( 0.09%) | 1.6819e+2 ( 0.05%)   13   | 1.0483e+2 ( 0.11%) | 1.3118e+2 ( 0.06%)   14   | 3.6172e+2 ( 0.09%) | 4.4636e+2 ( 0.05%)   15   | 3.9133e+2 ( 0.11%) | 5.0261e+2 ( 0.05%)   16   | 1.0498e+3 ( 0.10%) | 1.3534e+3 ( 0.05%)   17   | 1.4800e+3 ( 0.10%) | 1.9305e+3 ( 0.05%)   18   | 3.4658e+3 ( 0.10%) | 4.5559e+3 ( 0.05%)   19   | 5.5821e+3 ( 0.10%) | 7.4088e+3 ( 0.05%)   20   | 1.1778e+4 ( 0.10%) | 1.5759e+4 ( 0.05%)   21   | 2.0773e+4 ( 0.11%) | 2.8016e+4 ( 0.05%)   22   | 4.1781e+4 ( 0.11%) | 5.6895e+4 ( 0.06%)   23   | 7.7262e+4 ( 0.11%) | 1.0601e+5 ( 0.06%)   24   | 1.5191e+5 ( 0.12%) | 2.1048e+5 ( 0.06%)   25   | 2.8748e+5 ( 0.12%) | 4.0185e+5 ( 0.06%)   26   | 5.5831e+5 ( 0.13%) | 7.8740e+5 ( 0.07%)   27   | 1.0649e+6 ( 0.13%) | 1.5137e+6 ( 0.07%)   28   | 2.0469e+6 ( 0.14%) | 2.9377e+6 ( 0.07%)   29   | 3.8959e+6 ( 0.15%) | 5.6359e+6 ( 0.08%)   30   | 7.3881e+6 ( 0.16%) | 1.0791e+7 ( 0.08%)   31   | 1.3891e+7 ( 0.17%) | 2.0447e+7 ( 0.09%)   32   | 2.5872e+7 ( 0.18%) | 3.8398e+7 ( 0.09%)   33   | 4.7497e+7 ( 0.19%) | 7.1136e+7 ( 0.10%)   34   | 8.5945e+7 ( 0.21%) | 1.2982e+8 ( 0.11%)   35   | 1.5226e+8 ( 0.23%) | 2.3212e+8 ( 0.12%)   36   | 2.6388e+8 ( 0.25%) | 4.0553e+8 ( 0.13%)   37   | 4.4514e+8 ( 0.28%) | 6.9067e+8 ( 0.14%)   38   | 7.2814e+8 ( 0.32%) | 1.1396e+9 ( 0.16%)   39   | 1.1460e+9 ( 0.37%) | 1.8111e+9 ( 0.18%)   40   | 1.7355e+9 ( 0.43%) | 2.7683e+9 ( 0.21%)   41   | 2.5057e+9 ( 0.53%) | 4.0162e+9 ( 0.25%)   42   | 3.4120e+9 ( 0.68%) | 5.5462e+9 ( 0.32%)   43   | 4.3680e+9 ( 0.94%) | 7.1899e+9 ( 0.43%)   44   | 5.2410e+9 ( 1.36%) | 8.6178e+9 ( 0.62%)   45   | 5.8689e+9 ( 2.19%) | 9.8323e+9 ( 0.97%)   46   | 5.8592e+9 ( 3.86%) | 9.8494e+9 ( 1.70%)   47   | 5.2146e+9 ( 7.47%) | 8.7982e+9 ( 3.34%)   48   | 3.9547e+9 (17.54%) | 7.3022e+9 ( 7.37%)   49   | 3.3898e+9 (40.82%) | 6.0837e+9 (17.76%)`

The grids with 17's, have fewer UA4's and UA6's than average.
The situation is reversed, though, for UA8's and larger ... through size 49, at least.
blue

Posts: 700
Joined: 11 March 2013

Re: Bands and low-clue puzzles

Hi blue,

nice to see you back. Your data are far above my expectations, although I was sure to be on the safe side with "thousands".

Just a question

Here are some "average # per grid" estimates (with error estimates), from long ago.
The "(known) grids with 17s" list, was slightly smaller, at the time.

Did you in that task filter the UAs having a subset? Seen the huge number of UAs of size 49, I suspect that this is without any filter.
champagne
2017 Supporter

Posts: 6684
Joined: 02 August 2007
Location: France Brittany

Re: Bands and low-clue puzzles

Hi champagne,

Did you in that task filter the UAs having a subset? Seen the huge number of UAs of size 49, I suspect that this is without any filter.

Yes I did, if by that you mean that it only counted "minimal UA's": UA's having no proper subset that is also UA.
I remember being amazed by the numbers, too.

The code used a method that Red Ed invented, for estimating the average #'s of minimal puzzles of a given size.
It worked like this:
1. Choose a random grid and a size S subset of that grid.
2. Find all minimal UA's in that subset, and tally the results by size.
3. Do (1) and (2), some large number of times ... N.
4. Divide the final counts by N, and then multiply them by a size-dependent scale factor, (1 / P), where P is the probablity that a (UA) set of that size would be a subset of a randomly chosen subgrid of size S.
For the "all grids" column, N was 10,320,000 and S was 55.
For the size 49 entry, only 28 UA's of that size, were actually found.
For sizes 45 & 46, where the peak is, it was 13598 and 3771 sets found, respectively.
blue

Posts: 700
Joined: 11 March 2013

Re: Bands and low-clue puzzles

champagne wrote:
coloin wrote:Indeed ... but I am thinking more like if there are more UAs - then there is more chance of a 17 [paradoxically]

I have no evidence of that...

I see no paradox. Directly from definition of UA, it is the difference between the grid in question and another valid grid. So, every solution grid has exactly 6,670,903,752,021,072,936,959 minimal + non-minimal UA sets. Each small-sized UA makes many larger ones non-minimal, being a subset of them. Assuming the majority of 17-clue puzzles are already discovered it was proven that their solution grids trend to have less number of UA4 and UA6 (click). Ignoring the higher order effects (how large UA sets invalidate larger ones) we can assume that the grids with small number of short UA sets have large total number of minimal UA sets.

So far the bands don't help in discovery of new 17s.

A different approach is to partition grids in other dimension, checking whether N-digit UA sets regardless of size can help. Entire set of 2, 3, and 4-digit UA sets can be generated analysing N-digit templates, see http://forum.enjoysudoku.com/post211112.html#p211112 which suggests 2648603 subgrids for 4-digit UA analysis and eventual remapping to 119503485 4-templates. Statistics for 17 vs all solution grids can be collected even w/o UA generation, on the rookery/template basis.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

Re: Bands and low-clue puzzles

Hi Blue,
blue wrote:Here are some "average # per grid" estimates (with error estimates), from long ago.

Could you do another run, this time counting the number of different values involved in the UA and respectively adding 8 columns with counters for 2 to 9 values respectively.
Of practical interest are UA with up to 20 cells and their distribution for up to 5 different values, so you can save time by generating less grids/UA than in previous exercise.
I think an exhaustive catalogue with 2 to 4-values UA is achievable and for particular grid all UA from the catalogue could be found fairly fast.
Another option is somehow to generate partial catalogue for short UA with 5 values.
This way testing a grid for 17-given puzzle could start with few thousands of UA for hitting, all obtained in fast and deterministic way.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

Re: Bands and low-clue puzzles

dobrichev wrote:Hi Blue,
blue wrote:Here are some "average # per grid" estimates (with error estimates), from long ago.

Could you do another run, this time counting the number of different values involved in the UA and respectively adding 8 columns with counters for 2 to 9 values respectively.
Of practical interest are UA with up to 20 cells and their distribution for up to 5 different values, so you can save time by generating less grids/UA than in previous exercise.
I think an exhaustive catalogue with 2 to 4-values UA is achievable and for particular grid all UA from the catalogue could be found fairly fast.
Another option is somehow to generate partial catalogue for short UA with 5 values.
This way testing a grid for 17-given puzzle could start with few thousands of UA for hitting, all obtained in fast and deterministic way.

I see that we are looking for similar new ways to search 17s.

I assume that "values" means digits 1 to 9.

Looking at the table prepared by blue, I was thinking of a cut off at the size 18, pushing to six possible digits. I am drafting a special brute force with such limitations to see if it is possible in some milliseconds (accepting if necessary missing UAs and/or not seen subsets).

Next is how to do the search. Here again, I have something in mind to try, using the bands properties.
champagne
2017 Supporter

Posts: 6684
Joined: 02 August 2007
Location: France Brittany

Re: Bands and low-clue puzzles

Yes, values == digits from 1 to 9.
I am waiting for good news from you when ready.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

Re: Bands and low-clue puzzles

dobrichev wrote:Yes, values == digits from 1 to 9.
I am waiting for good news from you when ready.

May be some indications after the first tests.

Finding the full mc Garry set of UAs (<=12 cells) takes about 15 ms in my test (~280 UAs for each of the 2 puzzles)

Findings all "up to 4 digits" UAs takes about 7 ms for the same puzzles and gives around 700 UAs of size <= 18.

Looking for "5 digits" UAs is already much more expensive with the process I tested (more than 100 ms) and, compared to the Mc Garry set, we are still missing some 6 digits UAs of size 12. Depending on the shortcuts used, the number of UAs was in the range 1200_3500.

The test I made earlier did not pass the 3 digits UAs in addition to the Mc Garry set, and the total number of UAs was in the range 500-768 UAs. This was enough to influence significantly the process.

Improving the process for 5 digits can be done. Contrary to my previous statement, I think that the 6 digits analysis is of null interest. Using the Mc Garyy generation for the size 12 will be better. (normally, the 4 digits produces all UAs of smaller size).

The next test will be done using the 4 digits generation as a basis and adding the McGarry missing Uas of size >10
champagne
2017 Supporter

Posts: 6684
Joined: 02 August 2007
Location: France Brittany

Re: Bands and low-clue puzzles

dobrichev wrote:Hi Blue,
blue wrote:Here are some "average # per grid" estimates (with error estimates), from long ago.

Could you do another run, this time counting the number of different values involved in the UA and respectively adding 8 columns with counters for 2 to 9 values respectively.
Of practical interest are UA with up to 20 cells and their distribution for up to 5 different values, so you can save time by generating less grids/UA than in previous exercise.

Here's the output from a short run.
The numbers aren't very precise, but they're probably OK for your interests (?)
[ The % numbers are +/- 95% confidence intervals. ]

Code: Select all
`sz    total               2d              3d              4d              5d        ------------------------------------------------------------------------------------ 4    11.58( 0.02%)    11.58( 0.02%) 6    15.34( 0.04%)     6.72( 0.06%)    8.62( 0.05%) 8    22.03( 0.09%)     4.07( 0.20%)    9.11( 0.14%)    8.85( 0.14%) 9     6.06( 0.28%)                     2.38( 0.44%)    3.68( 0.36%)10    58.32( 0.16%)     3.52( 0.58%)   11.34( 0.34%)   26.86( 0.23%)   16.61( 0.29%)11    29.50( 0.36%)                     4.19( 0.94%)   13.80( 0.53%)   11.51( 0.57%)12   136.74( 0.30%)     3.65( 1.67%)   13.18( 0.91%)   44.82( 0.50%)   55.22( 0.46%)13   104.85( 0.57%)                     7.44( 2.11%)   31.69( 1.03%)   45.33( 0.87%)14   362.55( 0.55%)     5.99( 4.04%)   18.15( 2.37%)   78.87( 1.15%)  138.83( 0.88%)15   387.98( 0.94%)                    12.63( 5.16%)   75.79( 2.11%)  153.75( 1.48%)16  1048.25( 1.05%)                    21.87( 7.02%)  144.45( 2.77%)  362.92( 1.75%)17  1483.89( 1.64%)                    16.89(14.94%)  163.34( 4.81%)  505.76( 2.77%)18  3417.05( 2.03%)    18.45(26.35%)   42.61(18.04%)  263.03( 7.17%)  958.85( 3.76%)19  5596.17( 3.05%)                    27.35(41.72%)  293.42(12.92%) 1380.08( 6.09%)20 12010.91( 4.13%)                    81.90(52.81%)  525.15(19.61%) 2168.03( 9.62%)------------------------------------------------------------------------------------sz       6d              7d              8d              9d------------------------------------------------------------------- 4   6   8   9  10  11  12    19.88( 0.76%)13    20.37( 1.29%)14    94.31( 1.07%)   26.42( 2.02%)15   115.44( 1.72%)   30.37( 3.34%)16   355.64( 1.79%)  145.91( 2.80%)   17.46( 7.94%)17   536.62( 2.68%)  230.42( 4.11%)   30.86(10.99%)18  1248.05( 3.33%)  717.30( 4.37%)  152.32( 9.46%)   16.44(27.90%)19  2198.19( 4.84%) 1351.49( 6.08%)  329.48(12.79%)   16.16(54.22%)20  4634.78( 6.51%) 3550.76( 7.47%)  929.85(14.25%)  120.45(40.72%)-------------------------------------------------------------------`
blue

Posts: 700
Joined: 11 March 2013

Re: Bands and low-clue puzzles

blue wrote:The numbers aren't very precise, but they're probably OK for your interests (?)

Hi Blue,

I let mladen answer for its own interest. On my side, the trade off is the following :

In the search of low clues puzzles in a grid, you suffer a high penalty when you get short in UAs. (no more UA to hit in your table);

Adding UAs at the start and processing them in the same way as Mc Garry processed UAs of higher degree reduces the count for such situations.

So the target is not to add all UAs of size <= "x" but to select UAs not to costly to add having at the end a positive effect on the process.

From the test I run, I am thinking of a relevant table in the range 1000-2000 UAs
champagne
2017 Supporter

Posts: 6684
Joined: 02 August 2007
Location: France Brittany

Re: Bands and low-clue puzzles

Interesting ....
Is there value in distinguishing UAs which envolve all 9 boxes
depending on methods .... UAs whch just envolve 6 boxes / 2 bands might be less relevant ...
coloin

Posts: 1738
Joined: 05 May 2005

Re: Bands and low-clue puzzles

coloin wrote:Interesting ....
depending on methods .... UAs whch just envolve 6 boxes / 2 bands might be less relevant ...

Hi coloin,

depending on methods is the right word.
I start from scratch (but with many pieces of code) a new test on that solution grid

Code: Select all
`123456789456789123789123456214368597867945231935217864348672915591834672672591348......7.........23.891.....2..........79......3.....64..8...91.....34............ existing 17`

I intend to do the following

solve bands 1+2
then solve stacks 1+2

then ??? depending on the results, but possibly
solve band 3
solve stack 3
solve the full grid.

here, the band 3 is the band with a possibility of 2 clues, but we know that for a 17, the minimum is 3 clues,
so the solution for bands 1+2 can not have more than 17-3= 14 clues.
Generally speaking, the maximum is in the range 17-3=14 to 17-6=11 (morphing the solution grid, we can always find a 4 clues minimum)

Applying Mc Garry blue prints, I find 277 UAs of size <= 12.
adding all 2-4 digits UAs of size <=18 and 5 digits UAs of size <=18 for bands 12, I find 879 UAs with the following split

Code: Select all
`        Mc Garry <=12  prepared setBands  12     53        139   stacks 12     42         67  (not in bands 12)band  3       18         18  (not in bands 12 "could use the full list")stack 3       11         11  (id)others       153        644   box9 not in band/stack 3             277        879    `

That task was done in 13 milliseconds, giving room for more UAs if necessary (and for optimization of the solution grid through morphing)
In that case, the deeper search is done for bands 1-2
And as you can see, most of the UAs have cells in box 9 + {boxes 1245}, but this is likely specific to that grid
champagne
2017 Supporter

Posts: 6684
Joined: 02 August 2007
Location: France Brittany

Re: Bands and low-clue puzzles

champagne wrote:That task was done in 13 milliseconds,]

Sounds promising
However ... I realized that my comment on a UA envolving 9 boxes , trivially needs minimum 18 clues ...and these are 2-clue UAs
Maybe UAs from a specific 6 box patterns would focus things eg B124689 x6 ?

I dont think there can be a UA which solely comes from B12459 !!!!

C
coloin

Posts: 1738
Joined: 05 May 2005

Re: Bands and low-clue puzzles

Hi coloin,

some data that could help for your topic
champagne wrote:And as you can see, most of the UAs have cells in box 9 + {boxes 1245}, but this is likely specific to that grid

This does not seem correct. I made a quick test to see if morphing the puzzle I could reduce significantly the count, Nothing came. The reason is likely (partially) in the table here below where you find the distribution of the UAs of my test solution grid per number of boxes hit.

Code: Select all
`2   193   934   1115   1996   2157   1598   489   35`

I made a second analysis summarized in that table

Code: Select all
`0   436   217   6531   457   203   6602   483   187   6703   523   157   6804   519   155   6745   561   132   6936   524   104   6287   545   86   6318   571   73   644`

Here, the last item is the split of the 644 "others" in a better way

first column is the box index (assuming the grid morphed to have it as box9)
the second item is the count of UAs of that lot hitting box9
the third column is the count of UAs hitting both band 3 and stack 3, but not box9

The last column is just the sum of columns 2 and 3, giving the expected count for the lot previously called "others".

Morphing would have a small effect on that side, but could be of interest on 2 points:

Having a higher minimum of clues for band 3
Having a better lot of UAs to solve bands 1+2 (still to investigate)
champagne
2017 Supporter

Posts: 6684
Joined: 02 August 2007
Location: France Brittany

Re: Bands and low-clue puzzles

blue wrote:Here's the output from a short run.

Thank you, Blue!
If you choose to generate UA of size up to M cells and N different digits, then on average grid you will work with the given below number of UA sets.
The percentage is the obtained UA compared to all UA of size below or equal to the given number of cells. I.e. the difference comes from the UA sets of same or less size but with more digits involved.
Code: Select all
`cells\digits Total   <=2       <=3        <=4        <=5          <=6          <=7         <=8<=4          12    12 (100%)<=6          27    18 (68%)  27 (100%)<=8          49    22 (46%)  40 (82%)   49 (100%)<=9          55    22 (41%)  42 (77%)   55 (100%)<=10        113    26 (23%)  57 (51%)   97 (85%)   113 (100%)<=11        143    26 (18%)  62 (43%)  115 (80%)   143 (100%)<=12        280    30 (11%)  78 (28%)  176 (63%)   260 (93%)    280 (100%)<=13        384    30 (8%)   86 (22%)  216 (56%)   344 (90%)    384 (100%)<=14        747    36 (5%)  110 (15%)  319 (43%)   586 (78%)    721 (96%)    747 (100%)<=15       1135    36 (3%)  123 (11%)  407 (36%)   828 (73%)   1078 (95%)   1135 (100%)<=16       2183    36 (2%)  144 (7%)   573 (26%)  1357 (62%)   1963 (90%)   2166 (99%)  2183 (100%)<=17       3667    36 (1%)  161 (4%)   753 (21%)  2043 (56%)   3186 (87%)   3619 (99%)  3667 (100%)<=18       7084    54 (1%)  222 (3%)  1078 (15%)  3326 (47%)   5717 (81%)   6867 (97%)  7068 (100%)<=19      12680    54 (0%)  250 (2%)  1398 (11%)  5027 (40%)   9616 (76%)  12118 (96%) 12648 (100%)<=20      24691    54 (0%)  332 (1%)  2005 (8%)   7802 (32%)  17026 (69%)  23078 (93%) 24538 (99%)`

To be closer to the reality in the context of low-clue puzzle generation, one should weight the UA depending of their size. For example, working with 1078 UA having cells <=18 and digits <= 4, you work with only 15% of all UA of size <= 18, but these 15% include significant part of the most valuable short-sized UA.

Thinking about earlier dead-end branch identification, the UA of size 4 and 6 are hit with the first few clues and are out of interest.(Is this true for an average grid or the number of exceptions is ~100% ?)
To identify a dead-end branch when placing 13th clue in 17s generation, you must check whether it hits all alive UA composites consisting of 5 disjoint UA. In order to have chance not to be hit with the first 13 clues, their size must be at most 81-13=68 cells, or they must be assembled from 5 UA of average size of 13.6 cells. So, in such a composite, all UA of size >= 14 must be compensated by one or more shorter UA of size 8 to 13.
For 14th clue the average UA in the largest composite is 67/4 = 16.75 cells; for 15th and 16th they are 22 and 32.5 cells respectively.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

PreviousNext