Low/Hi Clue Thresholds

Postby blue » Sat Aug 31, 2019 1:51 pm

coloin wrote: I'm uploading my next batches soon ... and we will see if we are just generating puzzles in grids with a plethora of 19C ... hope not!
If this is so, and there are many grids out there with a paucity of 19C, then maybe we need to somehow change our method of attack. As ever, the sudoku space doesn't make it easy.

Here are the 19C counts for 99 random grids.
(The 100th grid was an outlier, with 20207 19C puzzles)

                             below average (72) | (27) above avg.
------------------------------------------------+-----------------
   53   233   369   573   702   904  1264  1593 | 2237  3781  6102
   69   235   456   578   716   959  1270  1599 | 2296  4012  6133
   82   237   479   588   718   994  1319  1608 | 2567  4110  6521
  105   310   480   607   721  1008  1359  1633 | 2632  4364  6803
  122   314   485   613   758  1036  1413  1659 | 2666  4532  6954
  124   342   488   613   768  1076  1429  1815 | 2735  5164  7156
  142   354   496   634   813  1145  1446  1903 | 2977  5318  7339
  177   361   505   655   835  1146  1463  1951 | 3250  5812  8782
  213   366   543   671   892  1228  1578  2170 | 3417  6078  9409

@Mathimagics: After the next update, please PM a list of 100 random grids without a known 19C (or smaller).

Postby Mathimagics » Sat Aug 31, 2019 3:39 pm

Done!

Postby blue » Sat Aug 31, 2019 6:18 pm

The 19C counts for the 100 grids from Mathimagics (grids without a previously known 19C or smaller) are:

    32   145   252   405   501   619   845  1035  1214  2007  2961  9907
    94   163   291   405   515   666   872  1060  1296  2069  2995
   106   170   303   412   528   678   881  1066  1330  2084  3664
   109   184   306   418   529   704   902  1078  1385  2172  3815
   114   192   321   434   553   751   926  1089  1400  2300  4044
   116   194   340   442   553   761   946  1127  1565  2313  4393
   128   196   362   447   554   775   950  1132  1751  2402  4678
   134   231   387   451   582   783  1001  1159  1875  2426  5971
   143   251   401   493   603   824  1002  1183  1964  2460  8077

The average is down to 1238 per grid -- to be compared with Afmob's result of ~2200 per grid across all grids.
There's still plenty of low hanging fruit, though.

Interestingly(?), there's still a 73:27 ratio between grids that are below-vs-above the current average.
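
The average and the below/above split can be rechecked directly from the posted counts. The following is only a minimal sketch: the names `COUNTS_19C`, `sum_counts` and `count_below_mean` are made up for illustration and are not part of blue's tools.

```c
#include <assert.h>
#include <stddef.h>

/* 19C counts per grid, copied row by row from the table above. */
const int COUNTS_19C[] = {
     32,  145,  252,  405,  501,  619,  845, 1035, 1214, 2007, 2961, 9907,
     94,  163,  291,  405,  515,  666,  872, 1060, 1296, 2069, 2995,
    106,  170,  303,  412,  528,  678,  881, 1066, 1330, 2084, 3664,
    109,  184,  306,  418,  529,  704,  902, 1078, 1385, 2172, 3815,
    114,  192,  321,  434,  553,  751,  926, 1089, 1400, 2300, 4044,
    116,  194,  340,  442,  553,  761,  946, 1127, 1565, 2313, 4393,
    128,  196,  362,  447,  554,  775,  950, 1132, 1751, 2402, 4678,
    134,  231,  387,  451,  582,  783, 1001, 1159, 1875, 2426, 5971,
    143,  251,  401,  493,  603,  824, 1002, 1183, 1964, 2460, 8077
};
#define N_COUNTS (sizeof COUNTS_19C / sizeof COUNTS_19C[0])

/* Sum of all counts (hypothetical helper). */
long sum_counts(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Number of grids whose count is strictly below the mean.
   Compares v[i] * n against the sum so everything stays in integers. */
size_t count_below_mean(const int *v, size_t n)
{
    assert(n > 0);
    long total = sum_counts(v, n);
    size_t below = 0;
    for (size_t i = 0; i < n; i++)
        if ((long)v[i] * (long)n < total)
            below++;
    return below;
}
```

Calling `sum_counts(COUNTS_19C, N_COUNTS)` and dividing by `N_COUNTS` gives the per-grid average, and `count_below_mean(COUNTS_19C, N_COUNTS)` gives the size of the below-average group.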

Postby coloin » Sat Aug 31, 2019 8:43 pm

Well ... great analysis ... and so 1% have fewer than 30 19Cs ... and we are less likely to get these, of course.
I suppose we need to see the same analysis on the grids where a 19C has been found? [am worried now]

We won't easily find the grids with only one 19C ... and the 250k grids without 19Cs will be in the mix too.
The 1 in 7 grids which have an 18C will be in the top half of the distribution, presumably making it more skewed ...

Postby blue » Sat Aug 31, 2019 10:02 pm

coloin wrote: we are less likely to get these, of course.
I suppose we need to see the same analysis on the grids where a 19C has been found? [am worried now]

If Mathimagics sends another 100, I'll give them a try.
I won't be able to get to them for a few hours, and they'll probably take a lot(?) longer to count. [I'm worried too!]
I think there will be a fair number of low count grids in the mix, though.

Postby blue » Sun Sep 01, 2019 1:19 pm

Counts for a random 100 of the grids known to have a 19C:

   51   486   773  1120  1486  1936  2526  3085  3623  4779  8688 22640
   81   518   799  1125  1601  1991  2573  3097  3650  5231  8840
  286   529   823  1178  1648  2009  2618  3111  3962  5243  9497
  328   553   862  1211  1660  2053  2649  3127  4379  5647 10356
  345   577   901  1240  1697  2054  2684  3161  4455  6044 10548
  370   689   986  1273  1715  2154  2692  3262  4459  6057 10942
  393   705  1001  1304  1719  2155  2726  3403  4537  6996 12503
  439   723  1088  1352  1730  2175  2799  3472  4687  8451 13199
  441   744  1100  1391  1848  2200  2894  3551  4759  8561 13918

The average is 3317.

blue wrote: I think there will be a fair number of low count grids in the mix

Fewer than I expected, but not too bad.

Postby coloin » Sun Sep 01, 2019 2:00 pm

Excellent!
blue wrote: Fewer than I expected, but not too bad.

Indeed, but we still have what we suspected and feared ...

I'm no expert in stats ... but the median tells the same story:
median no. of 19C in random grids                             1036
median no. of 19C in grids in which we have found a 19C       2104
median no. of 19C in grids not yet determined                  756

That last figure will head downwards the more plethoric grids we find, and we will continue to find these preferentially ...

We won't have this problem with the 18C though, as the 18C are more likely to be found in the grids with more 19C ...
... they will have at least 63 non-minimal 19C (one for each empty cell of the 18C), plus a good few minimal {-1+2} within the same grid ...
... maybe that's why we found so many 18C when we did the {-2+1} on the found 19Cs
... perhaps it's a good idea to save all the 19C generated [I have] and perform a non-minimality check on them all when the time comes ... if it comes

LCT-19 Progress

Postby Mathimagics » Wed Sep 04, 2019 9:10 am

At the end of week 3, we have:
 known 17-19C:   144,490,098
 added    19C:   975,263,602  Week 1
                 917,301,757  Week 2
                 451,939,607  Week 3
               =============
               2,488,995,064


New grid yields for the morph workers are currently about 25%.

LCT-19 Progress

Postby Mathimagics » Wed Sep 11, 2019 2:20 pm

At the end of week 4, we have:
 known 17-19C:   144,490,098
 added    19C:   975,263,602  Week 1
                 917,301,757  Week 2
                 451,939,607  Week 3
                 507,056,483  Week 4
               =============
               2,996,051,547

New grid yields are currently 23%.

Postby Mathimagics » Wed Sep 18, 2019 2:39 pm

At the end of week 5, we have:
 known 17-19C:   144,490,098
 added    19C:   975,263,602  Week 1
                 917,301,757  Week 2
                 451,939,607  Week 3
                 507,056,483  Week 4
                 358,256,360  Week 5
               =============
               3,354,307,907


New grid yields are now 14-16%. I think that after one more week it will be time to shut down the morph workers and switch to explicit grid testing using blue's Find19C function.

Postby Mathimagics » Fri Sep 20, 2019 4:10 pm

Mathimagics wrote:New grid yields are now 14-16%. I think that after one more week it will be time to shut down the morph workers and switch to explicit grid testing using blue's Find19C function.


blue has produced an improved version of Gen19C, which optimises the {-1,+1} morph processing, and thereby increases grid generation by roughly 4 times (!!).

This means that the morph workers have a considerably extended shelf life, and will be worth running for some time to come. For coloin and 1to9only, here is a new 64-bit version of Gen19C.exe.

<zip file removed> (obsolete)
Last edited by Mathimagics on Mon Oct 21, 2019 12:48 am, edited 1 time in total.

Postby 1to9only » Mon Sep 23, 2019 11:46 am

When the previous Worker19C.c was provided, I made a couple of (private) changes:

This one to speed up (I think!) batch making:
      if ((NMT+np) >= MAXC) {NoMore = 1; np = MAXC-1; return;}  // failsafe

This one for less frequent screen updates, and a rough indication of progress towards batch completion:
      if ((rn & 0x03ff) == 0) {printf("  at item %10d (%d%%)\r", rn, (NMT+np)/129366); fflush(stdout);}
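
The mask test in that line is true only when the low 10 bits of `rn` are zero, so the status line is refreshed once every 1024 items. A minimal sketch of just that throttling idea follows; `should_report` and `reports_for` are hypothetical names, and nothing here is taken from the real Worker19C.c beyond the mask itself.

```c
#include <assert.h>

/* Hypothetical helper: true when rn is a multiple of 1024,
   i.e. when its low 10 bits (mask 0x03ff == 1023) are all zero. */
int should_report(unsigned int rn)
{
    return (rn & 0x03ffu) == 0;
}

/* Counts how many status lines a loop over items 0..n-1 would print. */
unsigned int reports_for(unsigned int n)
{
    unsigned int shown = 0;
    for (unsigned int rn = 0; rn < n; rn++)
        if (should_report(rn))
            shown++;
    /* Cross-check against the closed form ceil(n / 1024). */
    assert(shown == (n + 1023u) / 1024u);
    return shown;
}
```

A power-of-two interval lets the check be a single AND instead of a division, which is presumably why a mask was chosen over `rn % 1000`.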

LCT-19 Progress

Postby Mathimagics » Wed Sep 25, 2019 4:30 pm

At the end of week 6, we have:
 known 17-19C:   144,490,098
 added    19C:   975,263,602  Week 1
                 917,301,757  Week 2
                 451,939,607  Week 3
                 507,056,483  Week 4
                 358,256,360  Week 5
                 543,137,278  Week 6
               =============
               3,897,445,185 


Yields are down to 12%, but batch production is up 350% (nice job, blue!), and so this week's new-grid count actually went up ...

Thanks to coloin and 1to9only for grinding out their batches ... keep it up, guys! 8-)

LCT-19 Progress

Postby Mathimagics » Wed Oct 02, 2019 8:07 am

Another week, another progress report! 8-)

At the end of week 7, we have:
 known 17-19C:   144,490,098
 added    19C:   975,263,602  Week 1
                 917,301,757  Week 2
                 451,939,607  Week 3
                 507,056,483  Week 4
                 358,256,360  Week 5
                 543,137,278  Week 6
                 467,361,125  Week 7
               =============
               4,364,806,310  Total resolved
               1,107,924,228  Unresolved
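
The running totals in these reports are plain sums, and resolved plus unresolved comes to 5,472,730,538, the number of essentially different Sudoku grids. A minimal sketch for rechecking the arithmetic is below; the identifiers are invented for illustration, and 64-bit integers are used because the totals exceed the 32-bit range.

```c
#include <assert.h>
#include <stddef.h>

/* New 19C grids added in weeks 1..7, copied from the progress reports. */
const long long WEEKLY_ADDED[] = {
    975263602LL, 917301757LL, 451939607LL, 507056483LL,
    358256360LL, 543137278LL, 467361125LL
};
#define N_WEEKS (sizeof WEEKLY_ADDED / sizeof WEEKLY_ADDED[0])

#define KNOWN_17_19C 144490098LL    /* grids with a known 17-19C at the start */
#define UNRESOLVED   1107924228LL   /* grids still unresolved after week 7 */

/* Total resolved after the first `weeks` weekly updates (hypothetical helper). */
long long total_resolved(long long known, const long long *added, size_t weeks)
{
    assert(known >= 0);
    long long total = known;
    for (size_t i = 0; i < weeks; i++)
        total += added[i];
    return total;
}
```

For example, `total_resolved(KNOWN_17_19C, WEEKLY_ADDED, 3)` reproduces the week 3 total, and passing `N_WEEKS` gives the week 7 figure.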


The following table lists the last 12 bulk update runs (the past two weeks). Each update corresponds to 128 batch files produced by Gen19C workers at the main data mining sites (myself, coloin, 1to9only). "Grids" is the total ED grids (in millions) in the batches, "New" is the number of new grids resolved (also in millions), and "Yield" is the ratio New/Grids:

       Date  Grids   New   Yield
  ------------------------------
  1  21 Sep    731   124   0.170
  2  23 Sep    746   121   0.162
  3  24 Sep    770   108   0.140
  4  25 Sep    772    98   0.126
  5  26 Sep    790    92   0.117
  6  27 Sep    791    84   0.106
  7  28 Sep    781    75   0.095
  8  28 Sep    793    67   0.084
  9  29 Sep    791    63   0.080
 10  30 Sep    792    60   0.076
 11   1 Oct    782    54   0.069
 12   2 Oct    789    49   0.062
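
The Yield column is just New divided by Grids. A minimal sketch, with `batch_yield` as a hypothetical name, assuming both inputs are in the same units (millions of grids here):

```c
#include <assert.h>

/* Yield = newly resolved grids / grids produced, both in the same units. */
double batch_yield(double new_grids, double total_grids)
{
    assert(total_grids > 0.0);
    return new_grids / total_grids;
}
```

Because the table shows Grids and New rounded to the nearest million, recomputing a row this way can differ from the printed yield in the last digit.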


Another measure of particular relevance is the cost of explicit grid testing (via blue's Find19C function). Just as blue predicted, as the pool of unresolved grids is reduced, the concentration of grids with low 19C puzzle counts increases, and so does the average time per grid for Find19C.

Two weeks ago, with 2.2 billion (40%) grids unresolved, sampling suggested that the Find19C cost per grid was ~25ms (40 grids/sec). Today, with 1.1 billion (20%) grids unresolved, the cost has risen to ~38ms (25 grids/sec).

Is it time, then, given the low yields, to switch to explicit grid testing?

The answer is, I think, no! One Gen19C worker can produce roughly 100 million grids a day, which at a 6% yield means about 6 million newly resolved grids. Find19C, on today's figures, can resolve about 2 million a day, so Gen19C is still ~3x more productive.
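
The comparison can be written out as a small calculation. This is only a sketch under stated assumptions: one Gen19C worker is compared with one Find19C process, the per-grid cost is the sampled ~38-40ms figure, and all function names are hypothetical.

```c
#include <assert.h>

/* Grids per day that explicit testing resolves at a given cost per grid. */
double find_per_day(double ms_per_grid)
{
    assert(ms_per_grid > 0.0);
    return 86400.0 * 1000.0 / ms_per_grid;   /* ms in a day / ms per grid */
}

/* Newly resolved grids per day from one generator worker. */
double gen_new_per_day(double grids_per_day, double yield)
{
    assert(yield >= 0.0 && yield <= 1.0);
    return grids_per_day * yield;
}

/* Yield at which generation and explicit testing resolve equally many grids. */
double break_even_yield(double grids_per_day, double ms_per_grid)
{
    assert(grids_per_day > 0.0);
    return find_per_day(ms_per_grid) / grids_per_day;
}
```

With 100 million generated grids a day and 40ms per tested grid (25 grids/sec), `break_even_yield` comes out a little above 2%, so under these assumptions the switch to explicit testing would only pay off once the yield falls to around that level, or sooner if the Find19C cost per grid stops rising.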