Low/Hi Clue Thresholds

LCT-19X Progress

Postby Mathimagics » Fri Nov 08, 2019 5:03 am

400 million grids done!!

Day 11.25: after 270 hours, LCT-19X has:

  • tested 402,609,428 grids (64.25%)
  • found 216,260 grids with No 19C (0.0537%)

Extrapolating, the projection for total "No 19C" grids would be ~336,646.

NB: just to further muddy the waters, this figure doesn't include the ~5000 grids that we already knew had no 19C; these were all automorphic grids that I explicitly tested as a group some time back. The downward trend is still apparent, nevertheless, and I fully expect that we will converge to somewhere in blue's predicted range.

ETA remains Nov 14-15. (and now perhaps more likely to be the 14th?)
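
As a rough cross-check of the projection and ETA above, here is a minimal sketch. It assumes a constant "No 19C" hit rate and a constant testing rate over the remaining grids (neither is guaranteed, and the post itself notes a downward trend). The variable names are illustrative, and the total grid count is inferred from the rounded 64.25% figure.

```python
# Figures quoted in the post (day 11.25, 270 hours elapsed).
tested = 402_609_428
found_no_19c = 216_260
fraction_done = 0.6425          # rounded percentage from the post
hours_elapsed = 270

# Assumption: the hit rate stays constant over the untested grids.
hit_rate = found_no_19c / tested
projected_total = found_no_19c / fraction_done

# Assumption: the testing rate (grids per hour) also stays constant.
total_grids = tested / fraction_done
hours_remaining = (total_grids - tested) / (tested / hours_elapsed)

print(f"hit rate        : {hit_rate:.4%}")
print(f"projected total : {projected_total:,.0f}")
print(f"hours remaining : {hours_remaining:.0f}")
```

With these rounded inputs the projection comes out at roughly 336,600, close to the quoted ~336,646, and the remaining run time at about 150 hours (a little over 6 days), which is in line with the Nov 14-15 ETA.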
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

LCT-19X Progress

Postby Mathimagics » Sun Nov 10, 2019 5:16 pm

500 million grids done!!

Day 13.75: after 330 hours, LCT-19X has:

  • tested 505,075,588 grids (80.59%)
  • found 247,050 grids with No 19C (0.0489%)

Projection for total "No 19C" grids is now < 310,000

ETA should be around 72 - 76 hours from the time of this post.

Re: Low/Hi Clue Thresholds

Postby eleven » Mon Nov 11, 2019 12:00 am

The crowd is getting nervous. blue's lower limit almost reached. And 72 hours to go. Can it blow up the upper limit? Overtime now in the betting shops.
(a dollar on no)
eleven
 
Posts: 3094
Joined: 10 February 2008

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Mon Nov 11, 2019 2:50 am

blue wrote:my revised estimate is 250-281K

Serious punters on a final figure (F) need to take into account the 4814 that we started with, which are not included in the actual counts (R) reported above.

And there is another (as yet) unaccounted factor, call it X. There was a fire in Band 6 - some records towards the end of the file were burned. Something nasty happened on the USB drive transfers from Jill to Jack, which only became apparent when I investigated the alignment between the LCTP catalogs on both machines. The file error was, naturally, not reported by Win10.

This was all quietly repaired while you all slept. But LCT19X had already progressed well past Band 6, and so its reported counts (R) only include what it found at the time. So X = F(6) - R(6), and F(all) will be R + 4814 + X, ok? :?

Want to know what X is? I do have that figure around here somewhere ... 8-)

PS: R has just passed the 250K mark (525M grids processed, 100M grids remaining), so we are already well into the blue zone!

Re: Low/Hi Clue Thresholds

Postby blue » Tue Nov 12, 2019 3:15 am

Mathimagics wrote:And there is another (as yet) unaccounted factor, call it X. There was a fire in Band 6 - some records towards the end of the file were burned. Something nasty happened on the USB drive transfers from Jill to Jack, which only became apparent when I investigated the alignment between the LCTP catalogs on both machines. The file error was, naturally, not reported by Win10.

This was all quietly repaired while you all slept. But LCT19X had already progressed well past Band 6, and so its reported counts (R) only include what it found at the time. So X = F(6) - R(6), and F(all) will be R + 4814 + X, ok? :?

" :? " ... Yes: I am totally confused.

Mathimagics wrote:Want to know what X is? I do have that figure around here somewhere ... 8-)

Yes, please, and more:
  • What band 6 data exists on each machine ?
    What's missing where, at this point (if anything) ?
  • What got trashed ?
  • Job files were produced from a good or a bad catalog ? (Does it matter ?)
  • What do the recent R values reflect, exactly ?
  • How (exactly) is it, that they end up short by some amount X (in addition to the 4814) ?
  • [ You're positive that X represents a shortage, and not overage ? ]
I don't doubt that you have a handle on everything, but my curiosity for details, raised the itch to ask. 8-)

Cheers,
Blue.
blue
 
Posts: 979
Joined: 11 March 2013

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Tue Nov 12, 2019 5:28 am

Hi blue!

blue wrote:I don't doubt that you have a handle on everything, but my curiosity for details, raised the itch to ask.

Both sentiments are admirable! ;)

You can relax - all is well here, really - as I hope the following report demonstrates. :!:

  • there are two folders that comprise the database, ED and LC. ED is the original catalog of solution grids, LC is the catalog of low-clue puzzles. Both catalogs have a separate file for each band. Each entry is 83 bytes: in ED, bytes 82/83 are CR/LF, but in LC byte 82 is a status byte (SB), used to indicate whether or not the grid is known to have no puzzles with fewer clues, and byte 83 is the puzzle size (NC).
  • on Jack's first day (Oct 28), the existing database on Jill was copied to an external USB drive, and loaded onto Jack
  • the "fire" was in the LC area, in Band #6. The last ~100K records wound up as being all zeroes.
  • the LCT19X "job creation" exercise involved reading each LC band, and extracting a list of those grids to be checked with Find19C. Grids were added to the list if the LC record had NC = 20, and SB = 0. (SB = 1 would indicate a grid already tested and known to have no 19C).

    So the grids in the fire zone were simply treated as "not applicable" - consequently the unresolved 20C (UR20) grid list for band 6 was ~98K short of what it should have been.
  • The LCT19X worker process, Test19X, simply reads these files of UR20 grids, tests them, and writes the results to a log file. NOTE: the workers do NOT touch (update) the LC catalog. Updating (Update19X) is done separately, when all the UR20 lists for a given band have been completed. Grids for which 19C puzzles were found have their LC entries replaced by the 19C puzzle + (NC = 19) + (SB = 0). Where no 19C puzzle was found, the entry simply has its SB changed to 1.
  • When I discovered the fire, and examined the scene, I realised that all I needed to do was to replace the LC Band 6 file with a true copy, regenerate its UR20 list, run Test19X, and finally Update19X. This was all done in a separate UR20 folder, so as not to interfere with the main job (24 copies of Test19X running in parallel) and its delicate job-assignment/reporting processes.
  • When the dust had settled, we had found an additional 177 grids in Band 6 that had no 19C puzzle. X = 177 !!!
  • I have 2 admin tools, CountBand and AuditBand, that count all the LC entries, reporting the numbers of 17/18/19/20/21 puzzles, and which puzzles have the SB flag set. AuditBand does additional verification, that the puzzle in LC does really solve to the corresponding ED grid. Following completion of LCT19X, the final "No 19C" count (F) will be provided by the count tool, and auditing of all bands will then be done as a final verification of the database integrity.
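
The record layout and the "job creation" filter described above can be sketched as follows. This is only an illustration, not the actual LCT code: it assumes that bytes 1-81 hold the puzzle cells and that SB and NC are stored as raw integer byte values (the post does not say how they are encoded). The function names are hypothetical.

```python
RECORD_SIZE = 83  # bytes per LC catalog entry, as described above

def parse_lc_record(rec: bytes):
    """Split one LC entry into (puzzle, status_byte, clue_count).

    Assumption: SB and NC are raw integer byte values, and the
    first 81 bytes are the puzzle cells.
    """
    if len(rec) != RECORD_SIZE:
        raise ValueError("LC records are 83 bytes long")
    puzzle = rec[:81]      # bytes 1-81: the 81 cells (assumed)
    sb = rec[81]           # byte 82: status byte (SB)
    nc = rec[82]           # byte 83: puzzle size (NC)
    return puzzle, sb, nc

def select_ur20(band: bytes):
    """Yield puzzles for unresolved 20C grids (NC = 20 and SB = 0)."""
    for off in range(0, len(band) - RECORD_SIZE + 1, RECORD_SIZE):
        puzzle, sb, nc = parse_lc_record(band[off:off + RECORD_SIZE])
        if nc == 20 and sb == 0:
            yield puzzle

# Three synthetic records: unresolved 20C, already-tested 20C, zeroed ("burned").
unresolved = b"1" * 81 + bytes([0, 20])
tested = b"2" * 81 + bytes([1, 20])
burned = bytes(RECORD_SIZE)
selected = list(select_ur20(unresolved + tested + burned))
print(len(selected))
```

Under these assumptions an all-zero record has NC = 0, so it fails the NC = 20 test and is silently skipped. That is the behaviour that left the Band 6 UR20 list short.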

Meanwhile, here are the latest LCT19X figures:

  • grids tested = 579,928,871 (92.53%)
  • No 19C "R" = 257,406
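
For anyone keeping score, the running total implied by F = R + 4814 + X can be worked out from the figures in this thread. A minimal sketch with illustrative variable names:

```python
R = 257_406          # latest "No 19C" count reported by LCT19X
KNOWN_BEFORE = 4_814 # grids already known to have no 19C before the run
X = 177              # extra Band 6 grids found after the repair

F_so_far = R + KNOWN_BEFORE + X
print(F_so_far)  # 262397
```

This is a running figure only, since R was still growing at the time of the post.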

Cheers,
MM

LCT-19X has completed!

Postby Mathimagics » Wed Nov 13, 2019 6:19 am

LCT-19 Final Report!

After 16 days (+ 1.5h) LCT-19X has finished. Both PC-Jack and PC-Jill agree that the number of grids with a 20C puzzle but no 19C puzzle is 268,296.

blue's sampling prediction (250k - 281k) has proved to be right on the money! 8-)

Jack is cooling down after its exertions (24 x 19X workers running round the clock for 16 days), and will be rewarded with a memory upgrade (to 128 GB).

Forward plans:

  • Jack (assuming it survives the DIY memory upgrade) will be set to work on LCT-18 until the end of the month. We will try to identify as many 18C grids as possible by morphing
  • Dec 1 is the planned date for commencement of LCT-17 (the identification of any/all outstanding 17C puzzles)

Cheers
MM

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Wed Nov 13, 2019 6:02 pm

Here is the "No 19C" grid list, in SUDZ compressed format.

Before compression the grid list had to be split into 4 separate sections in order to get attachable file sizes (and this took several iterations with various split-points to get them all right!). The average compressed grid size achieved was 3.68 bytes per grid - this is much higher than normal for sudz but the "distance" between consecutive grids here is pretty big.

The first 3 files are attached here. Note that the ".zip" extension is merely a contrivance, these are not zip files, but sudz files.
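
The bytes-per-grid figure can be cross-checked against the sizes shown for the four attachments in this thread (Parts 1-3 in this post, Part 4 in the next). A quick sketch, assuming the four files together hold exactly the 268,296 "No 19C" grids and that the listed KiB sizes are accurate to the digits shown:

```python
# Attachment sizes in KiB, as listed for Parts 1-4 in this thread.
sizes_kib = [245.18, 240.74, 245.3, 232.31]
grids = 268_296  # final "No 19C" grid count

total_bytes = sum(sizes_kib) * 1024
bytes_per_grid = total_bytes / grids
print(f"{bytes_per_grid:.2f} bytes per grid")
```

The result agrees with the 3.68 bytes per grid quoted above.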

No19C-Part1.sudz.zip
(245.18 KiB)

No19C-Part2.sudz.zip
(240.74 KiB)

No19C-Part3.sudz.zip
(245.3 KiB)
Last edited by Mathimagics on Wed Nov 13, 2019 6:16 pm, edited 3 times in total.

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Wed Nov 13, 2019 6:03 pm

The 4th section:

No19C-Part4.sudz.zip
(232.31 KiB)

Re: LCT-19X has completed!

Postby Serg » Tue Nov 19, 2019 11:11 pm

Hi, Mathimagics!
Mathimagics wrote:LCT-19 Final Report!

After 16 days (+ 1.5h) LCT-19X has finished. Both PC-Jack and PC-Jill agree that the number of grids with a 20C puzzle but no 19C puzzle is 268,296.

Are you sure that grids with 21-clue valid puzzles, but no 20-clue valid puzzles don't exist?

Serg
Serg
2018 Supporter
 
Posts: 860
Joined: 01 June 2010
Location: Russia

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Wed Nov 20, 2019 2:40 am

Hello Serg!

Long time no see!

We have already established (in the first phase of this project) that there are only 4 grids which do not have a 20C puzzle ...

Cheers
MM

Re: Low/Hi Clue Thresholds

Postby Serg » Wed Nov 20, 2019 1:40 pm

Hi, Mathimagics!
Mathimagics wrote:We have already established (in the first phase of this project) that there are only 4 grids which do not have a 20C puzzle ...

I looked through first pages of this thread and didn't find a post about exhaustive search of grids which do not have 20-clue puzzles. Was this exhaustive search done? Can we be sure that grids having 22-clue puzzles, but not 17,18,19,20,21-clue puzzles don't exist?

Serg

[Edited. I corrected some typos.]

LCT Project Review

Postby Mathimagics » Wed Nov 20, 2019 3:12 pm

Serg wrote:I looked through first pages of this thread and didn't find a post about exhaustive search of grids which do not have 20-clue puzzles. Was this exhaustive search done? Can we be sure that grids having 22-clue puzzles, but not 17,18,19,20,21-clue puzzles don't exist?


Good questions, to which I can answer a definitive YES.

Demonstrating this should make a good review of exactly where we are at.

LCT-20 ran from June 22 to the end of July. It ran in two phases - the first phase used randomised morphing methods to identify 20-clue puzzles, and at the end of that phase, our database summary was (from this post above):

     17C:      46300
     18C:   10221135
     19C:    5626788
     20C: 4091330268
     21C:          4
          ----------
 Puzzles: 4107224495 (  75.05%)
     unk: 1365506043 (  24.95%)
          ----------
   Grids: 5472730538


In all of these tables, the numbers are backed up by evidence, that is, for each ED grid we have a catalog record which is an example of a puzzle of the corresponding size.

So after LCT-20 phase 1, we were left with 1,365,506,043 grids for which we did not yet have a puzzle recorded.

Phase 2 then tested every one of those grids with blue's explicit grid testing function Find20C. That found a 20C puzzle in every case, leaving us with just those 4 grids alone that we knew had no 20C, but do have a 21C. Phase 2 completed on August 13 (reported here), and for the first time the LCT database had an example of a low-clue puzzle for every ED grid:

     17C:      46300
     18C:   10658721
     19C:  133785077
     20C: 5328240436 (*)
     21C:          4
          ----------
 Puzzles: 5472730538


[Note:] the 20C figure reported here (*) is less than the sum of (known 20C's + unknown grids) in the first table, because we had started work on the LCT-19 phase by this time, and had already converted ~128 million 20C grids to 19C.
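
The two summary tables can be checked against each other with a few lines of arithmetic. This sketch simply re-enters the figures from the tables above (the dictionary and variable names are illustrative) and confirms the totals and the "~128 million" note:

```python
phase1 = {"17C": 46_300, "18C": 10_221_135, "19C": 5_626_788,
          "20C": 4_091_330_268, "21C": 4}
unknown = 1_365_506_043  # grids with no puzzle recorded after phase 1
phase2 = {"17C": 46_300, "18C": 10_658_721, "19C": 133_785_077,
          "20C": 5_328_240_436, "21C": 4}
TOTAL_GRIDS = 5_472_730_538

# Both tables must account for every ED grid.
assert sum(phase1.values()) + unknown == TOTAL_GRIDS
assert sum(phase2.values()) == TOTAL_GRIDS

# 20C grids that moved to a lower clue count between the two tables.
converted = phase1["20C"] + unknown - phase2["20C"]
gained_below_20 = ((phase2["19C"] - phase1["19C"])
                   + (phase2["18C"] - phase1["18C"]))
print(converted, gained_below_20)
```

Both values come out at 128,595,875, so the shortfall in the 20C line is fully explained by grids reclassified as 19C (and a small number as 18C).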

So, the bottom line is that there are absolutely NO grids which do not have a 21-clue puzzle, and just 4 that don't have a 20-clue puzzle.

Cheers
Jim

PS: LCT-19, which ran until Nov 13, was also definitive, using the same methods (morphing phase + explicit grid testing phase) to establish that only 268,296 grids have a 20C puzzle but no 19C.

LCT-18, if it could be completed, would give us completion - a database that had a lowest-possible-clue example for every ED grid.

Phase 1 can (and will) probably find most of the 18-clue puzzle grids. But phase 2, the rigorous testing of all the remaining (~4.5 billion) unresolved 19-clue grids, is very expensive (with current best known methods/implementations) and is estimated to take several cpu-centuries, perhaps even a cpu-millennium!

So, if anybody can come up with a way to (rigorously) test a grid for the existence of an 18-clue puzzle in under 1s, they can become a real hero!
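
To put the 1-second target in perspective, here is a back-of-envelope sketch of total CPU time for ~4.5 billion grids. The 5 s and 10 s per-grid costs are hypothetical values chosen only for illustration; the thread does not state the actual per-grid cost of current methods.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def cpu_years(grids: float, seconds_per_grid: float) -> float:
    """Total single-core time, in years, to test every grid once."""
    return grids * seconds_per_grid / SECONDS_PER_YEAR

remaining = 4.5e9  # unresolved 19-clue grids, per the estimate above
for t in (1, 5, 10):  # hypothetical seconds per grid
    print(f"{t:>2} s/grid -> {cpu_years(remaining, t):,.0f} cpu-years")
```

Even at 1 s per grid the job is still roughly 143 cpu-years, which is why a sub-second test would be such a breakthrough.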

LCT-18 Progress

Postby Mathimagics » Thu Nov 21, 2019 5:43 pm

We have been running LCT-18 workers for the past 5 days, and have increased the 18C grid count to over 200 million:

Date              Batches     ED grids     New grids   Yield     Total 18C
--------------------------------------------------------------------------
17 Nov 2019                                                    141,951,220           
17 Nov 2019       1 -   16   15,681,729    7,849,494   50%     149,800,714           
18 Nov 2019      17 -   32   15,692,880    7,547,712   48%     157,348,426
19 Nov 2019      33 -   48   15,772,186    7,475,390   47.4%   164,823,816
20 Nov 2019      49 -   64   15,840,319    6,804,915   43%     171,628,731
21 Nov 2019      65 -   80   15,648,661    6,627,220   42.35%  178,255,951   
22 Nov 2019      H18(coloin) 80,063,252   35,759,516   44.67%  214,015,467


LCT-18 workers run Gen18H, which is essentially the same as Gen19C - it generates 19C puzzles and simply "harvests" those with redundant clues.

The last entry refers to grids which were retrospectively "harvested" from old LCT-19 batches that coloin had retained (I had carelessly cast my 19C batches aside once they were processed!).
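
The Yield and Total 18C columns in the table can be reproduced with a short script. This sketch assumes "Yield" means new 18C grids divided by ED grids processed, which is consistent with the printed percentages; the row labels are illustrative.

```python
rows = [  # (label, ED grids processed, new 18C grids found)
    ("batches 1-16",  15_681_729,  7_849_494),
    ("batches 17-32", 15_692_880,  7_547_712),
    ("batches 33-48", 15_772_186,  7_475_390),
    ("batches 49-64", 15_840_319,  6_804_915),
    ("batches 65-80", 15_648_661,  6_627_220),
    ("H18 (coloin)",  80_063_252, 35_759_516),
]

total = 141_951_220  # 18C count before the first batch
for label, ed_grids, new_grids in rows:
    total += new_grids
    print(f"{label:<14} {new_grids / ed_grids:7.2%} {total:>13,}")
```

The running total ends at 214,015,467, matching the last row of the table.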

Re: LCT Project Review

Postby Serg » Thu Nov 21, 2019 6:49 pm

Hi, Mathimagics!
Mathimagics wrote:
     17C:      46300
     18C:   10658721
     19C:  133785077
     20C: 5328240436 (*)
     21C:          4
          ----------
 Puzzles: 5472730538

So, the bottom line is that there are absolutely NO grids which do not have a 21-clue puzzle, and just 4 that don't have a 20-clue puzzle.

Thank you for the clarification! One last question. Do you mean that each line in the table above gives the number of ED grids whose minimum clue count over valid puzzles is 17, 18, etc.? For example, can we be sure that there are 133785077 ED grids that have no 17-clue or 18-clue valid puzzles, but do have 19-clue puzzles?

Serg
