Low/Hi Clue Thresholds


LCT-19 Progress

Postby Mathimagics » Wed Oct 09, 2019 8:25 am

At the end of week 8, we have:
Code: Select all
 known 17-19C:   144,490,098
 added    19C:   975,263,602  Week 1
                 917,301,757  Week 2
                 451,939,607  Week 3
                 507,056,483  Week 4
                 358,256,360  Week 5
                 543,137,278  Week 6
                 467,361,125  Week 7
                 211,939,044  Week 8
               =============
               4,576,745,354  Total resolved
                 895,985,184  Unresolved


A slow week - not because of reducing yields so much as a reduced batch production level (we've lost one of our 3 data mining sites!):
Code: Select all
Week 8:
       Date  Grids   New   Yield
  ------------------------------
  1   4 Oct    805    50   0.062
  2   5 Oct    800    46   0.057
  3   6 Oct    772    37   0.0482
  4   7 Oct    810    41   0.0504
  5   8 Oct    805    39   0.0478


Good news, perhaps, in the last 3 yield figures - all ~4.8%. Explicit grid testing based on current sampling costs 40ms/grid, and this corresponds (roughly) to worker yields of ~2%, so we will keep on going with the current mining operations! 8-)
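One way to read the 40ms/grid versus ~2% comparison is sketched below. The per-grid worker cost is not stated anywhere in the post; it is inferred here from the two quoted figures, so treat `WORKER_MS_PER_GRID` as an assumption rather than a measured value.

```python
# Figures quoted in the post.
EXPLICIT_MS_PER_GRID = 40.0   # cost of one explicit grid test
BREAK_EVEN_YIELD = 0.02       # worker yield said to correspond to that cost

# ASSUMPTION: implied worker cost per generated grid, inferred from the two figures.
WORKER_MS_PER_GRID = EXPLICIT_MS_PER_GRID * BREAK_EVEN_YIELD

# Week 8 yields as quoted in the table.
week8_yields = {"4 Oct": 0.062, "5 Oct": 0.057, "6 Oct": 0.0482,
                "7 Oct": 0.0504, "8 Oct": 0.0478}


def worker_ms_per_new_grid(yield_fraction):
    """Worker time spent per newly resolved grid at the given yield."""
    return WORKER_MS_PER_GRID / yield_fraction


for day, y in week8_yields.items():
    cost = worker_ms_per_new_grid(y)
    verdict = "keep mining" if cost < EXPLICIT_MS_PER_GRID else "switch to explicit testing"
    print(f"{day}: {cost:5.1f} ms per new grid -> {verdict}")
```

Under that reading, every Week 8 batch set is still well below the explicit-testing cost per resolved grid.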
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

LCT-19 Progress

Postby Mathimagics » Wed Oct 16, 2019 9:22 am

The state of affairs at the end of week 9 ....

This week we produced 6 batch sets (each set has 128 batch files). The grid counts are in millions, sometimes rounded; the yields quoted are exact, however:
Code: Select all
       Date  Grids   New   Yield
  ------------------------------
  1  10 Oct    800    35    0.0433
  2  11 Oct    784    30    0.0387
  3  12 Oct    784    30    0.0387
  4  13 Oct    736    24.5  0.0333
  5  14 Oct    641    21.6  0.0337
  6  15 Oct    630    18.9  0.0301


We resolved 160 million grids:
Code: Select all
    975,263,602  Week 1
    917,301,757  Week 2
    451,939,607  Week 3
    507,056,483  Week 4
    358,256,360  Week 5
    543,137,278  Week 6
    467,361,125  Week 7
    211,939,044  Week 8
    160,648,812  Week 9


The current LCT catalog state:
Code: Select all
 Totals:
     17C:      46301
     18C:   85663604
     19C: 4651798500       59233 have no 18C
     20C:  735222129        4814 have no 19C
     21C:          4
          ----------
 Puzzles: 5472730538 ( 100.00%)
     unk:          0 (   0.00%)
          ==========
   Grids: 5472730538


The "No 18C" count appears for the first time. I'm slowly working my way through the automorphic grids (as I did for 19C), testing them with the new Find18C provided by blue. I've only tested around 10%, but clearly the great majority have no 18C puzzle. (is the actual count known?)

The unresolved 20C grid count is 735222129 - 4814 = 735,217,315. The rate of yield decline is low. So we push on!
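As a quick sanity check, the catalog totals and the unresolved count can be recomputed from the figures quoted above. The dictionary layout is just for illustration; it is not the catalog's actual format.

```python
# Per-clue-count grid totals as listed in the catalog state.
catalog = {
    "17C": 46_301,
    "18C": 85_663_604,
    "19C": 4_651_798_500,
    "20C": 735_222_129,
    "21C": 4,
}
KNOWN_NO_19C = 4_814  # 20C grids already proven to have no 19C puzzle

total_grids = sum(catalog.values())
unresolved_20c = catalog["20C"] - KNOWN_NO_19C

print(f"Grids:          {total_grids:,}")
print(f"Unresolved 20C: {unresolved_20c:,}")
```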
Last edited by Mathimagics on Wed Oct 16, 2019 11:49 am, edited 1 time in total.

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Wed Oct 16, 2019 11:46 am

The average expected time for explicit grid testing of the unresolved 20C grids remains at 38ms.

As blue pointed out, Find19C times are generally higher for grids with low numbers of 19C puzzles. The morphing process tends to favour grids with many 19C puzzles. So there would be a tendency for the unresolved grids to have lower 19C puzzle counts, and we were concerned that this could become a significant problem.

Blue wrote:Find19C is very slow on grids with no 19.
  • 250 random such grids, took ~3.33 seconds each, on average, to process.
  • Grids with a small number of 19's, can take just as long.


The correlation between 19C puzzle counts and testing times is complicated (fortunately, as it happens).

Consider the pattern-based search as a simplified linear model, ie "for each possible clue pattern, does that pattern on this grid give a valid puzzle?". The ultimate cost-determining factor is the pattern index of the "best" puzzle, the one that has the lowest pattern index.

A grid with 50 puzzles might take longer than a grid with only 1 puzzle, if that 1 puzzle happens to have a lower pattern index.

This creates an "amelioration" factor, a blurring of the puzzle count/time relationship, the effect of which cannot be predicted, but can be clearly observed in our LCT-19 case. In other words, we got lucky!
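The simplified linear model can be put into a few lines. This is only a toy version of the idea, not how Find19C works internally; the pattern count and the index values below are made up for illustration.

```python
def patterns_tested(valid_pattern_indices, n_patterns):
    """Patterns examined by a linear scan that stops at the first valid puzzle.

    valid_pattern_indices: 0-based indices of the patterns that give a valid
    puzzle on this grid (hypothetical input for the toy model).
    """
    if not valid_pattern_indices:
        return n_patterns  # no puzzle at all: the whole list is scanned
    return min(valid_pattern_indices) + 1


N_PATTERNS = 1_000_000  # hypothetical size of the pattern list

grid_with_50 = range(400_000, 400_050)  # 50 puzzles, best one at index 400,000
grid_with_1 = [1_200]                   # a single puzzle, but at a low index
grid_with_0 = []                        # no puzzle

for name, valid in [("50 puzzles", grid_with_50),
                    ("1 puzzle", grid_with_1),
                    ("0 puzzles", grid_with_0)]:
    print(f"{name:>10}: {patterns_tested(valid, N_PATTERNS):,} patterns tested")
```

In this toy setup the grid with 50 puzzles costs far more than the grid with one, purely because of where the best pattern sits.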

LCT-19

Postby Mathimagics » Fri Oct 18, 2019 12:22 am

A better indication of the efficacy of the LCT-19 morph workers is obtained by looking at the estimated completion times (ECT). That is the number of core-days required to do all the unresolved grids:

Code: Select all
        Added   
    467,361,125  Week 7   ECT = 487
    211,939,044  Week 8   ECT = 394     (-93)
    160,648,812  Week 9   ECT = 323     (-71)


For the past 2 weeks we have been running Gen19C on 8 cores (I think that's right, if coloin is running 4) so any reduction of 56 core-days or more indicates that it's still worth running.
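The ECT figures and the break-even test can be reproduced roughly as follows. The formula (unresolved grids times average test time) and the use of the 38ms average for all three weeks are assumptions on my part; they happen to match the quoted values after rounding.

```python
MS_PER_GRID = 38           # average explicit-test time quoted earlier (assumed for all weeks)
SECONDS_PER_DAY = 86_400
CORES_RUNNING = 8          # Gen19C cores, as estimated in the post
DAYS_PER_WEEK = 7

# Unresolved counts at the end of each week, taken or derived from earlier posts.
unresolved = {
    7: 895_985_184 + 211_939_044,  # week 8 unresolved plus week 8 resolutions
    8: 895_985_184,
    9: 735_217_315,
}


def ect_core_days(n_grids, ms_per_grid=MS_PER_GRID):
    """Estimated core-days to test n_grids explicitly."""
    return n_grids * ms_per_grid / 1000 / SECONDS_PER_DAY


ect = {week: round(ect_core_days(n)) for week, n in unresolved.items()}
reductions = [ect[7] - ect[8], ect[8] - ect[9]]
break_even = CORES_RUNNING * DAYS_PER_WEEK  # core-days spent on morphing per week

for week, value in ect.items():
    print(f"Week {week}: ECT = {value}")
print("Weekly reductions:", reductions, "break-even:", break_even)
```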
Last edited by Mathimagics on Sat Oct 19, 2019 1:09 pm, edited 1 time in total.

LCT Project

Postby Mathimagics » Fri Oct 18, 2019 3:48 pm

Some news of interest for this project!

New PC

I am investing in a new PC, an AMD Ryzen 2950X ("ThreadRipper") with 16 cores (32 threads), which has ~4 times the processing capacity of my current system (Intel i7-7700K, 4 cores, 8 threads). It will have 64GB RAM against my existing 32GB, and a couple of other useful extras that I hadn't thought of when ordering the current system (a 1TB fast-access SSD drive just for the ED+LCT catalogues, and a better sound-card). All this for around 1.7 x the current PC cost, which seems like a reasonable deal.

LCT-18

I've been doing a little bit of preliminary testing of the Gen18C code, which is essentially the same as Gen19C but with different seed generation. Have produced some batches (have got 127 million 18C grids, which is maybe 13% of the estimated total). I'll post about this below.

LCT-17

blue has kindly offered to hijack my new system for 3 months or so in order to finish off the 17C search with his "divide and conquer" app. Between us we have 20 cores - and 2200 core-days is the estimate to complete the search - so that's roughly 110 days. Hopefully we can reduce that with a little boost from hyper-threading, charity donations, begging, etc ("spare some core-days, mister?").

And of course there are the negative factors like power outages, breakdowns (human and/or computer), entropy, Brexit, etc etc 8-)

For the LCT project, completion of this job means knowing we have all grids with 17C puzzles identified, and that any 18C grids we identify apart from these are the "best possible #clues", and that's effectively LCT-17.


The "Little Ripper" will initially be used to:

  • do Find18C on the remaining automorphic grids (est 1/2 days, a gentle warmup exercise)
  • do Find19C on the unresolved 20C grids (est 2-3 weeks)
  • perhaps do a little bit of LCT-18 batch production

Then, in say 4 weeks time, it will be kickoff-time for the big one, LCT-17 !
Last edited by Mathimagics on Sat Oct 19, 2019 1:10 pm, edited 1 time in total.

LCT-18 (so it begins ...)

Postby Mathimagics » Sat Oct 19, 2019 1:06 pm

LCT-18 Overview

The software components involved here are:

  • Gen18C: the batch generator, a clone of Gen19C, slightly modified
  • Find18C: blue's explicit-grid tester function
  • UpdateCat: updates the catalog from batch-sets (64/128), no change needed

Gen18C does just the same {-1,+1} multi-pass propagation process as does Gen19C. The only difference is that we use Find18C for 18C-puzzle seed generation, since finding them by reduction methods is hard.

Seed generation will be done here (ie my PC) where we have access to the catalog (for every seed we generate we get ~6 grid resolutions). coloin will be provided with "seed pods" ...

As with LCT-19, we would hope to find most 18C's quickly by this method - prelim testing (more on this to follow) suggests that we get lower yields, but we have a much smaller target (~0.97 billion) so I'm guessing (wildly) that these will balance out ...


LCT-18 - The Longer View

blue's estimate of the number of 18C grids (~0.97 billion) means that there will then probably be, say, ~4.5 billion grids for which all we know is that they have a 19C, and almost certainly have no 18C. And only Find18C can tell us for sure.

February 2020 might see us reach that position, given fair winds. Then only this last step will remain; we will have finished both LCT-17 and LCT-19. Mind you, back in June, I never thought I'd be even contemplating getting to this point. Now I find myself estimating a notional "time to completion".

Ok, simple maths, assuming we get there, 4.5 billion grids to check at 4.23s is roughly ~600 core-years. Urrrgh, I hear you say, but consider this - it is 200 less than the legendary "No 16C" job took! 8-)
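The "simple maths" spelled out, using the two figures from the post and a 365.25-day year (the year length is my choice; it makes little difference):

```python
GRIDS_TO_CHECK = 4.5e9           # grids expected to need an explicit Find18C run
SECONDS_PER_GRID = 4.23          # per-grid test time quoted in the post
SECONDS_PER_YEAR = 365.25 * 24 * 3600

core_seconds = GRIDS_TO_CHECK * SECONDS_PER_GRID
core_years = core_seconds / SECONDS_PER_YEAR

print(f"{core_years:.0f} core-years")  # about 603
```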

"LCT - The Complete Series". Direct to DVD. Featuring blue as "Shelley". ;)

I can see it now! Ok, I am using a large telescope ... :lol:

Re: LCT-18 (more)

Postby Mathimagics » Sat Oct 19, 2019 2:38 pm

LCT-18 Batch properties

I have produced some test batches, rather slowly, using mostly the old (but soon to be "ex") PC, enough for 4 or 5 decent update runs.

Code: Select all
                                                             new/#ED   new/#Raw
 Set Batches  Date  Raw grids   ED grids               New    Yield     Yield       nSeed
 --------------------------------------------------------------------------------------------------
   1   128  Oct 10  268435122   65811495 (0.245)  61766034   0.9385    0.2301       10
   2   128  Oct 13  268435328   30340535 (0.113)   7255723   0.2391    0.0270       10
   3   128  Oct 16  264241026   27598684 (0.104)   5971153   0.2163    0.0226       10
   4    64  Oct 17  368440278   93852582 (0.254)  34165952   0.3640    0.0927       20
   5    32  Oct 18  186574856   63881415 (0.342)   7776707   0.1217    0.0417       40


I have added columns showing the number of grids in the raw batch files. That's because the number of ED grids actually resulting from the merge is neither stable nor large. LCT-19 had both of these right through to the end, but we are in a different world here.
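For reference, the three ratio columns in the table can be recomputed from the raw, ED and new grid counts. The tuple layout is only for this sketch; the last digit can differ from the table by one because of rounding versus truncation.

```python
# (set, raw grids, ED grids, new grids) copied from the table above.
batch_sets = [
    (1, 268_435_122, 65_811_495, 61_766_034),
    (2, 268_435_328, 30_340_535, 7_255_723),
    (3, 264_241_026, 27_598_684, 5_971_153),
    (4, 368_440_278, 93_852_582, 34_165_952),
    (5, 186_574_856, 63_881_415, 7_776_707),
]


def ratios(raw, ed, new):
    """Return (ED/raw, new/ED, new/raw) for one batch set."""
    return ed / raw, new / ed, new / raw


print("Set  ED/Raw  New/ED  New/Raw  duplicates")
for set_no, raw, ed, new in batch_sets:
    ed_raw, new_ed, new_raw = ratios(raw, ed, new)
    # Share of raw grids that collapsed onto an ED grid already in the set.
    print(f"{set_no:>3}  {ed_raw:.3f}   {new_ed:.4f}  {new_raw:.4f}   {1 - ed_raw:.1%}")
```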

The {-1,+1} connectivity of the 19C grids is complete, and most grids have 100's if not thousands of direct connections. It's pretty easy to get around.

Planet LCT-18, on the other hand, is over 80% ocean, with the grids scattered across a sprawling archipelago of islands, some large, some barely there at all. There is no boat technology here, the inhabitants completely isolated. Road connections are barely sufficient to connect the grids on these islands. Rumour has it that there is a small continent, which might have 50% or 60% of the population. But nobody really knows ..

Ok, we get it! Most grids don't have any 18C, those that do don't have many, and have limited direct {-1,+1} connections.

Some rough sample data for 18C puzzle counts (blue made an Enum18C for us). Two sets of 100 pseudo-random grids, all having an 18C:
Code: Select all
   n18C     Grids
   ---------------
     1     41   27
     2     19   24
     3     15   13
     4      8    7
     5      3   11
     6      4    4
     7      1    3
     8      -    1
     9      1    1
    10      1    1
    11      1    1
    12      2    -
    13      1    2
    14      -    2
    15      -    1
    20      1    1
    21      -    1
    43      1    -
    49      1    -
   ---------------
          100  100


I did a Gen19C using one seed at a time, which gives us some sort of vague idea of the "island" distribution. ~130 seeds were tested individually. In a fractal kind of way, some converged, others did not. I used a bailout initially of 16K grids, but that's too slow, and got the following results with a 5K bailout:

Code: Select all
Island sizes for individual seeds:
    Size    Count
  ---------------
       1      27
       2      11
       3      15
  4 -  9      14
 10 - 50       9
     137       1
     971       1
     BIG      56 (> 5000)

coloin will be pleased to see so many "untouchables", I trust?
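The island-size measurement can be sketched as a breadth-first walk with a bailout. This is a generic illustration, not the Gen19C code: `neighbours` stands in for whatever produces the {-1,+1} connections of a grid, and counting the seed itself in the size is an assumption.

```python
from collections import deque


def island_size(seed, neighbours, bailout=5000):
    """Number of grids reachable from seed, or None ("BIG") past the bailout.

    neighbours: hypothetical callable returning the grids directly
    connected to a given grid.
    """
    seen = {seed}
    frontier = deque([seed])
    while frontier:
        grid = frontier.popleft()
        for nxt in neighbours(grid):
            if nxt not in seen:
                seen.add(nxt)
                if len(seen) > bailout:
                    return None  # too big to enumerate; report as "BIG"
                frontier.append(nxt)
    return len(seen)


# Toy connection graph standing in for real grids.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "x": []}

print(island_size("a", lambda g: toy[g]))             # island of three grids
print(island_size("x", lambda g: toy[g]))             # isolated seed
print(island_size("a", lambda g: toy[g], bailout=2))  # bails out
```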

The raw data and results for each seed are here:
The Gulag: Show
Code: Select all
1.............9.2.78..3......4..5.9....81.......2...5...2...6....5............3.8 #    3
...4..78..56...................7.....61.....5..48...3.3.....84.....6..1.....9.... #    2
12....7......89...7........2..........4....98....4..5...85......6.....4......12.. #      Big
1...5.......7....3.......6..37......6...1........2.4.......491..683.............5 #      Big
.2...67..........3..91............916....7....4..2..5....5...1.....9.....7....4.. #      Big
1....6......7...2..8....5....794....5.......8...2.......2........45..........38.1 #    2
...4...8...6......7...3.........5..6..58...........3.7..4.6..12....71.....8...... #    5
.....67......8......91......4....6.8...........12......7.9...1286..4...........3. #    2
1......89.............32......8....5.37...2...4....6.............4.7.3..8..9....1 #      Big
..3....8....7......8.....642.....9......6..4...75..3..5.....2.7.............48... #    1
1....6....5.....2.........4....1.9....8..73....45.....3.....6.5...24........7.... #      Big
....5...9...7......89.....4.6....35..1...8.......2..7.5.....2....4...........4..6 #      Big
12.......4..7.9..3...........83....7.........9...2..1......1.4...58..3....7...... #      Big
.........4...8...3..91...6..6.........1...4.......3..8......91.8...75..2.3....... #    1
12....7......89..3....3.....7..1..........4...6...52..3.......8.4..........2.4... #    1
..34............2.7......6.......41...85..3......7.....1.....76...3.8....6.2..... #    1
.2.4...........1.3......56...59...4.3.1..5...6...............9.......2......138.. #    5
1.....7.9.5............2..4...3.5........4...9.......6....7....6...9..1...2...35. #   38
..34...8........2.7..........8....4....2.19..6....7.....49......1....3.6........7 #      Big
.2....7..4....91...8......4.......5....2....89.1.........6........82..3...4...9.. #      Big
..3...7...5..........1...6.2...7...1........8....43....4.8.....6..9.5..2.7....... #    3
.23........67..........15.4............1...3.8....4....4...8...5.......7...3..61. #    1
12....7..4...89.........5....5.....6....1..9...73.........7.......5.3...9......4. #      Big
1............89...7......64.......1...8.429.............2...8..6..5........1..47. #   46
..34....9....8....7......6.....6..1............4.2.8...9....4....2...3..6...15... #    1
1......89.......2.7...31......9....76...4..........3....28......97..........6..5. #    9
...4..78.45...9..........6....8..31..........9....5.............186......7...4..5 #    3
1.....78.4.6..9...............64.......8....2.39....1.........6.7....9...8.5..... #      Big
.2...6.8....7.....7..1....5...5....7.4........3..2..........6..5.1......9.....23. #      Big
.2.4.6............7.8..........98.3..4.........7....9.5...73...........29.....6.4 #   15
1.3..6......78..2................3...7.........5..34.1..........8..9..7....5.16.. #      Big
............78...3.9....54.2....4............867.....2...6....7.4...5...9......1. #      Big
....5...9...7......98.....62........5.............4.18.6...1.......9..7...1...35. #    1
1.3........6....2......254......5...6.......79....4....4.....9....69.8.....3..... #    1
...4.......6....2..9.13.....3....4.15.9..7.........2....1......8....5.7..4....... #      Big
............7...23.98......2......7.6..9...5......4....81...4.....3..8.....26.... #    7
...45..........1.3........6.3...7.....5...8......6..9......1....6.2.3...9.4....5. #    1
...4..7.9.5.7.............6.3..6..5..1.2...3.....9......2.........3.....9.7..4... #    4
..3.....94...8.1..................35.........86..1........4.6...1.9.......52.3..7 #    1
...4...8......9...7.....5.6.453......1...........2...7....6...2.3........84..7... #      Big
....56...4......23...1...4....8...978.......4.6..2.....1....6..............9....8 #    3
....5..........1.379........4.9.8.......2..7..3......5.........5.2...9....13.4... #    1
...45...9..6.8....7.....5...4..1..........3.........753...27..........648........ #    1
..3.....9....8.1..7....2.....49.7..5..........81...6..3....5..7.1...............2 #      Big
......7.94.6.8......8...5...7...5..........6....3.72....42.........6..5.......3.. #    3
1...56....5.....23.......4.....1.9....5.......3..........2.8..7...3.....9.7...6.. #      Big
..3..6....5.7...2.........4.14...6.8......3..8...2........48........1.....7....3. #    1
.2......9...7......9.1....4....6......5........78...1...1....5.....2.....6..943.. #      Big
.....6...4......2.7.......42...4..9..3...8.........3...81.....6.6...39......7.... #    3
..3.....9....8.1..7....2...21..4..7.............6...35.8....9.....3........5.1... #    1
1...5.......7....3.9.....6....9...........41..6.3.....5...4.....3...8..7....1...6 #      Big
..3......4......2....1...6..1...8....6.5...1......43......2..9..7.9.....8.....4.. #      Big
.....6....5.7......9......42...48...3.....9.......1.5....92....8.......1..2....7. #    1
..34.6.....67...2.......5..2.......75...1.....8..9........2......73....69........ #    4
1.3.56......7...2...8..........18........3....7........6....9.1.4.2...7.........8 #    3
.2..5....4.......3...1..5....5..........24.7.8..............94......38....167.... #    9
............7...23.98......2.....4...6....9....53.....3.2.....5....9.6......4...7 #      Big
.......894........7...32.....95........8...........37.3...78.....59....6....4.... #    2
1...5......6.....3........42.....95......8.1......4....3......7.84........762.... #      Big
........9....8.1..7...32...2..9...5.38..........5....6..96.........1.3....4...... #      Big
...4....9.5..8....7...3......9........4...6......7..3...29..4.......1....8...5.7. #      Big
12.......4....9.........56...9...3......6.......21...7.......1...5.4...2..7..5... #    2
..34.........8....7......6.21.9..........5.3.8.....4.....5.7.........8..9....1..2 #    1
.23...7.......9...........4....6.....8.37....9......51.........5.1....9..6..2.3.. #    2
.2...6...45.....2.....1........4.8........9.1.75...........2.56..1......9...7.... #    3
.2..........7....3..8.1.......5....65..9......4....2.83......95..........1..24... #      Big
......7...5...9....9...3....3.....98..4......6...2...1..267...........3...7.4.... #   27
1...5......6....2.....1..64.....4.........9..9.....37..62.....5...89.....4....... #  137
....5...94........7.8...........8..5....9...6.1...4......1.....8.....47...2.6.3.. #    3
.2...6.........1........5....5...91...7.3........28.....91....6.8..67..2......... #      Big
.2...6........91........5........94..4..7...8...3.5.......2...75........9.1.....6 #   11
...4...8...67......9....5..24.......8............61.7.......2.7.8..9...6....3.... #    1
...4.6.8..........7.....5..2....7.......9.....3.1...4...9..52.7..........143..... #      Big
...45......6...1..7.8.......45.....7.....2..8.3........1.....4......7.3.....38... #    2
.23...7...5..............6..........6.95.....8....72.1...6...9......2....7..41... #      Big
1.......9...7...2...8..3...........5.3.92..........6.......584..7....3..9...6.... #   52
...4...8......9.2.7..........18....7......6...841.....5...2....6...7.....3.....4. #      Big
...4...8...6......7.....5.....3.......7.....6.3.1.8.....2.7.....8....41.......93. #   53
1.34..........9.2........6..7.....9...48.......51........5....1........8.69.7.... #      Big
.....67.94...8..........5..2....5...38.....4......7.1.......83...9.......17...... #      Big
12.....8....7.9..........6..84...........29...6......2...........7.6..4.9..3.5... #    8
.234...8....7.9.....8.......1......6..432.....3.............24........3.9....5... #    1
123.....................654...56...........4.8.7.....1.....1......3.7...96....2.. #    3
1........45......3...2..6.......5....67...9.....8.1...3....4..1...6....7..9...... #      Big
...4......5......3...2..6..2.4......6..9..........1..7.....7..5..9...4..81...5... #      Big
..3....8......912.7..2.....2....8...6....4.......3...5...67......9...3..........7 #   22
..3.56.........1.....2....424......1......37.6............9....81......2....35... #    3
.2....7...5...9.......3..6....5.8...6......1......2...3.9.1.......6..8...4......5 #      Big
....5...9..67..1...98......2..6.......48.......7.....253..2.......1..8........... #      Big
...4.......678..........5...17.....6.8...2........53..5...23..............4....17 #      Big
.23.........78.....9.........75...........2...6...24....18....7.4...36..........5 #    6
..34.........8.1........5..2...6...............43.7....6.1...478...2..9........3. #      Big
.......8.4....9...7..2...6..3..7.4.....61..........9...6..4.........3.....2.6..1. #      Big
1......8....7.9......2....4........76..5.....84.....1...7.4........6.83...2...... #    9
.....6...4......2....2.15......4.9...8..9.....1.....5......8..6..2.....19...7.... #      Big
.2..........7....3.9...1.......6........249..8.5....7.3.7.....55...9.6........... #    3
...4.67...5.....2.....3..........8..37....6.5..91.........9..1.68..............4. #    2
..3.5.7........1.....2..5..2.......6....75...8...4.......9....8.71........4.....2 #      Big
..34.....4......2......1.6.....6...............5.2.3.7...8.74..6.9.......1.....7. #    1
...4..7.9.5..8............42............1....9.4....5..179......8............5.32 #    1
......789......1.....23....2..5.......5.....6.7...8.............8..17...6.2.....5 #      Big
.2......9...7..1.......15....7...3......6...8.4..2....5...9...2..15.............6 #      Big
..3...7......8.1.....2.....2......5.....7.69..8...3....7..........1.....9..5.2..6 #    1
1......8......9.2.7............6...73.5.....1.89..2...........6.4..1......25..... #    1
.....6.8.45........9........1....9.......8......3.25..3..9.......2.1......7...46. #    3
....5.7..4.6........8.3.....1..........8....4.7....3..53..7.........2..8....1...6 #      Big
..3.....9....8.1..7.8.3.....1....4......4.....6.....7..4.1.5..........32...9..... #      Big
........9....8.1..7..23.....1....83...5.........6...7.5.2......6.............83.1 #    2
.2..56...4.....1.3............1..9...6...........2..7..7.....6...134.......8...5. #      Big
.......8.4.6..9...7........2.......7.3.5....2...36.....1...........74....8....51. #    9
....567............98....4..4.9.....6.....5.....8.2...3...7..6...21....8......... #      Big
...45.7....67......98.........1.....5.....3....7..82..31.....................2.68 #   49
.....6.8.4......2.79..3........9...43.5....6.....7......28......1.............4.7 #    1
1....6..9.5.7...............743.........9...2..5.......8....57....1...3.6...2.... #      Big
.2......9...7....3.....1.....1.9...75....2.......4.....4.1...6..6....2.....3..5.. #    1
12...6................3..45..41...9.5............7.2...72...8.....5......6...8... #    2
1.......945...........3.......8.......9....316....5.7.....6......2.9.....8....5.4 #      Big
1.....7...5......3....3.6.......8........4.587...6......4...9...3...5......8...1. #    1
1......8......9.2..9..31........73...4.5.......8............9.7.......5...286.... #      Big
.2....7.94.......3.....1.....5.6....7..92...........6.3...9...........1...1..8.5. #      Big
........945.7.....8.........1.....7....5..4....9.....2..2.43.......9...1....2..5. #    1
..34...........12.8.....6..2...1.8......9.......3.......9.68...........46...2..5. #    8
1..........678..3...9...5....7....9..3...5.......124.........7.5....1.......4.... #      Big
........9..67...32.8........1....4....72........56................3...2784...1... #  971
12...........8......9...5.6.....16....4...8...9.2.7..........7....3.2.1...8...... #    6
....5...94...8......91....6..........619....7......85....6.72..5.....4........... #      Big
1...5..8.4....9........35.6.37.........81..9..............4..1...........6...7..3 #      Big
..3.....94...8....7...2....2.8...4..............5....3....4.62.......8...159..... #    1
....56...4...8....7.9....4..4....8.......26....1....2....9...1..6..4......2...... #    6
..3.....9...7.9..2.8.......2....1.........86..6..4.3......3.4.......4...9.......7 #      Big
1.......9....8.....8..23.............7....82.9.5.......3....4.16....2......9....5 #    3
......7.9.567......8..........6.4.5.3........91......8.............91..3..2...6.. #      Big


Two questions to which I currently do not have answers:

  • why are the ED/Raw yields so low? In some sets there were nearly 90% grid duplicates.

  • The New/ED yields are also anomalous - the first looks normal (there were barely any to start with), but to immediately plunge to 24% from 94% seems a little premature! :?

As you can see, I am looking at these for the first time myself basically ... still have this ^%%$%^$ flu-thing-bug going on. Maybe it's obvious to someone else .. sure hope so!

(I have also done a quick health check on some individual batch files in each set - all look kosher, no grid duplicates).

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Sat Oct 19, 2019 4:25 pm

Ok, I will do that, Mladen, thanks. 8-)

I'm sorry to have to ask, but what does "twin" mean here? :oops:

[Update]
Ok I do have a couple of thousand 17C grids in each batch, they do turn up in the morph process, obviously.

But in pretty small numbers ... the update process ignores them ultimately when it checks the catalog entry and discovers that it's a known 17. So they certainly have a yield-reducing effect but a very small one.

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Sun Oct 20, 2019 2:08 am

Rebooting ... 64kb Ram available
dobrichev wrote:I would suggest checking for "twin" puzzles and for minimality, if not already done.


Like, for example, producing 18s by way of 19s which have a redundant clue? :)

I knew that the 19C workers would in fact turn up some 18C's this way, as indeed they did. While I was waiting for blue's Find18C to arrive, I found that I could get some 18C seeds for test purposes, very quickly, by simply looking in the LCT catalog, checking random entries until I found a 19C with a redundant clue. The success rate was, I believe, about 1 in 2000.

But perhaps I have failed to exploit this resource. :(

I really do need to look further into this. How many valuable 18C's has Gen19C (the batch producer) actually been throwing away? :?
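The redundant-clue check is simple to state in code. In this sketch `has_unique_solution` is a placeholder for a real solver call (none is included here), and puzzles are assumed to be strings with "." for empty cells, as in the seed list earlier in the thread.

```python
def redundant_clue_reductions(puzzle, has_unique_solution):
    """Yield each puzzle obtained by deleting one clue that stays uniquely solvable.

    has_unique_solution: hypothetical callable wrapping a solver; it must
    return True when the given puzzle string has exactly one solution.
    """
    for i, ch in enumerate(puzzle):
        if ch != ".":
            reduced = puzzle[:i] + "." + puzzle[i + 1:]
            if has_unique_solution(reduced):
                yield reduced


# Toy stand-in for a solver, for demonstration only:
# pretend a "puzzle" is valid whenever its first cell is still given.
def toy_oracle(p):
    return p[0] != "."


print(list(redundant_clue_reductions("12.3", toy_oracle)))
```

Applied to a 19C puzzle with a real solver behind `has_unique_solution`, any result is an 18C puzzle on the same grid.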

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Sun Oct 20, 2019 3:06 am

dobrichev wrote:"Twins" are puzzles with same exterior (cell values for the initially non-given cells) but different interior (some givens are changed). They are easy to produce by a single solver call. For small number of givens they are not very frequent but make a bridge to a valid puzzle that differs from the original by several values at once.

Thanks, Mladen, this is also an interesting idea ... 8-)

Re: LCT-18 (more)

Postby Mathimagics » Mon Oct 21, 2019 12:44 pm

1to9only wrote:Can this PC be setup with an FTP server to allow results to be 'pushed' to you, rather than you currently 'pulling' from various locations (with Edge issues!)?

Yes, it probably could, but it won't be necessary - there is no longer any need for any further external batch generation (Gen19C). The switchover to explicit grid testing is happening any day now - and my new PC (aka "Jack" the ThreadRipper) is on the way (or so I am led to believe).

Your FTP setup was an effective and simple-to-use option that served its purpose quite well, I think. Once I got the hang of it, "pulling" your batch files was really no effort at all. Thank you!

Cheers,
MM

Re: Low/Hi Clue Thresholds

Postby Mathimagics » Tue Oct 22, 2019 1:35 am

No, you are right, it certainly looks like "we don't need you any more, buddy!" :?

I had removed a lengthy section that outlined the plans for completing LCT-19, and instead PM-ed that section directly to 1to9only.

The remainder was what got posted above. I hadn't realised that this abridged version was quite so, um, brutal ... :lol:

LCT-19 Progress

Postby Mathimagics » Wed Oct 23, 2019 4:54 am

Well, what a clusterf*ck Week 10 turned out to be!

For a variety of reasons, we only processed 4 sets this week:

Code: Select all
       Date  Grids   New   Yield
  ------------------------------
Week 9:
  1  10 Oct    800    35    0.0433
  2  11 Oct    784    30    0.0387
  3  12 Oct    784    30    0.0387
  4  13 Oct    736    24.5  0.0333
  5  14 Oct    641    21.6  0.0337 
  6  15 Oct    630    18.9  0.0301
Week 10:
  1  20 Oct    300    10.2  0.0338 
  2  21 Oct    395    13    0.0328 
  3  22 Oct    766    21.4  0.0279 
  4  23 Oct    522    14.5  0.0279


We resolved just 59,112,156 x 20C grids all up.

The ED grid counts had started to decline towards the end of Week 9, and it was only when the first of the Week 10 sets was processed that it was clear something was going wrong. That turned out to be the result of a change I made to seed generation, a change that only affected my copy of Gen19C, not Colin's. I'll just say that I broke something that didn't need fixing. :oops:

The 3rd set results look completely normal because that set had only Colin's contributions in it. The first 2 were, obviously, only my batches. The 4th set was a blend of my batches, some produced before I fixed the problem, some produced after. Hopefully normal service has been resumed at my end!

The unresolved grid count is now 676,224,216.

My new PC, Jack (the Ripper) has not yet arrived (that has also been a clusterf*ck), but all is now well, it's ready to rip, and delivery is expected tomorrow.

All being well, next week's report should have a lot more sets!

LCT-19 Explicit Grid Testing

Postby Mathimagics » Sun Oct 27, 2019 8:09 am

The end is in sight!

LCT-19 Completion (Explicit Grid Testing)

An executive decision has been made to cease batch production forthwith and proceed with explicit grid testing.

coloin has been instructed to kill off his workforce, and to deliver his final batches over the next couple of days.

The unresolved 20C grid pool is expected to be ~600 million grids, with an estimated completion time for explicit testing of ~280 core-days. That should take "Jack the Ripper" around 18 days to complete, and perhaps less if I can use hyper-threading effectively. Gen19C used 5GB per copy, so I was limited to 12, but that won't be a problem here - perhaps I can get an effective 20-core throughput by doubling up on half the cores. We shall see ...

The processes involved here are much simpler than they were for the morphing stage (batch production), thank heavens! :?

  • CreateWorkUnits: this is just a pre-processing function that extracts all the unresolved 20C grids and organises them into "work units", each consisting of 100,000 grids. One unit takes about 60-75 minutes for a worker to process
  • Test19X: this is the worker process, described below
  • Update19X: this updates the LCT catalog when all of the work units for a single band file are completed

There will be 3 folders: jobs (i.e. work units) pending, jobs in progress, and jobs completed. Each Test19X worker selects the next pending job, loads the grid list into memory, then starts crunching it, testing each grid with blue's Find19C function.

The job state is recorded (the memory table is saved) every 5 minutes, giving us a good restart recovery ability. On completion the final version is saved in jobs completed, and the job removed from the pending folder. Every 20C grid will be changed to either a 19C entry, or marked as known "No 19C".
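A minimal sketch of that job cycle, in Python rather than the actual Test19X code. The folder names, the JSON job format and the `test_grid` callback (a stand-in for blue's Find19C) are all assumptions of mine. The sketch also moves a job out of pending at the moment it is claimed, which is one way to keep two workers off the same job; the real claiming mechanism isn't spelled out here.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Callable, Optional

CHECKPOINT_SECONDS = 300  # "every 5 minutes"
FOLDERS = ("pending", "in_progress", "completed")  # assumed folder names


def save_state(path: Path, state: dict) -> None:
    """Write via a temp file so a crash never leaves a half-written job."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, path)


def claim_next_job(root: Path) -> Optional[Path]:
    """Move the first pending job into in_progress; return its new path."""
    for job in sorted((root / "pending").glob("*.json")):
        target = root / "in_progress" / job.name
        try:
            os.rename(job, target)  # atomic within one filesystem
        except FileNotFoundError:
            continue  # another worker claimed it first
        return target
    return None


def run_job(job: Path, root: Path, test_grid: Callable[[str], bool],
            checkpoint_seconds: float = CHECKPOINT_SECONDS) -> Path:
    """Test the remaining grids of a job, saving state periodically."""
    state = json.loads(job.read_text())
    last_save = time.monotonic()
    # Skipping grids that already have a result is what gives restart recovery.
    for grid in state["grids"][len(state["results"]):]:
        state["results"].append(bool(test_grid(grid)))  # True = 19C, False = "No 19C"
        if time.monotonic() - last_save >= checkpoint_seconds:
            save_state(job, state)
            last_save = time.monotonic()
    done = root / "completed" / job.name
    save_state(done, state)
    job.unlink()
    return done


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        for name in FOLDERS:
            (root / name).mkdir()
        save_state(root / "pending" / "b033_001.json",
                   {"grids": ["g1", "g2", "g3"], "results": []})
        job = claim_next_job(root)
        done = run_job(job, root, test_grid=lambda g: g != "g2")  # dummy test
        print(json.loads(done.read_text())["results"])
```

Because the results list is saved alongside the grid list, a restarted worker only has to pick up the file in the in-progress folder and carry on from the first untested grid.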

All pretty straight-forward. The job folders will fit on the same super-fast SSD drive where the catalog itself is located, so IO will definitely not be a drag.

That's the plan! 8-)

LCT-19X Progress

Postby Mathimagics » Tue Oct 29, 2019 12:18 am

LCT-19X

Ok, we are up and running. Our unresolved 20C grid count at the start of this completion exercise is 626,731,783. This translates to roughly 280 core-days.

There are 6400 jobs to process, each job representing a work-unit of 100,000 grids to test. The division is done on a band basis, so for each of the band file indices 1 to 255, there is at least one job with fewer than 100,000 grids. (Band files 250-255 are composite: they span multiple ED band indices.)

Despite the fact that our morphing process eventually resolved ~88% of all the 20C grids that we started with, it is clear that some bands were significantly harder to hit. Here is a small sample of some of the larger bands, from the report produced by CreateWorkUnits:

Code: Select all
03:33:35: Processed  b006: ng =  96,229,043, n20 =   2,106,353, jobs =  22
03:34:32: Processed  b020: ng =  88,782,527, n20 =   1,554,894, jobs =  16
03:34:48: Processed  b022: ng =  85,627,560, n20 =   1,476,242, jobs =  15
03:35:03: Processed  b024: ng =  85,102,374, n20 =   1,553,046, jobs =  16
03:35:57: Processed  b032: ng =  40,697,708, n20 =   7,903,756, jobs =  80
03:37:40: Processed  b033: ng =  80,468,664, n20 =  20,258,707, jobs = 203
03:38:27: Processed  b034: ng =  79,175,611, n20 =   9,110,207, jobs =  92
03:39:21: Processed  b035: ng =  77,979,784, n20 =  10,512,282, jobs = 106


Band 6, the largest of all, was 97% resolved, but band 33 (the largest # of jobs) was only 75% resolved. One suspects that this is not down to sheer chance, but that band 33 probably has fewer 19C puzzles on average, and is highly likely to yield significant numbers of "No 19C" cases.
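The job counts and resolved percentages can be reproduced from the report figures with a few lines of Python. This is only a sketch: it reads ng as the number of grids in the band and n20 as the unresolved 20C count, and takes "resolved" to mean 1 - n20/ng, which is my interpretation of the columns.

```python
UNIT_SIZE = 100_000  # grids per work unit

# (ng, n20, jobs as reported by CreateWorkUnits)
BANDS = {
    "b006": (96_229_043, 2_106_353, 22),
    "b020": (88_782_527, 1_554_894, 16),
    "b022": (85_627_560, 1_476_242, 15),
    "b024": (85_102_374, 1_553_046, 16),
    "b032": (40_697_708, 7_903_756, 80),
    "b033": (80_468_664, 20_258_707, 203),
    "b034": (79_175_611, 9_110_207, 92),
    "b035": (77_979_784, 10_512_282, 106),
}


def jobs_needed(n20: int, unit_size: int = UNIT_SIZE) -> int:
    """Work units needed for n20 grids; the last unit may be partly filled."""
    return (n20 + unit_size - 1) // unit_size


def resolved_pct(ng: int, n20: int) -> float:
    """Share of the band's grids that no longer need testing, in percent."""
    return 100 * (1 - n20 / ng)


if __name__ == "__main__":
    for band, (ng, n20, reported) in BANDS.items():
        print(f"{band}: jobs = {jobs_needed(n20):3d} (reported {reported:3d}), "
              f"resolved = {resolved_pct(ng, n20):.1f}%")
```

The rounding up in `jobs_needed` is why every band file ends with one job that holds fewer than 100,000 grids.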

There is very likely to be a strong correlation between the % of unresolved grids in a band, and the average cost of testing grids in that band.

We have data for the first full day of processing now, and we have tested 35 million grids. We are using hyper-threading, running 24 workers on 16 cores, and the net effect appears to be close to a 20-core system. With the 280 core-day estimate that would translate to roughly 14 days for the whole job.

Some snapshots of progress for the first 24 hours:

Code: Select all
08:40:25: Start
13:42:05: Jobs complete =   69, i/p =  16, NGT =   6979931, NF =   9763  (0.140%)   5h
14:41:06: Jobs complete =   88, i/p =  20, NGT =   8815525, NF =  10425  (0.118%)   6h
16:45:55: Jobs complete =  124, i/p =  24, NGT =  12581337, NF =  12388  (0.098%)   8h
17:42:00: Jobs complete =  142, i/p =  24, NGT =  14273465, NF =  13378  (0.094%)
18:41:04: Jobs complete =  162, i/p =  24, NGT =  16072711, NF =  14560  (0.091%)
20:40:38: Jobs complete =  193, i/p =  24, NGT =  19820132, NF =  16294  (0.082%)  12h
22:50:51: Jobs complete =  243, i/p =  24, NGT =  23585231, NF =  17564  (0.074%)
00:52:35: Jobs complete =  267, i/p =  24, NGT =  26091727, NF =  19999  (0.077%)  16h
06:40:04: Jobs complete =  336, i/p =  24, NGT =  33198587, NF =  30312  (0.091%) 
08:42:08: Jobs complete =  359, i/p =  24, NGT =  35310171, NF =  37053  (0.105%)  24h

NGT is the total # of grids tested, NF the total "failures" (No 19C).

As you can see, we might have done better had we started with 24 workers, but I wanted to observe Jack's performance/running temperature, etc just to be sure everything was shipshape (and it was! Jack remained cool and whisper-quiet all the way!)

The % of fails drops steadily for some time, then kicks up again, and this corresponds to the point at which we started on bands 32 and 33. So no surprise there - if you look closely you will see there is a corresponding decrease in overall grid-test rates.
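The cumulative percentages in the log smooth this out, so here is a sketch that works out the per-interval figures instead: grids tested per hour and the share of "No 19C" results between consecutive snapshots. The data is copied from the snapshot log above; the helper names are my own.

```python
# (time of day, NGT, NF) copied from the progress log
SNAPSHOTS = [
    ("08:40:25", 0, 0),
    ("13:42:05", 6_979_931, 9_763),
    ("14:41:06", 8_815_525, 10_425),
    ("16:45:55", 12_581_337, 12_388),
    ("17:42:00", 14_273_465, 13_378),
    ("18:41:04", 16_072_711, 14_560),
    ("20:40:38", 19_820_132, 16_294),
    ("22:50:51", 23_585_231, 17_564),
    ("00:52:35", 26_091_727, 19_999),
    ("06:40:04", 33_198_587, 30_312),
    ("08:42:08", 35_310_171, 37_053),
]


def to_seconds(hms: str) -> int:
    """Convert HH:MM:SS to seconds since midnight."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def interval_stats(snapshots):
    """Return (end time, grids per hour, % 'No 19C') per consecutive pair."""
    rows = []
    for (t0, g0, f0), (t1, g1, f1) in zip(snapshots, snapshots[1:]):
        # Modulo one day copes with the interval that crosses midnight.
        elapsed = (to_seconds(t1) - to_seconds(t0)) % 86_400
        grids = g1 - g0
        rows.append((t1, grids / elapsed * 3600, 100 * (f1 - f0) / grids))
    return rows


if __name__ == "__main__":
    for end, rate, fail_pct in interval_stats(SNAPSHOTS):
        print(f"{end}  {rate:12,.0f} grids/h  {fail_pct:.3f}% No 19C")
```

Run on these snapshots, the last interval shows both the lowest hourly rate of the fully staffed period and a "No 19C" share several times higher than during the evening.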

And this translates to varying "total job time" estimates, all in excess of the theoretical 14 days: 16 days (based on the first 12-hour period), 20 (the second 12 hours), and 18 for the full 24-hour period.
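Those three estimates are simple extrapolations of the observed test rate to the full pool of 626,731,783 grids. A quick sketch, using the NGT figures logged at roughly the 12-hour and 24-hour marks (the function name is just for illustration):

```python
TOTAL_GRIDS = 626_731_783      # unresolved 20C grids at the start
NGT_12H = 19_820_132           # grids tested after ~12 hours
NGT_24H = 35_310_171           # grids tested after ~24 hours


def projected_days(total: int, grids_tested: int, hours: float) -> float:
    """Days for the whole job if the rate seen in this window persisted."""
    grids_per_day = grids_tested / hours * 24
    return total / grids_per_day


if __name__ == "__main__":
    windows = {
        "first 12 hours": (NGT_12H, 12),
        "second 12 hours": (NGT_24H - NGT_12H, 12),
        "full 24 hours": (NGT_24H, 24),
    }
    for label, (grids, hours) in windows.items():
        print(f"{label:16s}: {projected_days(TOTAL_GRIDS, grids, hours):.1f} days")
```

Note that the first window includes the hours spent ramping up to 24 workers, so it understates the early rate somewhat.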