Some data in line with the previous post.
I have long intended to run a test on the “loki family”, to set the search parameters to reasonable values:
. getting a runtime closer to one second per target if possible,
. giving a good chance of catching another family, if any.
The “loki family” in my copy of mith’s file covers 67539 solution grids.
67539 / 5472730538 is 0.00123% of the solution grids.
From Blue’s work, we know that 1/3 of the solution grids can give non-degenerate tridagons.
Most of them are easy to solve, but we have a good chance of finding other families of interest.
If we want to find these families through a direct scan of the solution grids, we must process all possible starts in 1/3 of the solution grids, likely around 2.7 billion starts.
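To make the scale concrete, here is a small sketch of the arithmetic, using only the counts quoted above. The variable names are mine, and the one-third share is taken as an exact fraction purely for illustration.

```python
# Counts quoted in the post.
ALL_GRIDS = 5_472_730_538   # total number of solution grids used above
LOKI_GRIDS = 67_539         # solution grids covered by the loki family

# Share of all solution grids covered by the loki family, in percent.
loki_share_pct = 100 * LOKI_GRIDS / ALL_GRIDS

# Grids that can give a non-degenerate tridagon (1/3, taken as exact here).
tridagon_grids = ALL_GRIDS // 3

print(f"loki share: {loki_share_pct:.5f}%")
print(f"grids to scan: about {tridagon_grids / 1e9:.2f} billion")
```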
From my 67539 solution grids, I got 103782 possible starts in 54783 solution grids.
Blue has seen 47963 solution grids with the loki start.
There are at least 2 possible reasons for the deviation of 6820:
. I am not sure that we have the same file,
. we know that many puzzles in mith’s file don’t have the “loki tridagon” in the final solution grid.
One example is given here:
http://forum.enjoysudoku.com/tridagon-t45572-30.html
The fact that one solution grid has no loki tridagon cannot exclude another start in the same grid.
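For reference, the counts above can be checked with a few lines. This is only a sketch on the figures quoted in the post; the variable names are mine.

```python
# Counts quoted in the post.
FAMILY_GRIDS = 67_539       # solution grids in the loki family file
GRIDS_WITH_START = 54_783   # grids where at least one start was found
STARTS = 103_782            # possible starts found in those grids
BLUE_GRIDS = 47_963         # grids Blue has seen with the loki start

deviation = GRIDS_WITH_START - BLUE_GRIDS
starts_per_grid = STARTS / GRIDS_WITH_START
coverage = GRIDS_WITH_START / FAMILY_GRIDS

print(f"deviation: {deviation} grids")
print(f"starts per grid with a start: {starts_per_grid:.2f}")
print(f"family grids with a start: {coverage:.1%}")
```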
I did several tests on these 103782 starts, working on chunks of 8000 starts.
The main test was to run the code with the finder looking for puzzles of size <=26.
The average per start was around:
. 3.2 seconds of runtime,
. 17000 puzzles to rate.
Keeping in mind that the proof that no 16 exists took about 3s per solution grid, this is feasible but requires huge computing power.
And here we still have to rate the puzzles, which more or less doubles the runtime even with filters.
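A rough projection of what “huge power” means, as a sketch only: it assumes that the loki family averages (3.2 s and 17000 puzzles per start) carry over to every start, that the work runs on a single core, and that the rating step is not counted. None of this is measured outside the loki family.

```python
# Figures quoted in the post; the projection itself is an assumption.
STARTS_TO_SCAN = 2.7e9        # estimated number of starts to process
SECONDS_PER_START = 3.2       # average finder runtime per start (loki test)
PUZZLES_PER_START = 17_000    # average finder output per start (loki test)

SECONDS_PER_YEAR = 365 * 24 * 3600

# Finder runtime only, on one core, before any rating.
core_years = STARTS_TO_SCAN * SECONDS_PER_START / SECONDS_PER_YEAR

# Same projection at the target of one second per start.
core_years_at_1s = STARTS_TO_SCAN * 1.0 / SECONDS_PER_YEAR

# Raw finder output before any cleaning.
puzzles_out = STARTS_TO_SCAN * PUZZLES_PER_START

print(f"finder runtime: about {core_years:.0f} core-years")
print(f"at 1 s per start: about {core_years_at_1s:.0f} core-years")
print(f"puzzles before cleaning: about {puzzles_out:.2e}")
```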
In the main test, the next steps were to remove from the finder’s output the puzzles having no chance to reach the high ratings expected in the loki family.
The cleaning is done on 2 parameters:
. As mith’s file is T&E(3), all T&E(1) puzzles are discarded.
. If we expand puzzles on easy moves (singles + ???), the chances of getting a high rating decrease sharply when the number of clues grows. It seems that >31 clues can be ignored without great risk of missing important puzzles.
And some redundancy can be removed, saving rating steps.
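The two cleaning parameters could be sketched as below. This is not the actual finder code: `te_depth` is a hypothetical callable standing in for whatever computes the T&E depth of a puzzle, and the puzzles are assumed to be already expanded on easy moves before the clue count is taken.

```python
from typing import Callable, Iterable, Iterator

MAX_CLUES = 31  # cutoff quoted in the post, applied to the expanded puzzle


def clue_count(puzzle: str) -> int:
    """Number of given digits in a puzzle string ('.' marks an empty cell)."""
    return sum(ch in "123456789" for ch in puzzle)


def clean(puzzles: Iterable[str],
          te_depth: Callable[[str], int],
          max_clues: int = MAX_CLUES) -> Iterator[str]:
    """Yield only the puzzles worth sending to the rating step."""
    for puzzle in puzzles:
        # Cheap test first: too many clues after expansion.
        if clue_count(puzzle) > max_clues:
            continue
        # Hypothetical depth test: drop anything solved at T&E(1) or below.
        if te_depth(puzzle) <= 1:
            continue
        yield puzzle
```

The clue count is tested first because it is far cheaper than any depth computation.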
In the test, we end up with 548191 puzzles to rate, a small number of them being subsets of another puzzle, as here:
974..........2.9.....6.9......56..12...2.18.9....9865..1..52.....58.6...6..91.58. ED=10.2/10.2/2.6
974..........2.9.....6.9......56..12...2.18.9....9865..1..52.....58.6..16..91.58. ED=9.4/9.2/2.6
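The subset relation between these two puzzles can be checked cell by cell. A minimal sketch, assuming puzzles are given as strings with '.' for empty cells; the function names are mine.

```python
def is_subset(small: str, big: str) -> bool:
    """True if every clue of `small` is also a clue of `big` in the same cell."""
    return len(small) == len(big) and all(
        s == "." or s == b for s, b in zip(small, big)
    )


def clue_count(puzzle: str) -> int:
    """Number of given digits in a puzzle string."""
    return sum(ch in "123456789" for ch in puzzle)


# The two puzzles quoted above (ED=10.2/10.2/2.6 and ED=9.4/9.2/2.6).
p1 = "974..........2.9.....6.9......56..12...2.18.9....9865..1..52.....58.6...6..91.58."
p2 = "974..........2.9.....6.9......56..12...2.18.9....9865..1..52.....58.6..16..91.58."

print(is_subset(p1, p2), clue_count(p1), clue_count(p2))
```

Here the first puzzle is the second one minus a single clue, and it is the smaller puzzle that carries the higher rating.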
This is now a small number of puzzles to rate per start (~5). This is good if we still have high rating hits in the loki family.
And in fact, I got 595 hits >=11.5 with the following distribution (skfr ratings)
11.5 78
11.6 286
11.7 222
11.8 9
Enough to think that, with such a cutoff, we should find similar families if any.
With a tighter cutoff, discarding puzzles with >=30 clues, the total number of hits goes down from 595 to 291, with the distribution
11.5 33
11.6 139
11.7 115
11.8 4
An option to consider although the risk of missing a new family increases.
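To see what the tighter cutoff costs at each rating, the two distributions can be put side by side. A short sketch using only the counts listed above; the dictionary names are mine.

```python
# skfr rating -> number of hits, as listed in the post.
hits_wide = {"11.5": 78, "11.6": 286, "11.7": 222, "11.8": 9}    # >31 clues dropped
hits_tight = {"11.5": 33, "11.6": 139, "11.7": 115, "11.8": 4}   # >=30 clues dropped

total_wide = sum(hits_wide.values())
total_tight = sum(hits_tight.values())

# Share of hits kept at each rating when the cutoff is tightened.
kept = {rating: hits_tight[rating] / hits_wide[rating] for rating in hits_wide}

for rating, share in kept.items():
    print(f"{rating}: {hits_tight[rating]:4d} of {hits_wide[rating]:4d} kept ({share:.0%})")
print(f"total: {total_tight} of {total_wide} kept ({total_tight / total_wide:.0%})")
```

Roughly half of the hits survive at every rating level, so the tighter cutoff does not seem to favour one rating over another.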
By comparison, the test done on solution grids of high-rated puzzles from the potential hardest file did not deliver any rating over 11.2.
The next post will show a change in the finder giving better timing results.