release of 17 clues scan V7B

Programs which generate, solve, and analyze Sudoku puzzles

release of 17 clues scan V7B

Postby champagne » Sat Jan 27, 2024 9:30 pm

Although the job is not over, I think that it’s time to release the revision of the 17 clues search.
The goal is clearly the 18 clues scan of the ED solution grids, but assessing that a code is valid is not an easy step. So, I decided first to produce a revised code doing the 17 clues scan.
The repository for the code is https://github.com/GPenet/Sk_s17VB

The main test here is to check that all known 17 are found by the code, A full search has been done on expected triplets {band1 id; band2 id, band 3 id} in the three passes, giving the slice where the expected known came.
Note: words in bold/underscore are explained in the comments.
The corresponding files can be used to run a regressive test requiring around 10 seconds per known.

The repository contains or will contain in due time:
The files to compile the code, The code has been written using Microsoft visual studio, but it has been also compiled using g++. Currently, both versions are working here, one using a Linux computer.
A copy of the article written by Gary McGuire’s team to explain the proof that no 16 clues exists.
The files for the recursive test k1k.txt k2k.txt k3k.txt to use with the process 79 for each of the three passes.
The comments on this version of the code a word file named p17V7b. (to come)
A ToDoV8 word article explained below

The next step should have been to adjust the code to the 18 search, something easy to implement.
Writing the comments and reading again Mx Gary’ team article, I realized that I missed an important possibility to improve the code in the critical area. I am starting investigations in this direction, so a new version of the 17clues search should come first.The “ToDoV8” document describes shortly hat is missing.

As far as I know, nobody else is working currently on this topic. So this is a code shared just in case.
Having nearly stopped my work on the sudoku, this is the last topic on which I work to keep my brain active.
I know that many things are missing for anybody willing to dig in this code, but the door is open for more information if needed.
Last edited by champagne on Sat Jan 27, 2024 9:32 pm, edited 1 time in total.
champagne
2017 Supporter
 
Posts: 7490
Joined: 02 August 2007
Location: France Brittany

Re: release of 17 clues scan V78

Postby champagne » Sat Jan 27, 2024 9:30 pm

locked for comments on the V8 in due time
champagne
2017 Supporter
 
Posts: 7490
Joined: 02 August 2007
Location: France Brittany

champagne

Postby champagne » Sat Mar 02, 2024 3:17 pm

Writing comments, I flew over Gary McGuire’s article and stopped here.

7.1 Improving backtracking using higher-degree unavoidable sets
Here we will explain the most important of the three improvements we made to our hitting set algorithm
compared to the version of 2006...
We track these higher-degree unavoidable sets during the enumeration of hitting sets just like
the ordinary (degree 1) unavoidable sets, i.e., through the use of state and hitting vectors for each
degree. After twelve clues have been drawn, if there is an unavoidable set of degree 5 in our collection
that is not hit, then we may abandon the search and backtrack immediately. ....


Nothing new for me, in the second part of the process, looking for missing clues in band 3, we have a very efficient use of this concept.
In the 16 clues issue, higher degree unavoidable sets are built at the very beginning with a maximal depth of 5, in the band 3 search, we have a dynamic analysis of higher degree unavoidable sets of a much bigger depth.

In the V7B, expanding the UAs in bands 1+2 to find a valid 2 bands, no forward view is done. We have several steps where the remaining UAs are shrunk and sorted to use the smallest UA with still active cells, but no forward view.

With the experience of the work done in band 3 to optimize the process, we can expect a very good improvement using a dynamic construction of disjoint UAs with a depth of ¾.

This “forward view” allows premature backtracking, the best case, but can also, reducing the “active forward cells” speed up the appearance of unavoidable GUAs for the next branches of the expansion tree.

Example of a forward view

The simplest for me is to use an example taken in my investigations.
I use here a 2 bands partial solution grid having a 7 clues valid solution.
Band 1 bx 282 index 1-416 146.Band 2 bx 386 index 1-416 5.
123456789457189263698732145285691374764523918931847526

The primary harvest of unavoidable sets produced 1345 UAs, only 64 of them have 12 clues or less (to compare with Gary McGuire’s work). The smallest UAs are.
Code: Select all
1..1.....1..1......................................... 0 4
......................11................11............ 1 4
.........1......1.1......1............................ 2 4
.........1.1........................1.1............... 3 4
............................1.1......1......1...1....1 4 6
......................................1.1..1...1.1..1. 5 6
..........1..1...1..1.1...1.11........................ 6 8
..1.1.....1..1......1.1.....11........................ 7 8
..................11................11...1...11...1... 8 8
.............................11....1..11....1...1....1 9 8


Note : an easy way to see what is behind a given UA is to put it just below the solution grid.

Code: Select all
123456789457189263698732145285691374764523918931847526
......................................1.1..1...1.1..1. 5 6


Here for example, the multiple solution is a permutation of the digits 1,2,4 in the last 2 rows. Another way is to use a solver, with as given the solution grid without the cells of the UA
Code: Select all
12345678945718926369873214528569137476.5.39.893.8.75.6 (plus band3)

As the UA is supposed to have no subset, the puzzle should have 2 solutions.

In the search of 17 clues puzzle, the bands 1+2 can have 10 or 11 clues, depending on the pass.
As we will see, finding in dynamic mode 3 or more disjoint UAs is not so unusual. This means that examples of interest should be easy to spot with 7/8 clues.

Another point is that the probability of having a premature backtracking is in branches with dead cells.
To build the example, the start is done after a reasonable set of dead cells.

First three cells

In the expansion process, we have one step for the first three cells, then a reduce set of UAs is done, choosing the smallest residual UAs; Our start will be:
Code: Select all
1..a.....1..1......................................... 0 4
......................11................b1............ 1 4
.........1......1.c......1............................ 2 4


Doing so, we end the first step with 5 cells dead. Here is the top of the new table, used to choose the next 3 clues. The entire table contains 488 UAs.
The three first UAs of the new table are UAs 3;5;4 of the first table.

Cells 4 5 6

In the expansion process, we have one step for the next three cells, then a new reduce set of UAs is done, choosing the smallest residual UAs; Our second set will be:
Code: Select all
...........1........................d.1............... 0 3
......................................1....1...e.1..1. 1 5
............................1.1......1......f...1....1 2 6
..........1..1...1..1.....1.11........................ 3 7
..1.1.....1..1......1.......11........................ 4 7
.............................11....1..11....1...1....1 5 8
............................111....1.111....1......... 6 8
..1.1...1..........1......1..................111...... 7 8
..1..1.11....11..1..1................................. 8 8
....11.11....11..1........1........................... 9 8

...a..............c.................d....b..f..e...... 6 clues set


We now have 6 clues and 11 dead cells.
The next table now has 99 UAs with the following start;

Code: Select all
..........1..1...1..1.....1..1........................ 0 6
..1.1.....1..1......1........1........................ 1 6
..1..1.11....11..1..1................................. 2 8
....11.11....11..1........1........................... 3 8
.1..11.1..1..11....1.................................. 4 8
.11.1.1...1....1.1........1........................... 5 8
.11...1.1.....11.1.1.................................. 6 8
.11.1.1...1..1.1....1................................. 7 8
..1...1..........1...............11...........1.1....1 8 8
..1.1.....1......1........1..1...1............1....1.. 9 9

At this point, we have no evidence of a possible premature back tracking, we are still far from the final number of clues, 10 or 11 depending on the pass.
We will look at 2 examples after 2 more clues:
Code: Select all
...a..............c.................d....b..f..e...... 6 clues set
...1..............1.......11........1....1..1..1......
...1..............1..........1......1....1..11.1......

In this example, likely not the good process, clues 7/8 came out of 2 disjoint UAs of the table:
Code: Select all
..........1..1...1..1.....a..b........................  0 6
.....1..1.....11...........a...1..........1..b......11 19 10

We see in a;a and b;b the 2 cases, with respectively 8 and 12 more dead cells.
At this point, the table of residual UAs is interesting.

Example 1

Code: Select all
...1..............1.......11........1....1..1..1...... 8 clues
.11.1.11....1......1.1..11...1.11111...1..1..11.111111 cells not dead

16 remaining UAs (no subset)
Code: Select all
..1....1.............................................. 0 2
..1.1........................1........................ 1 3
.1..1..1...........1.................................. 2 4
.11...1............1.................................. 3 4
.11.1.1............................................... 4 4
............1......1.1..1....1........................ 5 5
...............................1...1......1......1..11 6 6
.1..1..1....1...........1.......1..................... 7 6
..1...1..........................11...........1.1....1 8 7
.11...1......................1...1............1....1.. 9 7
............1..................11..1............11...1 10 7
..1...1......................1...11....1......1......1 11 8
..1...1......................1...1............1....111 12 8
.1..1.......1...........1.......1.1....1..........1.1. 13 9
.1..1.......1......1............1.1....1..........1.1. 14 9
.1..1.11.........................11...........1.1....1 15 9
..1...1.....1................1..111...........1...11.1 16 11


Here, combining UAs 0;5;6 or 4;5;6 we have 3 disjoint UAs.
The minimum number of clues required here is 3, too many if the pass is limited to 10 clues, a “critical status” if we are looking for 11 clues in bands 1+2.

To reach 11 clues, each of the UAs 0;5;6 and 4;5;6 must be hit.
All valid 11 clues for the bands 1+2 will have the last 3 clues belonging to these UAs

This gives us three key properties:

Property 1: If the list of UAs has a disjoint group exceeding the number of residual clues to set, the branch is dead.
Property 2: If the list of UAs has a disjoint group equal to the number of residual clues to set, then all clues have cells belonging to the disjoint group. (critical group)


And a trivial Property 3: if we have several “critical groups”, then all residual cells must hit each group. In other terms, if each “critical group” is a 81 bits field, a “logical and” of the fields will give the possible residual cells.

If we apply this to our 2 triplets
Code: Select all
..1....1....1......1.1..1....1.1...1......1......1..11 uas 0;5;6
.11.1.1.....1......1.1..1....1.1...1......1......1..11 uas 4;5;6

We see that the cells 2,5,7,8 should be discarded because they are not in both triplets. Having UAs number 5 and 6 in both, we also could say that the common part (if any) of UAs number 0 and 5 must both be hit, what is possible only if they have common cells.

So, in this branch, all valid bands 1+2 with 11 clues will have cell 3 as clue, and the forward set of possible cells is now reduced to 12 cells instead of 16.

Assuming now the clue assigned to cell 3, and applying new dead cells we are left with this
Code: Select all
...................1.................................. 2 4
............1......1.1..1....1........................ 5 5
...............................1...1......1......1..11 6 6
............1...........1.......1..................... 7 6
............1..................11..1............11...1 10 7
............1...........1.......1.1....1..........1.1. 13 9
............1......1............1.1....1..........1.1. 14 9
................................11...........1.1....1 15 9


One more clue can now be assigned, hitting UA number 5. Then, UA number 7 has no clue hitting UA number 6. The branch is dead.
Note : Our 2 triplets share 2 UAs out of 3. If in the process these 2 UAs are seen first, we can find the common clues (if any) for the third UA, as in the next example.

Example 2
Code: Select all
...1..............1..........1......1....1..11.1...... 8 clues
.11.1.11....1......1.1..11......1111...1......1.111111 cells not dead
Again 16 remaining UAs
Code: Select all
..1....1.............................................. 0 2
....1..1.............................................. 1 2
.11.1.1............................................... 2 4
.11...1............1.................................. 3 4
...................1...............................111 4 4
...................................1.............1..11 5 4
............1......1.1..1............................. 6 4
............1...................1..1............11...1 7 6
..1.1.1............................................111 8 6
............1........1..1..........................111 9 6
...................1.....1.......1.1.............1..1. 10 6
..1...1..........................11...........1.1....1 11 7
..1...1............1.....1.........1.............1..1. 12 7
.1..1.......1...........1.......1.1....1..........1.1. 13 9
.1..1.......1......1............1.1....1..........1.1. 14 9
..1.1.1.....1...................1.1...............11.1 15 9
..1.1.1.....1........1..1.........1...............11.1 16 10


In this table, we have no disjoint group of size 4, but more groups of size 3.
{0,5,6} {1,3,5} {1,3,7} {1,3,9} {1,5,6} {1,6,11} {2,5,6} may be more.

Here we have 2 cases where the property 3 can be applied on part of these triplets:

{1,3,5} {1,3,7} {1,3,9} is an example of triplets having the same first 2 UAs.
We have only one common cell in the third one. It must be in any valid 11 clues in this branch.

Code: Select all
...................................1.............1..11 5 4
............1...................1..1............11...1 7 6
............1........1..1..........................111 9 6


{0,5,6} {1,5,6} {2,5,6} have 2 UAs shared. UAs number 0,1,2 have no common clue, this branch is dead.

Code: Select all
..1....1.............................................. 0 2
....1..1.............................................. 1 2
.11.1.1............................................... 2 4



These cases leading to a dead branch can be used within the search of disjoint UAs to speed up the process. The next example will show what can be done in a branch still alive but showing one or more “critical situations.

We are in the same branch for the 6 first clues and we are still looking for valid bands 1+2 with 11 clues.

Example 3

Code: Select all
...1..............1..........1......1....1..1..1....1. 8 first clues
.11.1.11....1......1.1..11......1111...1......1.1111.1 21 cells not dead

Here are 9 remaining UAs with a small number of cells in the first UAs.

With these 2 triplets.
Code: Select all
.11.1.11....1......1............1..1............11...1 UAs 1 3 5
..1.1.11....1......1.1..1........11...........1.1....1 UAs 1 4 6

To have both hit, the cells must be in this reduced field.
Code: Select all
..1.1.11....1......1............................1....1 all triplets

We can apply this limit to the table of UAs and we get:.
Code: Select all
..1....1.............................................. 0 2
....1..1.............................................. 1 2
..1.1.1............................................... 2 4
..1...1............1.................................. 3 4
............1......1.................................. 4 4
............1...................................1....1 5 6
..1...1.........................................1....1 6 7
..1.1.1.....1........................................1 7 9
..1.1.1.....1........................................1 8 10

We have no empty UA (dead branch) here, but this could come if the disjoint UAs search did not see it.

We can easily see that our triplets are still there. They are in the status in line with the “all triplets” field.

UA number 2 ( ..1.1.1 ) is a subset for UAs number 7 and 8. UAs 7 and 8 can be ignored.
This could happen for a subset of one UA belonging to a triplet (with a limit of one clue left in the UA). The “all triplets” field would then be reduced, and we must loop in the shrinking process.


Improving the valid GUAs search.


The UAs with 2/3 cells in band 3 are the most important for the work on valid bands 1+2.
We can have several thousand of them at the start split in sub tables, one per socket band3/bands1+2.
As soon as we have one UA in a socket sub table impossible to hit with the cells not dead, the corresponding socket is unavoidable for the rest of the branch. The situation of the cells not dead has changed from
Code: Select all
.11.1.11....1......1.1..11......1111...1......1.1111.1
21 cells not dead to
..1.1.11....1......1............................1....1
8 cells not dead.


We can have more UAs never hit later, giving more unavoidable sockets at this point.
And if we use vectors to find later UAs not hit, we have less bits to flag in the vectors.

Conclusion

For sure, this is of small interest for the first branches or subbranches of the hitting tree, but, as far as I could see in my preliminary investigation, a size 4 can come after 6 clues with a significant number of dead cells. A depth of 3 is relatively common with 8 clues.
A side effect of this search is the possibility to see if the next subbranches of the tree can produce shortly (1/2 clues) a potential valid “bands 1+2.”

Implementation of this is not a small task. The hitting tree in the V7B has 3 main steps with 3;6;9 clues.
This investigation started with the same 3/6 steps, but then with two 7/8 steps, one with the possibility to have a valid 9 and one without this possibility.
champagne
2017 Supporter
 
Posts: 7490
Joined: 02 August 2007
Location: France Brittany

First draft of the s17V9 available

Postby champagne » Sun Mar 03, 2024 9:00 am

I started the implementation of the dynamic early backtracking of the hitting tree in the 17 clues scan pass1.
It seems that looking for a critical High degree of size three (3 disjoints unavoidable sets) is a reasonable trade off.

As the pass1 is made of <= 10 clues in bands 1+2 and >=7clues in band 3, this leads to a first trial to have an early backtracking made with 7 clues.

The code has now passed the quick regressive test (finds all known 17 expected in pass1) and the tests against the V7B, although not as good as I hoped, give already a significant improvement in the runtime.

I created a new repository to save the corresponding draft.

https://github.com/GPenet/sk_s17v9

Next posts should come when significant results can be shared
champagne
2017 Supporter
 
Posts: 7490
Joined: 02 August 2007
Location: France Brittany

Re: release of 17 clues scan V7B

Postby champagne » Tue Mar 12, 2024 5:06 pm

I made 2 different tests with very different results.

The scan for the band index 32 showed a cut in the run time in the range 20/25%. A very good signal.
A test in the vicinity of known 17 showed a limited improvement (less than 10%).

This is with a dynamic search of disjoint unavoidable sets of size 3.
More tests are needed to optimize the process, but this is a good start.
champagne
2017 Supporter
 
Posts: 7490
Joined: 02 August 2007
Location: France Brittany


Return to Software