The New Sudoku Players' Forum

by **champagne** » Sun Sep 11, 2022 7:03 am

After the scan in V6 looking for 17 clues in solution grids,
after some work on one solution grid to find 18 clues having as reference Blue's DLL

I wanted to try to set up a 18 clues scan similar to the 17 clues scan, using the best recent findings.
I wanted also to make it public with as many comments as possible.

This is a long way and to stay on the safe side, I start with a revision of the 17 scan (V7) easier to test. The switch to 18 clues will come later.
Where I am, my best estimate is to have the V7 (17 clues) by the end of this year (2022) with all comments ready.

the repository for the code is here

I hope to have at the end a 18 scan as fast as the 17 scan used to check the 17 clues sudoku list.

I open the thread with several posst locked to have one post for each topic needing specific comments.

References:

It is difficult to give all sources of this program. Surely, many pieces of information came from the “new sudoku players forum”, but 3 direct or indirect contributors have to be mentioned:

zhouyundong_2012 who delivered a code performing well for a “brute force solver”. Many variants of this code are in use in this program.

Mladen Dobrichev who gave me the right way to produce unavoidable sets, but has long ago helped me in many technical aspects of the C++

And surely Blue who shared with me many things but mainly is the true inventor of the “divide and conquer” process applied in this program.

by **champagne** » Sun Sep 11, 2022 7:04 am

This a brief description of the process applied to search low clues in solution grids (5 472 730 538 ED “NED” solution grids).

The search of ED sudokus with a low count of clues has been considered in different ways:
To prove that 17 clues is the minimum to have a valid grid (unique solution), (published by Gary McGuire in august 2013)
To find all 17 clues puzzles (showing that the 49158 list was exhaustive) in the years 2017-2022
And, with the last version of the process, to explore the 18 clues field.

Starting with the 17 clues search, the process has completely changed. The general strategy has been established by “blue”, with impressive results.
See http://forum.enjoysudoku.com/bands-and-low-clue-puzzles-t30218-120.html

Many variants of the initial strategy have been tested, partly shared of just available through .exe or DLL.

Here is my last version for this process. I’ll add some comments when the strategy differs from blue’s attack.

A 2 bands attack

To do an exhaustive search, the best agreed way is to search each of the 5 472 730 538 ED solution grids. This process uses unavoidable sets to limit the search.
Note: an unavoidable set (UA) is a group of cells for which a clue is needed to avoid a multiple solution.

In the proof that no 16 exists, each solution grid has been considered separately.

Here, in three runs, the solutions grids are “sorted” to have all bands 3 attached to a pair {band1, band2}. The ED solution grids are morphed to attach as many solution grids as possible to “bands 1” having less valid bands.
Using UAs, possible valid {band1, band2) are produced, and then each of the bands 3 attached is checked using this valid {band1, band2}.

In the 17 clues search, as the number of valid {band 1, band2} increases sharply from 11 to 12 clues, the process is cut in 3 runs:
First run with any band/stack of the solution grids being once in band 3 with >= 7 clues in band 3
Then the 665 distribution in bands and stacks split to exclude the case 665 in bands
One with 566,656 distributions
One with the 656 distribution where bands 2 and 3 of the former run are exchanged.

In this process, UAs having 2 or three clues in band 3 play a key role. Other UAs with a small number of clues in band 3 are also used, but not as important as the former ones.
UAs with 2/3 clues in band 3 have all clues in a mini row. 2 different bands 3 can produce similar UAs, using a different mini row of the same stack. Seen from a pair {band1, band2}, such UAs have the same pattern if you just consider the columns and digits involved (here called socket). UAs reduced to the socket and the band 2 pattern have been called “gangster UA”.

If a {gangster UA size 2} is defined, it exists in all bands 3, but the number of clues in band 3 can be 2 (best case) 4 or 6.

There are many ways to implement this strategy, facing various problems. Blue started from a complete list of “valid band 1” and “valid band 2” in the range 3-7 clues. This was also the process in the V6 used for the 17 clues scan. In the V6, each band was expanded in live. Here, we have a direct expansion of the UAs belonging to bands 1 and 2 to produce the possible valid {band1, band2).

In all cases, Additional UAs (both UAs in bands 1+2 and GUAs) appear in the process. They are used downstream.
At the end, contrary to what was done in the 16 clues scan, this process uses a very big number of UAs and GUAs (thousands). Selecting UAs of interest and using them efficiently will be a key point.

The brute force control must be done at the end to see if a grid passing all UAs controls has a unique solution. This remains a costly task, even limited to bands 1+2. The check for bands 1+2 will be differed in many cases.

Running the process with several cores working in parallel, several problems appear lowering the process. Cache effect and memory bus conflicts for example should be considered.

Chapters summary.

The comments on the process are divided as follow:

Chapter B entry builder
Chapter C initial harvest of unavoidable sets
Chapter D Bands “1+2” unavoidable sets expansion
Chapter E Minimal count for clues in bands 3 filter,
Chapter F final process for bands 3.

And crossing topics:

Chapter G Life of bands 1+2 unavoidable sets
Chapter H Life of gangster unavoidable sets.

by **champagne** » Sun Sep 11, 2022 7:04 am

B_ Entry Builder

We must here process the ED solution grids in different ways. In all runs, the target is to get a list of ED solution grids morphed and sorted in the right way to enter the pairs band 1, band 2 and the attached band 3.

In the first run (band 3 with >= 7 clues), as each band/stack must be there at least once as band 3, several morphs of the same solution grids will be needed. We have here a lot of redundancy to clear to avoid NED x 9 bands 3.

In the second run, we need exactly the NED solution grids sorted and morphed to be in the optimized sequence.

In the third run, we have another sequence (and morph) of the same NED solution grids, but the pair band1, band2 can be erased if the minimum of clues in band 2 is 6 (very small number of cases)

In fact, in the second run, a tiny number of pairs will deliver no valid pair <= 11 clues. Several bands require 6 clues to be valid.

A built in ED solution grids generator has been coded to do that.

Bands 1 appears in increasing number of known valid 6 clues internal table. This order is the index 0-415 for the process.

Band 2 must have an index >= “index band 1” to avoid redundancy, consequently, when the index of band 1 grows, the number of bands 2 to consider goes done. This is a classical situation. The bonus here is that more bands 3 will be attached to bands1 having less valid solutions.

The entry builder clears as much as possible the redundancy, detecting various isomorphism
Inside band 1
When band1 and band 2 have the same Id 0-415
When the three bands have the same id
Diagonal lower index sequence...

As this is a very long process, The builder is in the program active for one band only. The minimum for the scan is 416 runs, one for each band 1, but smaller batches are possible and checkpoints are printed to limit the loss in case of failure.

Live builder of the entry file has a marginal cost in the process. Using a processor Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz , the full generation of the entry file for the pass2 was done at a rate of 50K solution grids per second. The run time is shorter for the pass 1, with an output six times bigger.

The summary of the entry builder is the following

Pass number of pairs number of solution grids ratio

Code: Select all: 1 983 959 110 32 835 497 303 33.37 2 610 163 364 5 472 730 538 8.97 3 606 240 269 same

Pass1 will be limited to 10 clues in bands 1+2 (for 17 clues in total). Without redundancy cleaning, we should have 9 x NED, we have a little more than 6 x NED.

Pass 2 and 3 will have 11 clues in bands 1+2, this will be by far the hardest part of the scan.

The ratio 8.97 (number of bands 3 attached to a pair) is somehow misleading going from about 19 in the first index to 1 at the bottom of the table (with several bands 1 having no pair produced). In all cases, the search of valid bands 1+2 is shared, but the process specific to the band 3 remains partly specific to each band. Optimizing the shared part of this process is a critical part of the code.

Pass 2 and pass3 have at the end a very similar number of pairs, Both, ignoring the small number of cases having no valid band 1+2 of size 11, have exactly NED bands 3.

In the coding strategy, the shared part will prepare the “per band 3” process as much as possible.

by **champagne** » Sun Sep 11, 2022 7:05 am

C_ UAs Builder

C1_ General

The UAs strategy is a very important point.

In the proof that we have no valid 16 clues, The UAs used were of size <= 12 and built using blueprints at the very beginning.
In this process, we consider UAs of bigger size, up to 20 cells (more in some conditions). All UAs having a subset are discarded.
As shown blue, the number of UAs increases sharply with the size. We have here to select “UAs of interest”.

Here after 2 examples of “interest”.

In a valid solution, a band is “valid against it’s gangster”. The minimum set of UAs needed to solve the band is well known and is described in a table in the program. For some bands, UAs of size 18 expressing the known constraint “can not have 2 rows empty in a band” are needed.
In any grid, we cannot have 2 digits missing. Again, this is an UA of size 18 if no subset exists.

The initial set of UAs.

In the process, we merge an initial set of UAs and UAs appearing in the brute force checks.
Preparing the initial set of UAs is a shared task for the group of attached bands 3. If a UA has a chance to be used in the process, having it at the start can save time later.
This pushes to have an initial set as big as possible. The 3 main parameters to consider are:

The cost to find such UAs
The probability to use it in the process
The cost in the process of a big number of UAs

Using Brute force to produce UAs

Here UAs are produced using the brute force process. For example, to find UAs for a pair of digits all cells of other digits are given, and the brute force solver gives all possible solutions.
With more digits, we must clear subsets. In fact, the process works with an increasing number of digits and the UAs found with a lower number of digits are used as constraints in the guess step of the brute force.
Several brute force variants study digits in the range 2-6 and sub grids of interest, mainly bands 1+2, but also groups of 4 boxes in bands 1+2

UAs format

The brute force used here is derived from the zhouyundong_2012.process. Each band is in a 32 field and the UA is in a 128 bits field.

Most of the work in the process in done with UAs of the bands 1+2.
A switch 2x32 <-> 54 bits is done later to be mainly in 54 bits mode.

For UAs of the whole grid, the rest is usually a 32 bits field UA for the band 3. As some of them play a specific role, A link to the corresponding band 3 pattern can replace the UA itself. (gangster pattern)

C2_ Gangster UAs and other UAs with clues in band 3

UAs having 2/3 cells in band 3.

From experience, we know that UAs having 2 cells in bands 3 play a key role in a scan where the first step is to find valid bands 1+2. Such UAs are in a mini row, usually disjoints, and often lead to a minimum count of clues in band 3 higher than the searched count.
To a lower extent, UAs having 3 cells in band 3, also in a mini row can increase the minimum count of clues.
For a solution grid, we can have 27 families of UAs with 2 cells and 9 families of UAs with 3 cells.
Here, we have several bands 3 attached to a pair bands 1+2. All UAs 2/ cells can be stored in gangster families depending on the {columns; digits} of the band 3. We have 81 such groups for UA2s (GUA2) and 81 groups for UA3s (GUA3).

Setup of GUAs an follow up

A GUA will be produced using the brute force for bands 1+2, adjusting the initial pm to the digits exchange in band 3. Having from 27 to 81 GUA2 and from 9 to 81 GUA3. Even if the total size of the GUA is limited to 18/20 clues, this gives a very high number of GUAs. The use of such UAs will be a critical code.

If for a valid bands1+2 a GUA is not hit in bands 1+2, one of the 2/3 digits must be there in band 3.
If for a given “band 3” the digits of the GUA are in the same mini row, we have the best case for the “band 3”;

UAs not GUA2 GUA3

If the expected mini row is not there, we still have a residual UA in band 3 to use. This residual UA can have 4/6 cells if it is a GUA2. Other UAs not of the GUA2 GUA3 families can appear in the final check. Such UAs can be shared under some conditions between several bands 3
In fact, each band 3 will have it's own table of such UAs. At the start, a very limited number of them is searched :
UAs with 2 digits not in the GUA2s GUA3s families
UAs of the stacks (out of the internal table of UAs) with more than 3 clues in band 3. Such UAs can have more than 20 clues, but as they are needed to expand the band/stack, they have a good chance to stop the final step of the scan.

The final number of such UAs is usually below 50, but can be over 100.

C3_ Initial Seed UAs pair

The initial step of the UA builder is to collect all uas 2 digits. This is done first for the bands 1+2, then for each band3 and the 3 bands giving the initial set of UAs with cells in band 3.
In many cases, the grid is solved in once if one given is true and we have a UA of size 18 (can not have 2 digits missing in a sudoku).

Here come all UAs of size 4, and many UAs of size 6,8,10,12,14. Each pair end with UAs totaling exactly 18 cells.
UAs are split to get those fully defined in bands 1+2, GUAs GUA3s and other UAs (UAas locked in band 3 are redundant with the internal table of uas)

C4_ Initial setup for bands 1+2

Once the initial step done, the builder works only in bands 1+2. The number of digits grows from 3 to 6, giving UAs in the range 2xN – 9N.

Note1: As the limit is 6 digits, all 12 digits UAs are there.
Note 2: The storage limit (number of clues in a UA) is here <25 clues. This is more a safety limit. In theory, we could reach 5x9=45 cells with 5 digits.

In this initial setup, UAs used in each “4 box” sub grid not yet there are searched.

This brute force uses previous found UAs. In the 4box step, these known UAs are the only filter for multiple solutions. (A direct expansion of these UAs could replace the current process.) We use here the same size limit of 25 clues for fresh UAs.

The initial number of UAs varies in a very large range. In my first sampling, we are in the range 100-1300 with an average of 600 for the “band 1” index 1 and 1000-2300 with an average of 1560 for the “band 1” index 400.

Finding the optimal cut off of the various parameters would require many tests. So far, looking for UAs with more than 6 clues seems too expensive (part of them come in the “4 box” step).

At the end, UAs of the internal table no yet seen are added (one band UAs with more than clues). Here, the cutoff is ignored.

C5_ GUAs initial setup

Each of the 81 possible GUA2 sockets could more or less produce as many GUAS as the initial set for bands 1+2. We can not get a code performing well with so many GUAs.

For a given valid band 1+2, a socket will force a clue in band 3 for any GUA2 not hit in the family. The strategy will be to pick up the smallest GUA2s accepting the risk to miss some sockets. (We have the same risk limiting the UA size for bands 1+2)

GUA3 sockets are of smaller interest; in most case, if a GUA3 socket is forced valid, a GUA2 socket subset of it will be valid as well. However, avoiding the last step of the process catching GUA3 socket with no subset is of interest.

In the initial setup, GUA3s of small size will be searched as well.

The other point is that we are only interested in GUAs that we can find for a valid band 1+2. Known UAs in bands 1+2 can be used as constraints in the brute force as in the UAs bands 1+2 generator.
The search is done step by step in creasing the number of digits. For a given socket, previous GUAs of the socket and UAs for the bands 1+2 are used as constraints.
Each of the 81+81 sockets is first filled with GUAs in “2x” mode (one band in a 32 bits field) to work easily with the brute force design.
Each table can contain 128 GUAs, but the GUAS with the biggest size are cleared when a “limit count” is reached. At the end, all GUAs exceeding a maximum count per socket are cleared.

At the end of the initial harvest, all is sent in ”54 bits mode” in another set of 81+81 tables. The sub tables are dynamically allocated to the initial size plus room for later addition of fresh GUAs. Cells killing all the GUAs of the table are identified in a “54 bits field killer”. This killer will speed up later the set up of the GUAs filters

A reasonable cutoff seems to be 30 gUA2s per socket and 20 GUA3s per socket.
Even with such limitations, we can have close to 2000 GUA2s and more than 1000 GUA3s.

by **champagne** » Sun Sep 11, 2022 7:06 am

D UAs expansion to produce valid bands 1+2

Finding first valid bands 1+2 for a band 3 assumed fully solved is the key concept of blue’s approach.

In the V6 process, valid bands 1+2 were searched using a matrix
Valid bands1 x valid bands 2, checking for each matrix position if all UAs were hit.

To do that, valid bands included redundant clues to have all possible valid solutions of “n” clues with n in the range 3-7.

Here the process has changed. Valid bands 1+2 are produced directly out of the UAs list, in several steps. Doing so, we can get solutions with less clues than the target, a special process will manage this case.

As in any such process, it’s important to use first the shortest UAs;

In the process, we can look for valid bands having <= 10 clues, 11 clues or 12 clues.

In the 17 clues search,

pass 1 has 7 or more clues in band 3 <= 10 clues in bands 1+2
pass2 has 6 clues in band 3 and a maximum of 6 clues in bands and stacks.

In the 18 clues search,

pass 1 has 7 or more clues in band 3 <= 11 clues in bands 1+2
pass2 has 6 clues in each band and each stack.

D.1 Minimum number of clues for 2 bands

We know that the minimum number of clues for 2 bands is 7 with only 4 ED such sub puzzles
This has been discussed long ago (2014) in this thread.

http://forum.enjoysudoku.com/minimum-number-of-clues-per-band-stack-t32120-105.html

We have no 17 with 7 clues in 2 bands, but some examples of 18 clues.
The number of valid bands 1+2 with no extra clue grows sharply in the area 7_12 clues.

A specific function is written for each family of such valid bands 1+2, adding extra clues to a valid band 1+2 to scan the area from 0 added clue to the maximum searched.
Although the process is very similar for the same number of free clues, each case has it’s own function

8to 10 9 to 10;
7to 11; 8 to 11;9 to 11; 10 to 11;
7 to 12; 8 to 12;9 to 12;10 to 12;11 to 12

The first reason to have separate functions is the different constraint to apply in bands and stack number of clues .
The second one is a safety measure. Mixing all cases to save coding add risks to have bugs.
The third one is that likely, at the end, the code will be split in 2 programs, one for the 17 search, one for the 18 search. The same remark will be valid for the main expansion functions.

Below 7 clues, if the list of still available UAs is empty, it can not be a valid pattern. Although this should not come with a big initial list of UAs, one missing UA is searched and added to the list

D.2 Dead cells appearing during expansion

In the article describing the proof that no 16 clue exists, this was well described.

Expanding UAs , we end with a set of cells hitting all UAs. Usually, several UAs contain more than one cell of the final set. If we have no filter, the expansion will produce many redundant sets.

Let’s take the first UA expanded. Any valid containing the first cell assigned will appear in the expansion of the first cell.

The same is true for the followings cells withing the same branch of expansion.
Such cells can be flagged as “dead”. The number of “dead cells” varies in a wide range. Let’s assume that the first 4 UAs are disjoints and of size 4. After 4 clues, the number of dead cells will be in the range 0-12 (assuming that the 4 cells of the set are non counted.

D.3 Remaining UAs status after n cells assigned

Let’s assume that 6 UAs have been assigned.
All UAs hit by these 6 clues are discarded
All dead cells in the remaining UAs can be ignored.

Erasing dead cells changes the residual count of cells in each UA. An empty UA can show up; This means that we have a dead branch with no possible new valid.
More often, this changes completely the order of the “smallest UAs” and new subsets can be seen,
Killing hit UAs, reordering UAs by residual size, applying new subset, we can build a reduced table of UAs lowering the work for the following steps.

Doing such reduction for each cell would not be of high interest;
In the program, this is done after a step of 3/4 cells where the smallest 128 UAs play a key role.

D.4 n cells expansion with a given set of UAs

The basic function in the expansion process will be to select n cells from a UA table.
N is 3 or 4 in the process, going from 0 to 6/7 clues, 6/9 if the final count is 12 (to be confirmed later).

This is done in similar functions, using a common table to store cell by cell data.

We can expect in most cases that the first (smallest) 128 UAs will be sufficient to expand 3/4 cells. These 28 UAs are followed through a vector updated in the process for each new cell.

Remaining UAs are checked later if the first table is fully hit applying the set of cells assigned.
Fresh UAs appearing in the process can be added to the table, but must not be in the first 128. If the total number of UAs is <128, the rest of the table for the first 128 is lost.

D.5 expansion of the last n cells

If we are considering the 17 clues search pass 1, we have 2x3 cells assigned as described above. The last expansion step is for cells 7 to 10.

This will produce
Potential valid bands 1+2 of size 10, the final count,
Potential valid of smaller size (size seven can be ignored)

If the first 6 clues are made of cell hitting a small number of UAs, this can produce no valid. At the opposite, this can produce hundreds of valid.

The expansion is done in once. The potential valid are stored in tables, one per size below the final count and one for the final count.

Each of these tables has a common number of clues >= 6 (first steps here). If a table has one valid, the common number is equal to the size of the valid. Often, we will be somewhere in between;

A good depth (number of residual clues to assign) seems to be in the range 3/4. At the moment, the draft for the 12 clues expansion is not yet done, but as we expect to have many 12 despite the band/stack constraint 666/666, the planned expansion is to have 3x3 steps before the last on of depth 3 also.

Expansion of the last clue

In all UAs expansion functions, the last clue must hit all remaining UAS;
This is very easy to implement, so, in any place where it is possible, the last step will combine first all still valid UAs to find common cells. In most cases, we have no common cell.

With more than one clue to assign, we have no easy way to simplify the process.

Fresh UAs coming later

Here, the UAs set is locked. If fresh UAs for bands 1+2 are seen later, they must be stored n a special table to be reused for this chunk of valid.

D.6 Summary of the expansion phases

We have the following structure:

Code: Select all: Search of 17 clues Pass1 (10) 3+3+4 Pass2 (11) 3+4+4 Search of 18 clues Pass1 (11) 3+4+4 Pass2(12) 3+3+3+3 (project to implement)

D.7 Sorting and shrinking a UA table by size

For UAs and later GUAs, we have similar tasks to cover

Deleting UAs hit by the known cells
Erasing dead cells from still valid UAs
Sorting such UAs by size
Erasing supersets.

As the sort operation is quite simple, the process uses vectors bit fields (one vector per size).
Bits are set to 1 in the relevant vector depending on the size of the UA, then UAs are taken by increasing size and sent to the final location.

In the final location, UAs are stored in blocs of 128 UAs, with bit field vector giving for each cell the UAs non hit.
As the first bloc has a specific treatment in the expansion process, the total number of UAs is forced to a minimum of 128 to push fresh UAs appearing later in another bloc.

D.8 After the last expansion step

If we have reached the final count, we enter the phase 2 of the process, the GUAs filter. At this point, the final check of the validity is not done.
The only thing to do is to control that no fresh UA in bands 1+2 from a previous valid of the same chunk kills the potential validity.

For the “valid below the final count” the final check is done in the last expansion step. We must here try as band 1+2 the given , but also any band 1+2 pattern obtained adding clues “up to the final count” with clues taken in the free cells (no dead cell) and in line with the constraints of the pass (eg: no band over 6 clues in a 12 clues pass).

Again, to stay on the safe side, we have several functions very similar processing each case with the relevant constraints applied.

by **champagne** » Sun Sep 11, 2022 7:07 am

E Minimal count for clues in band 3

This chapter could also has been called “Use of GUa2s GUA3s as main filter for a given potential valid”.

As written earlier, UAs having cells in band 3 located in a mini row are very important.
For a given set of clues in bands 1+2, We get a list of such UAs patterns in band 3 not hit (fully hit) in bands 1+2.

This gives a minimum count of cells needed in band 3 (out of other UAS).

One gua2 in a mini row => one clue
2 guas2 in a mini row => minimum one clue , possibly 2
3 guas6 in a mini row minimum 2 clues
A gua3 without gua2 minimum one clue.

By experience, we can say that in most cases, this is more than the number of clues to assign in band3.
In other terms, for a given solution grid, most of the valid bands 1+2 will not pass a filter on the minimal count of clues in band 3.

Having several bands 3 attached to a pair band1;band2, we consider first the 81 sockets of the gangsters UAs with 2 cells and the 81 sockets for the gangster UAs of size 3.
In the opening tasks, a maximum is done to switch easily from socket to pattern and reverse, from a socket to a band GUA2 GUA3...

For each socket, we can have many GUAs. Once the clues in bands 1+2 locked, one GUA not hit is enough to make the band3 socket unavoidable. To have a valid global solution, all UAs must be hit.

And we have to do the best use of fresh GUAs produced during the process in the final check of a potential valid 17 or 18.

Last but not least, we can have thousands of such UAs if the cutoff of the UAs size is big. Even with a small cutoff, when many bands are attached, we have 162 sockets x average number of GUAs per socket to use, so several thousands of such UAs will be common.

We have here the main parameters to explain how these GUAs are organized in the process.

E1 From the start to the last expansion in bands 1+2

We have no use of the GUAs as long as we don’t have the valid bands 1+2 (potential valid). What is done upstream is to prepare as much as possible the “per valid band1+2” task.

Each assignment in bands 1+2 can reduce the number of still not hit UAs. Usually, some sockets are fully hit by the cells of the digits in bands 1+2 corresponding to the socket.

GUAs are stored with one table per socket with room for fresh GUAs. For each socket, a “killer” is built having all clues common to the GUAs of the family.

An empty socket (no GUA) will have a killer set to all ‘1’ (always hi)t.
A socket having a GUA with only dead cells in bands 1+2 will have an empty killer and no Gua (never hit, always forced in band 3)

A reduced table is built after the final expansion if some valid have been seen.
This table structure is identical to the start structure. Dead cells and fresh subsets are applied to reduce the size of the table.

This is of small interest if we have only a small number of valid bands 1+2 with the expected number of cells, but it will be a shared task for the average number of valid of the last expansion step and a shared task if any valid below the expected number of cells shows up.

Note : in the 18 search pass2 (12 clues in bands 1+2) the probability to have no valid after the first 6 clues is very low. A first reduction of the GUA2/3s tables will be done after 6 clues. A second on will be done after 9 clues, if the last expansion (10-12 clues) is not empty.

E2 after the last expansion in bands 1+2

If the final count is reached, having checked that no fresh UA kills the validity, we can extract all valid sockets 2/3.
Then, we have to call the final loop on attached bands 3l, with this list of valid sockets.

Here takes place the reduction of the GUAs tables, using the common cells of the set of potential valid having the final count.

For each of the sets of the potential valid below the final count, we have the same process.

The reduction of the GUAs table is done at the start (generally with more than 6 common cells) of the process of each group of a given size.
All “final status for bands1+2” are produced
And for each of them the extraction of sockets and the loop on bands 2 takes place as in the first case.

Note : we could here think of other reductions of the sockets table when a clue is added (if several clues can be added);

E3 minimal count in a band 3

We start the process for a band 3 with the list of valid sockets (2/3). These valid are given in 2 x “81 bits fields” to combine to the list of possible sockets of the band 3 built in the same way in the opening tasks.

The status of the 9 mini rows is built out of this sockets status;
Four 8 bit fields tell us which mini row contains 1,2,3 sockets 2 or a socket3

Sockets 3 are not considered if a socket2 (subset) exists in the same mini row.

This gives us the minimum count of clues in band 3; If this count is over the expected number of clues, the job is over.

Minimal count field : combining all pairs/triplets of the minimal count, we get the minimal count field. Any band 3 UA having common clues with the minimal count field will be said “in field”; others will be said “outfield".

If we have band3 UAs outfield, the minimal count can be increased: by 1 if one cell (or more) hits all UAs outfield, by 2 if no cell hits all the UAs out field.

This property is used in due time in the process to stop as soon as possible.

E3 minimal count per stack and critical stack

We can define a minimal count per stack as the count of known assigned plus the pending minimal count for the stack. No clue of this count can be added out of the cells contributing to the minimal count.
If this count is over the stack constraint, again, the band 3 is over
If this count is equal to the stack constraint, we have a critical stack.

Note : if the minimal count is equal to the number of clues to add in band 3, all stacks are critical.

The critical stacks play a key role in the second phase of the process.

In a critical stack, all cells not assigned and not in the minimal count field are dead.
In any mini row of a critical stack containing 2 sockets, the common cell can be assigned;

Both properties are very important.
Any assigned cell makes the final process much simpler. Some “other UAs” are hit, the minimal count field is reduced ;

Applying dead cells to “other UAs” , we can meet one of the 2 classical situations :
An empty UA then the process can be cancelled for this band 3,
An UA reduced to one cell to assign as well. Such assign can be “out field”, increasing the number of clues in the corresponding stack. If several assign have to be done, we can have a conflict with too many clues to assign in a stack.

E4 process after effect of sockets minimal count

If the filter on the minimal count is not active(not too many clues forced by GUA2s GUA3s), the process tries first to take benefit of all above remarks.
Critical assignments are done if possible.
UAs not “sockets UAs “ are extracted and reduced (socket UAs are also entered in the table)
We have here UAs of the band 3 and the band 3 pattern of all UAs not GUAs and not hit. (see below for the life of UAs not sockets UAs )

This is explained in the next chapter

E5 life of UAs not sockets

UAs not sockets are UAs with clues in at least 2 bands part of them in band 3.
Such UAs are found here in three ways :

UAs with 2 digits produced at the very beginning,
Stack UAs produced at the end of the UAs harvest,
Fresh UAs produced in the final check.

Although it would be possible to share such UAs between bands 3 having the appropriate pattern, They are just stored in each relevant band 3 In the limit of 300 UAs for a band 3.
As they have no use before the filter on the minimal count, we could use directly the base table to extract the band 3 part of UAs not yet hit.
To try to share part of the extraction, at the first need of these UAs for a given band, a preliminary reduction is done using the common clues of the chunk.

The use of these UAs takes place after the check of the minimal count and after the first assignment of cells in critical stacks.

by **champagne** » Sun Sep 11, 2022 7:07 am

reserved

by **champagne** » Sun Sep 11, 2022 7:08 am

reserved

by **champagne** » Fri Oct 14, 2022 1:51 pm

slow progress, but things are going in the right direction.

I continued to draft the V7 and to write comments. the source for the comments can be viewed here
https://gpenet.pagesperso-orange.fr/P17/p17_18.htm
this will be the first place for updates of the comments, updating a long post is not so easy.

Tests on the new "17 search pass 1" have reached a "beta test" level.
Tests on the" 17 search pass2a 2b" have been started. Most of the code is common to all cases.

Next step with 90% of the code common to the 17 search will start just after.
Is still missing the 18 clue search pass 2 (again 90% of the code is there).

The repository contains plenty of testing and debugging code. The code will be cleaned later. Testing code is needed to-day

Any remark on the comments or on the code is welcome.

Benchmarks against the V6 used for the proof that all 17 are known should come in one or 2 weeks.

by **champagne** » Sat May 06, 2023 2:58 pm

Some months later...

After work with mathimagics leading to significant changes in the process, we have now started the scan 18 in a version containing filters to scan only solution grids of interest for the mathimagics's project, we can say that we have the beta test version to redo the 17 scan partially as complementary test and for a full 18 scan if necessary.

The current code contains plenty of debugging sequences. I am late in several tasks

cleaning the current code to have a code slightly faster and easier to read
updating the comments
reducing the program to a "one solution grid" analysis.

in the tests, we have intensively used blue's DLL analyzing a solution grid. We did not find any wrong answer to the existence of a 18 in a given solution grid, but we have seen some redundancy in the solutions (and the same in the scan), so the count of solution can be too high, as far as we can see, this happens when several bands have the same canonical form.

The process remains globally the process applied by blue. The two main changes are

A direct expansion of the unavoidable sets for the bands 1+3,
No use of the stacks in the process. (in blue's process, stacks are scanned for puzzles with >=7 clues in stacks)

I'll do my best to implement the missing tasks

The New Sudoku Players' Forum

17clues V7 scan 18 clues scan

17clues V7 scan 18 clues scan

Abstract

Entry builder

Harvest of unavoidable sets

Bands 1+2 UAs expansion

minimal count band 3 filter

final process band 3

crossing topics

Re: 17clues V7 scan 18 clues scan

Re: 17clues V7 scan 18 clues scan