The New Sudoku Players' Forum

by **champagne** » Sun Nov 06, 2016 11:45 am

dobrichev wrote:Thinking about earlier dead-end branch identification, the UA of size 4 and 6 are hit with the first few clues and are out of interest.(Is this true for an average grid or the number of exceptions is ~100% ?)

Hi mladen,

For sure, early detection of a dead branch is a key point (the key point ??)
Mc Garry made some negative comments on another process using late building of higher degree UAs. I am not quite sure that this is definitely true.

Regarding your point, let me answer in the following way.

The table of blue shows an average 27 UAs of size <= 6.

Even if the maximum number of disjoint uas in that lot is weak (says 4 to 5 for example) each branch hit only a small number of them at the start.

Usually after ten clues taken in the smallest uas, you still have available 4_6

here is an example out of the "test solution grid above".

We are on the path leading to the 17 seen, so this is a king of worst case for your "conjecture".
With 10 clues, your conjecture is nearly verified, so I take the situation with 8 clues of the expected 17.
the last item shows the next clues in the path leading to the 17

we start in bands 1+2 with 36 uas of size <=6.
Here is the status

Code: Select all: ......7.........2....1.....2..........79......3......4 ..1.....1.....1..1..1..1.............................. i=11 ....1..1..1..1.....1.....1............................ i=13 .....1..1..1..1.....1.....1........................... i=14 .1..1..1...........1..1..1............................ i=23 .1..1.....1..1.....1..1............................... i=26 ...........1..1..1..1..1..1........................... i=30 .....1..1.....1..1.....1..1........................... i=31 ..1..1.....1..1.....1..1.............................. i=32 ..1..1..1...........1..1..1........................... i=33 ..1..1..1..1..1..1.................................... i=34 ..1.....1..1.....1..1.....1........................... i=35 ......7.........23.8.1.....2..........79......3......4 next clues

by **dobrichev** » Sun Nov 06, 2016 1:38 pm

hi Champagne,

You are right. Actually the number of UA of different size per grid doesn't fit well in the subject of this topic. I meant the short UA hit in the "usual" process which is not related to band distinction but only to preffering (the effectivelly) short UA hitting.
Statistics for which UA are actually used could be collected during a run over the several hundreds of grids. Partial run only to the point where UA of size 4 and 6 are exhausted would work for that purpose too. Of course full run over thousands of grids would give more valuable information for n-5th to n-1st placing. Unfortunately I haven't working code where I could add counters.
p.s. Even more, from 2 months I have no working PC at home

by **champagne** » Fri Nov 11, 2016 7:35 am

Hi mladen,

The search for a kind of "best process" is very slow, but the current status of my draft open a good window.
I am now working on a solution grid morphed to have the highest "minimum number of clues in band" (M3 later) as of your table in the first post of that thread as band 3.
I then try to find as solution in two steps

Find a set of cells hitting all uas of the bands 1+2 leaving enough free clues to reach 17 with the M3
Try then to solve the entire grid using UAs hitting band 3.

This is simpler that what I intended to do earlier. The major reason, out of the test data, is that any intermediate call to the" brute force" to check a partial solution is very costly.
One side constraint is that the process in Bands 1+2 should not fall short in UAs (unless in exceptional conditions) to avoid a call to brute force looking for more UAas.

Regarding your comments on UAs of higher degree and early detection of dead branches (having no chance to produce a 17), here the eliminations can start with M3+X12 free clues where X12 is the degree of high degree UAs on bands 12.

So far, I am not using as Mc Garry a table of UAs of high degree at start, instead, I just use "on the spot" a quick and dirty check of a very simple high degree UA, but, this works well for UAs degree 2 and partially for UAs degree 3. I have a small number of dead branches seen with UAs degree 4.

Some data from the test grid I am studying.
The original grid (as shown earlier) was

Code: Select all: 123456789456789123789123456214368597867945231935217864348672915591834672672591348 ......7.........23.891.....2..........79......3.....64..8...91.....34............ target

In that grid band 3 has a M3=2;
Band1 requires a minimum of 6 clues to be solved. The solution grid is morphed to hae it as band 3, so the puzzle studied is

Code: Select all: 348672915591834672672591348214368597867945231935217864123456789456789123789123456 morph of entry ..8...91.....34............2..........79......3.....64......7.........23.891..... morph of target

In that test, I used

All clues size <= 12 as of Mc Garry tables

plus
All clues of 2-4 digits of size <=18
and limited to bands 1+2
some (not all) clues size<=18 with 5 and 6 digits.

The search starts with
723 UAs in bands 1+2
202 UAs including cells outside bands 1+2

With M3=6, the process for bands 1+2 stops with 11 clues. This limits drastically the number of valid solutions for bands 1+2. My final expectation is to get less than 100 000 such solutions at the end.

After several fruitless attempt to apply another strategy, I am selecting here dynamically the UA with the smallest number of active cells.
The program run still short in uas 474 times with 9 clues and 125892 times with 10 clues. My next priority will be to test other limits in the uaS collection to lower that lot.

Another interesting point is to see what was the status of 202 UAs when the task was covered in bands 1+2 for the expected 17.

Code: Select all: ..8...91.....34............2..........79......3.....64......7.........23.891..... morph of target ................................................1..........................11.... i=0 ..................11................1...................................11....... i=29 ..........................................1..........................111......... i=10 ......................................................1..1.....1.....1.....1..1.. i=1 ......................................................1.....1.....1..1..1..1..... i=2 .......................................................1..1.....1.....1.....1..1. i=3 .......................................................1.....1.....1..1..1..1.... i=4 ........................................................1..1.....1.....1.....1..1 i=5 ........................................................1.....1.....1..1..1..1... i=6 .........................................................1..1..1..1.....1.....1.. i=7 ..........................................................1..1..1..1.....1.....1. i=8 ...........................................................1..1..1..1.....1.....1 i=9 ......................................................1..1..1...........1..1..1.. i=11 ......................................................1.....1..1.....1..1.....1.. i=12 .........................................................1..1.....1..1.....1..1.. i=13 ......................................................1..1..1..1..1..1........... i=14 ...............................................................1..1..1..1..1..1.. i=15 ......................................................1..1.....1..1.....1..1..... i=16 .......................................................1.....1..1.....1..1.....1. i=17 .......................................................1..1..1...........1..1..1. i=18 .......................................................1..1..1..1..1..1.......... i=19 ..........................................................1..1.....1..1.....1..1. i=20 .......................................................1..1.....1..1.....1..1.... i=21 ................................................................1..1..1..1..1..1. i=22 .................................................................1..1..1..1..1..1 i=23 ...........................................................1..1.....1..1.....1..1 i=24 ........................................................1..1.....1..1.....1..1... i=25 ........................................................1..1..1...........1..1..1 i=26 ........................................................1..1..1..1..1..1......... i=27 ........................................................1.....1..1.....1..1.....1 i=28

In that table, 3 UAs have clues in bands 1+2 that we can forget, others are original UAs located in band 3, never hit during the search in bands 1+2.
For the first 3 UAs, the size in band 3 is limited to 2 or 3 cells. They will have the priority in the band 3 expansion and can produce more high degree UAs.

Still a lot of work, but the achieved run time of the test let me think that this is a good process.

by **dobrichev** » Fri Nov 11, 2016 7:27 pm

Hi Champagne,

Several questions to you with hope that even left unanswered might provoke new ideas or clarify your logic in this approach.

Your 6-clue band 3 is the best case and far not average. From other side you worked on a grid having a valid 17-clue puzzle which is pushing it to the harder than average end. If your code is sufficiently stable, could you run it on a more random grid and compare the counters and timings?

What is the order of magnitude of the processing time? Minutes or hours?

Are you collected all the 11-clues band 1+2 candidates as a first step and then expand each of them? Do you have a count of these candidates? (you wrote "My final expectation is to get less than 100 000 such solutions at the end.").

What is the portion of the processing time used for generation of the 11-clue candidates? Is it worth these results to be reused in the context of the same bands 1+2 but other 6-clue band 3 completions? (For reusing, you need also to abandon in the first stage all UA related to band 3.)

The 6-clue bands are not that rare compared to 3-templates (see this topic). This gave me the idea to do a pass over all ED solution grids and count how many of them can be discarded due to existence of 3 bands (in band or stack direction), each of them internally requiring 6 clues. Execution should take less than a day and directly give some gigabytes of black listed grids (compared to the other not easy to use black list with 17-3=14 clue subgrids that couldn't be expanded to new 17s).

by **champagne** » Sat Nov 12, 2016 9:01 am

Hi mladen,

some answers where I can do it

dobrichev wrote:.
What is the order of magnitude of the processing time? Minutes or hours?

Let's start from existing process (Mc Garry proof that no 16 exists).

Mc Gary claims an average 3-4 seconds per solution grid.

Blue has an executive of the Mc Garry code applied to the search of 17s and announce an average 20 seconds per solution grid (this seems on the low side, but blue is highly reliable)

On a solution grid he tested for me, blue came back with about 120 seconds for that solution grid, giving an idea of a possible distribution.

Unhappily, I don't have an exec out of the Mc Garry code, so I can't benchmark my own draft against it.

But anyway, any alternative process must be run in some seconds, not more.

So far, my code stops at the end of the bands 1+2 expansion. It is purely C++ against a mix of C++ and assembly code for Mc Garry

I summarize the results for the 2 solution grids I have in the preliminary test, done with a cut of at the size 20 for the UAs

Code: Select all: UAs bands 1+2 1169 1075 UAs band 3 27 27 (same band) Other UAs 768 768 (forced limit) last step lack of UA 92251 36887 exec solving lack of uas 4.1 s 1.7 s (using brute force) exec without solving 1.7 s 1.2 s

That table gives the main parameter influencing the run time.
One can see that the brute force is very costly, so, if the process does not fall short in UAs for bands 1+2, the uniqueness of the solution is not checked. we have many more solutions where all UAs hit exactly with the maximum number of clues to consider (here 17-6=11)

dobrichev wrote:.
Your 6-clue band 3 is the best case and far not average. From other side you worked on a grid having a valid 17-clue puzzle which is pushing it to the harder than average end.

Right. As far as I remember, the average for the minimum distribution of the band having the highest minimum of clues is around five. I have no idea for the average effect on the key parameters.

dobrichev wrote:.
If your code is sufficiently stable, could you run it on a more random grid and compare the counters and timings?

I am not organized to pick up random grids in the catalogue, instead, I extracted a sample in the mode "one every ..."

But the current code is so far limited to the solutions for bands 1+2. I should have drafted next week the last steps.

What I intend to do is

a) check on all solutions grids having 17s that you can catch them
b) apply the process to the sample file "one every .."

In each case, I'll track some statistical data

dobrichev wrote:.
Are you collected all the 11-clues band 1+2 candidates as a first step and then expand each of them? Do you have a count of these candidates? (you wrote "My final expectation is to get less than 100 000 such solutions at the end.").

I implicitly answered above, but yes the bands 1+2 are first expanded; Regarding the count, I confess that my wording was misleading. I was focusing on the shortage in the UAs table that I can only solve using the brute force (any other idea??)

I don't have yet the final count for partial solutions (unchecked for uniqueness) of bands 1+2, but we are talking of "millions".

dobrichev wrote:.
What is the portion of the processing time used for generation of the 11-clue candidates? Is it worth these results to be reused in the context of the same bands 1+2 but other 6-clue band 3 completions? (For reusing, you need also to abandon in the first stage all UA related to band 3.)

You have the answer for timings in the above table, but I have doubts that you can find many solution grids having the same bands 1+2 and the same count in band 3;

Another point is that during the expansion of bands 1+2, the table of "other" UAs is shrunk preparing an efficient last step.

dobrichev wrote:.
The 6-clue bands are not that rare compared to 3-templates ....

This is not clear to me, I have to read first your thread

by **dobrichev** » Sat Nov 12, 2016 6:52 pm

Thank you for the answers.

Gary McGure's code takes over 8 minutes for one of the top 3 hardest grids - these that are short in small UA sets. And this is in 16-clue search. For the grids with 17s, from memory, 2/3 take below one minute.

I found two weak points in your approach.

1. You wouldn't find a puzzle having 7 or more clues in band 3.
2. Moreover, existence of such a puzzle will put you in "lack of UA" - instead of passing the 10-clue subgrid to the second stage, you will examine and pass 54-10=44 dead-end 11-clue subgrids.

by **champagne** » Sun Nov 13, 2016 7:12 am

dobrichev wrote:I found two weak points in your approach.

1. You wouldn't find a puzzle having 7 or more clues in band 3.
2. Moreover, existence of such a puzzle will put you in "lack of UA" - instead of passing the 10-clue subgrid to the second stage, you will examine and pass 54-10=44 dead-end 11-clue subgrids.

Hi mladen,
you are right, but I would use "tougher cases" instead of "you would not find a puzzle".

In fact, if the code is correct, the expansion fell short in UAs 302 times below 11 clues in our example.
Again, assuming that the code is ok, in 8 cases, the bands 1+2 was solved, and this is the basic situation that you describe.

The first case was the following

solution for bands 1+2 (I'll check to-day that the solution is unique for bands 1+2, but we can consider it as ok for that post)

Code: Select all: .....29....1.3......................8...4..3......7.6

At that point, the remaining UAs in the table of "others is the following (2 blanks to show band3 apart)

Code: Select all: ............................1......................... 11......................... i=0 ................................................11.... .....................11.... i=1 ..................1.1........1........................ .........1.1............... i=2 .........11..................................11....... .........11................ i=3 ...................11.........................1....... .11.......11............... i=4 ........1...1..1.....1....1........................1.1 ........................1.1 i=5 .........1..1..1..1..1....1........................1.1 ........................1.1 i=6 ....1.......1.1................1...................... ...1.1......11............. i=7 .....................11........1.......1.............. ....11.......11............ i=8

At that point, you can add one clue out of the left part of the table, and then you would miss the 17 with 7 clues in band 3, but you can also go directly in band 3 and then, you would find your 17.

In fact, my concern would be the opposite (somehow your point 2): the process must cover any case where a valid clue of interest can be added in bands 1+2. I am not 100% sure that the above list is the answer. The worst case is not 54-10, but "non dead cells in bands 1+2" at the time of failure, which will usually be lower.

But in our example this is only 8 sub grids. The count would likely be much higher with a minimum of 4 clues in band 3.

EDIT : more about the table of "still valid others"

Assuming that the next clue in band 1+2 is taken out of the left part of the table, the right part becomes UAs for band3. With many 2 cells uas, you can in some cases pass the degree to have a chance to find a 17. And we can reverse the process trying first to solve band 3 with the right part of the table and band 3 uas

If the cell is taken in the left part, we still have as UAs for the last band most of the right part of the table,

so this should not be to costly to process.

by **champagne** » Sun Nov 13, 2016 10:01 am

dobrichev wrote:Gary McGure's code takes over 8 minutes for one of the top 3 hardest grids - these that are short in small UA sets. And this is in 16-clue search. For the grids with 17s, from memory, 2/3 take below one minute.

Hi again mladen,

I added an edit to the previous post, and I have a side question.

Can you produce the reference time for the solution grid I am using as first validation of my code applying the Mc Garry process (for a 17)

Code: Select all: 348672915591834672672591348214368597867945231935217864123456789456789123789123456 morph of entry

And why not, for the second in test, not too far from the first, but with significantly different timings in my code (shorter)

Code: Select all: 214378695867295341935641872348962517591837264672514938123456789456789123789123456 morph of entry

by **champagne** » Wed Nov 16, 2016 10:30 am

Just some news about the 2 examples above.

After the first test covering most of the cases (excepting the fall short in uas), we can expect a 17 search around 30 seconds for the first test and around 15 seconds for the second one
( current test made on a 3.4 GHz overcrowded i7 with 6 batches run in parallel).

I changed slightly the lot of UAs used at start with now

Code: Select all: - all <=12 cells UAs as of Mc Garry tables (unchanged) - all 2 to 4 digits uas (unchanged) - some 5-6 digits uas of size <=20 (limit pushed from 18 to 20) - all bands uas (see first post in that thread)

I also extended the limit for the table of UAs not in bands 1+2 at the start.

Unhappily, it seems that mladen can not produce the reference time with the Mc Garry code.

Adding all bands uas did not deteriorate the run time in these 2 cases, but did not improve significantly the process. Nevertheless, I think that generally speaking, it should be a good choice.

by **dobrichev** » Wed Nov 16, 2016 5:42 pm

Champagne, your results are very good. Congratulations.

Yes, the environment where I tested McGure's code doesn't exist now and I can't do the requested timings. I forgot to note this, sorry.

by **champagne** » Thu Nov 17, 2016 2:29 am

dobrichev wrote:Champagne, your results are very good. Congratulations.

Still too early for any congratulation, such partial work is always done under the threat of a big hidden bug, but reversely, sharing early findings can help to go in the right direction through contributions of the community.

At that point, the final result can still be in a range one to three depending on main parameters optimization and unexplored tracks.

The main unexplored track is the list of valid minimum sets of cells per band. I pick some values in your table at the top of the thread

Code: Select all: band N° Minclues "valid band3" with min clues 1 6 729 3 4 108 7 3 6 32 5 546 33 5 63

Most of the calls to the final step (solving band 3) are done with an untouched band 3 and a short list of non hit UAs in bands 1+2, looking for solutions with the minimum number of clues giving a unique solution of the puzzle.
Instead of using the UAs in band 3, or in parallel to doing that, one can use the list of "valid band3". This will clearly be the best way if the band 3 corresponds to the number 7 (min clues 3, 6 "valid band3"). Applying a selection of solutions already hit in the same way as McGarry cleared hit UAs of high degree, this can give a very efficient strategy for the final phase.

My priority will be to explore that, but I need first to reopen the corresponding old code and to work out some tables of "valid band3" to apply to a solution grid. Likely one or two weeks delay

Less exciting, the same can be done for min clues +1; min clues +2, but this seems of interest only when the last band has a low "minimum number of clues"

by **dobrichev** » Thu Nov 17, 2016 7:29 am

I am not surprised. The goal of enumeration of clue combinations required to resolve particular band is to use them in tabular form.
Isn't it more intuitive bands with more minimum required clues to benefit more from tabular matching? Comparing bands 7 and 8, for 5 clues the ratio is ~15/1.
"Touching" band 3 for some reason in the preliminary steps even truncates the (table of) valid clue combinations within it, and I see no problem in handling these cases.
Using pre-built indexes of the clue combinations in the way similar than UA are indexed would help for the larger tables, say these with > 50 combinations.

by **champagne** » Thu Nov 17, 2016 9:44 am

Hi mladen,

dobrichev wrote: Isn't it more intuitive bands with more minimum required clues to benefit more from tabular matching? Comparing bands 7 and 8, for 5 clues the ratio is ~15/1.

Here, you jump quickly in the situation of solutions with redundant clues. This is by far not the key situation in my preliminary test, (but done with a band3 of 6 as minimum number of clues). I keep that for later.

But the number of valid minimal solutions is still in the range { 3 - >1000 }
Another optimization factor is to discard all bands that will not appear as band 3 in the morphing process. This can help to go later in your direction.

dobrichev wrote:"Touching" band 3 for some reason in the preliminary steps even truncates the (table of) valid clue combinations within it, and I see no problem in handling these cases.
Using pre-built indexes of the clue combinations in the way similar than UA are indexed would help for the larger tables, say these with > 50 combinations.

Hetre, we are fully in-line. In that process, the band 3 remains un touched as long as the bands 1+2 don't have a valid solution, but then we fall in your description, Let me remind the above example.

solution for bands 1+2

Code: Select all: .....29....1.3......................8...4..3......7.6

At that point, the remaining UAs in the table of "others is the following (2 blanks to show band3 apart)

Code: Select all: ............................1......................... 11......................... i=0 ................................................11.... .....................11.... i=1 ..................1.1........1........................ .........1.1............... i=2 .........11..................................11....... .........11................ i=3 ...................11.........................1....... .11.......11............... i=4 ........1...1..1.....1....1........................1.1 ........................1.1 i=5 .........1..1..1..1..1....1........................1.1 ........................1.1 i=6 ....1.......1.1................1...................... ...1.1......11............. i=7 .....................11........1.......1.............. ....11.......11............ i=8

In such a position, trying to solve the band 3 with a valid global puzzle, the logic is to start with "pseudo UAs" of size 2 and to select the "valid bands3" still active.

This is at the end very similar to the filter with High degree UAs

by **blue** » Fri Nov 18, 2016 6:48 pm

champagne wrote:I added an edit to the previous post, and I have a side question.

Can you produce the reference time for the solution grid I am using as first validation of my code applying the Mc Garry process (for a 17)

Code: Select all
348672915591834672672591348214368597867945231935217864123456789456789123789123456 morph of entry

And why not, for the second in test, not too far from the first, but with significantly different timings in my code (shorter)

Code: Select all
214378695867295341935641872348962517591837264672514938123456789456789123789123456 morph of entry

dobrichev wrote:Yes, the environment where I tested McGure's code doesn't exist now and I can't do the requested timings. I forgot to note this, sorry.

On a 3.4Ghz i7 (turbo boosted to 3.7Ghz), and with solver calls for the final puzzle list disabled, these are the results:

Code: Select all: Inspecting grid no. 1 ... found 3476392 hitting sets (48.1887 seconds, average = 48.1887 sec/grid) Inspecting grid no. 2 ... found 407854 hitting sets (27.2066 seconds, average = 37.6976 sec/grid) 75.3953 seconds (3884246 puzzles)

With solver calls enabled, they change to this:

Code: Select all: Inspecting grid no. 1 ... found 3476392 hitting sets (96.8766 seconds, average = 96.8766 sec/grid) Inspecting grid no. 2 ... found 407854 hitting sets (32.8382 seconds, average = 64.8574 sec/grid) 129.715 seconds (3884246 puzzles)

This is with the code's original solver ... Brian Turner's "BBSolver".

BTW: The gentleman's name is Gary McGuire. See There is no 16-clue puzzle.

by **champagne** » Fri Nov 18, 2016 7:48 pm

Hi blue,

Thank you first to remind us the correct name of Gary McGuire.

Many thanks also for the reference. To be sure to read the results in the correct way, two questions

Is it the search of a 17 clues or is it still the search of a 16 clues?

Is it correct to say that the call {on or off} is to check whether the "solution" hitting all sets has a unique solution

The New Sudoku Players' Forum

Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

17 search using bands

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles

Re: Bands and low-clue puzzles