## Composing a Solution Grid

Programs which generate, solve, and analyze Sudoku puzzles

### Composing a Solution Grid

A short time ago, someone posted about his method of generating new puzzles, but I can no longer find it. Anyway, it aroused my curiosity regarding a quick method to produce a new solution grid. What follows is a result of spasmodic manual experiments on a spreadsheet that I have made as other commitments have permitted.

As assignments are made, the system depends on two basic routines to make the resultant eliminations: checking for naked and hidden tuples and box/line reductions.
This is the sequence of operations:
1. Box1 is randomly seeded with a full set of digits
2. In each mini-row and column, one digit is randomly selected to repeat in the opposite diagonal direction to the other two.
__ This braiding analysis approach doesn't result in any assignments, but it does reduce the mini-rows in tier1 and the mini-columns in stack1 to triples.
3. Box9 is randomly seeded with full set of digits and this is repeated if necessary until no empty cells result in boxes 3 or 7.
__ An empty cell will be caused by a chance matching of a triple in b3 or b7 with a triple in the opposite direction in b9.
4. Two digits are randomly set in diagonal cells in box5 and then in the six remaining boxes.
5. The cells in box 5 are now filled by randomly selecting candidates, prioritising the cells that hold the fewest.
6. The great majority of cells will now be reduced to singles, but for those that aren't, the final candidates to assign are selected as for box 5.
Code: Select all
` *-------------------------*-------------------------*-------------------------*                         | 2       7       9       | 36      36      8       | 15      15      4       |                         | 1       6       3       | 45      45      9       | 7       8       2       |                         | 8       4       5       | 127     127     127     | 3       6       9       |                         *-------------------------*-------------------------*-------------------------*                         | 346     9       127     | 8       1234567 1234567 | 1245    157     36      |                         | 346     8       127     | 1234567 9       1234567 | 1245    157     36      |                         | 346     5       127     | 123467  123467  123467  | 124     9       8       |                         *-------------------------*-------------------------*-------------------------*                         | 9       12      46      | 12467   12467   12467   | 8       3       5       |                         | 7       23      8       | 9       235     235     | 6       4       1       |                         | 5       13      46      | 1346    8       1346    | 9       2       7       |                         *-------------------------*-------------------------*-------------------------*`

This example grid shows the state of play at the end of step 4.
It shows that in step 3, if r7c7 had held (3) rather than (8), then r3c7 would have been empty as (369)r3c789 would have been matched by (369)r789c7.

This sequence aims to get the fastest descent by keeping the focus narrow at every stage, and by restricting the available choices as quickly as possible. Clearly naked and hidden singles must be assigned as soon as they are revealed, but after the eliminations resulting from pairs are made, trials have shown that they can be safely ignored until after box 5 has been assigned. In combination, these assignments produce a larger number of follow-on eliminations, which will convert most of these ignored pairs into singles in the process.

From limited manual trials, it appears that only step 3 can cause an invalid grid to be produced, and this can be quickly checked and corrected when necessary. However, this needs to be tested using a much bigger sample size. Depending on circumstances, I may code this approach into a worksheet function, but I would like to hear any comments from more experienced programmers first.

DPB
.
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

Hi David P Bird,

David P Bird wrote:A short time ago, someone posted about his method of generating new puzzles, but I can no longer find it.

I think you are referring to this post that was nicely documented by StrmCkr.

R. Jamil
rjamil

Posts: 683
Joined: 15 October 2014
Location: Karachi, Pakistan

### Re: Composing a Solution Grid

rjamil wrote:I think you are referring to this post that was nicely documented by StrmCkr.]

Hi RJ,
Thanks! Yes, that was the one I was thinking of, but I thought the discussion was much more recent. I guess I must have read it when I was doing some browsing!

Thanks a lot for the link.

David
.
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

if you are referring to a thread i deleted, as i am rewriting it.

a different way to generate all grids.

Code: Select all
`these are the fixed starting digits.123|456|78945.|...|......|...|...-----------2..|...|......|...|......|...|...-----------...|...|......|...|......|...|...`

Code: Select all
`123|456|78945A|EEE|DDDF1FF|C1CC2|BBB-----------2 ..|I  K J |LLLG2..|I1 K J1|...G1..|I2 K J2|MMM-----------H..|...|...H..|...|...H..|...|...`

A = 1 of [6,7,8,9]: {skip as solving BC forces A as 1 digit}
B = [4,5] + ([A] * [6]) or [4,5] + 1 of [1,2,3] when [A]*[6] = []
C = 3 of ([1,2,3,7,8,9] – [B]) where C * ([ 7,8,9] – [A]) = [], and [C] * [1,2,3] => [1,2,3] – [B]
{Sum of (BC) = 360 permutations.}

D = [1,2,3,4,5,6] – [B] ->> (3*2*1) {skip D as it is reduced to 1 permutation set of 3 digits}

E = [ 1,2,3,7,8,9] – [C] ->> (3*2*1) {skip E as it is reduced to 1 permutation set of 3 digits}
Note: E is reduced to 2 permutations by I,J

F = [ 6,7,8,9] – [A + B + C] ->>( 2 * 1) {F1 is solved by GH, BC, and [1,2,4] F is a permutation of 2 digits}

G = 2 of [3,5,6,7,8,9] - [F1]->>(2*1): Note combination set listed in sequential order

H = 3 of [3,5,6,7,8,9] – [F1 +G] ->>(3*2*1): Note combination set listed in sequential order

{Sum of (BCGH) ->> 1.08 x 10^4 essentially different placements.
1.08 x 10^4 * (6*6*2) * 1 * 1 ->> 77,600 different Band 1 and Col 1 solutions.

I = [1,2,3,5,6,7,8,9] – [C1] where I <> [E], and I can contain 0,1,2 #’s of [E]
{Sum of (BCGHI) ->>{3.672 x 10^5}

J = [1,2,3,4,5,7,8,9] – [I] – [C2] where J <> [E] And [1,2,3,4,5,6,7,8,9] – [I+J] <> [E]
{Sum of (BCGHIJ) ->>{6.220872 * 10^6}

K = [1,2,3,4,5,6,7,8,9] – [G,I, J] {Skip K as it is reduced to 1 permutation set of 3 digits}
6.220872*10^6 * (6*6*2*2)*1*1 =895,805,568 {so far}

L = [1,3,4,5,6,7,8,9] – [I,J] where [L] <>[ K] and [L] = (([I1+J1] * [I2+J2]) – [I+J] ), and L = [G]- ([1..9] –[L+J])
{Sum of (BCGHIJL) ->>{1.15470036 x10^8}

M = [1,2,3,4,5,6,7,8,9] – [G1,L,I2,J2] where [M] <> [K], and [L] =( [G2]+[i1]+[j1] -[G1,L,I2,J2])

I have the last band also coded probably not as optimized {so it is left out atm} as the first 2 bands,
i have done the mathematical calculation for increasing grid count per step down to L step this alone took me 1 days to cycle in full {on a ryzen 1800+ clocked at 4ghz via single thread coded in free-pascal}

In total I'll have 14 steps to generate all possible unique single solution grids.

{which is a lot when u could do it with 9 steps of 1 template per step.}
Last edited by StrmCkr on Wed May 30, 2018 4:55 am, edited 1 time in total.
Some do, some teach, the rest look it up.
stormdoku

StrmCkr

Posts: 1318
Joined: 05 September 2006

### Re: Composing a Solution Grid

Hi StrmCkr,

Is yours "a different way to generate all grids" better than braid analysis?

Added:

Allow me to conclude as both procedures uses braiding at some point.
Also, in my this post, the technique to reduced the chance of backtracking probably is braiding. (As discussed with StrmCkr on several pm in short time.)

R. Jamil
rjamil

Posts: 683
Joined: 15 October 2014
Location: Karachi, Pakistan

### Re: Composing a Solution Grid

Hi StrmCkr,
StrmCkr wrote:In total I'll have 14 steps to generate all possible unique single solution grids.

That is the obvious difference between our individual aims! My approach can't possibly find every valid grid as it will miss any with repeating triples in every box, because I force the repeating pairs & singles in stack1 and tier1.

I did this because puzzles rich in repeating triples are not so interesting, and my aim was to find an alternative to John Musgrave's approach. If you like, you are being mathematically much purer than I am, because you are allowing for every possibility, and I'm not.

From the description on John Musgrave's website, most of his attempts to produce a solution grid fail. When this happens, he simply makes further attempts until he succeeds. He found that this was faster than incorporating any checks into his random generating code. This then became the challenge for me – could I devise a fast routine for producing interesting grids where the resultant grid would always be valid?

StrmCkr wrote:In total I'll have 14 steps ... {which is a lot when u could do it with 9 steps of 1 template per step.}

I made much the same observation, which is why step 4 in my sequence is equivalent to picking templates for two of the digits. After that, the forced singles quickly start to roll in and start to dominate the process.

R Jamil, to cover your point about the use of braiding analysis concepts:

Essentially I was trying to fill boxes 1,5, & 9 with compatible digit distributions and then using a pair of braiding patterns to provide additional constraints to force the other 6 boxes. Now, the braid patterns in stack 1 and tier 1 and the digit distribution in box 9 must be compatible. By trial and error, I determined it was easier to set the braid pattern and simply accept or reject a shuffled digit distribution. Usually the first random distribution tried will be compatible.

The outcome from this approach is that by the time step 6 is reached, three boxes have been randomly filled (27 cells) and two digits have been completely assigned randomly (12 more cells). The only cells that now need to be tidied up are those in any deadly patterns that still exist. Most of the work has been done by the checking for singles. However, on occasions pairs and line/box inferences may also be found. The system consequently relies on efficient scanning procedures for these and Jason Lion has made some interesting comments about this < here >

Finally, please remember that my experience with this system is limited to manual trials on a spreadsheet. There may still be times an invalid solution grid might be produced from some deep-seated compatibility issue.

David
.
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

Hi David P Bird,

David P Bird wrote:Finally, please remember that my experience with this system is limited to manual trials on a spreadsheet. There may still be times an invalid solution grid might be produced from some deep-seated compatibility issue.

I don't think so. If you are started with empty Sudoku grid, every possibility/clue/pencilmark will generate full grid. As far as braid analysis eliminations are concerned, they will only remove those contradictory clues earlier that will be eliminated one by one if grid is filling sequentially.

But, this is my strong feeling and may be I am wrong.

R. Jamil
rjamil

Posts: 683
Joined: 15 October 2014
Location: Karachi, Pakistan

### Re: Composing a Solution Grid

Progress Report

Working in limited spells over the last few weeks, I now have a custom Excel function NewSolnGrid(ID,SerialNo) based on the approach described in the opening post. In Excel, without the benefit of turning screen refreshing off, it produces 50 new grids in about 2 seconds, quite adequate for my purposes. Coded in C, and run as a batch file, it would run much faster.

As box/line inferences were only rarely found, grid checking has been limited to looking for naked and hidden singles and doubles. To identify the pairs, when a cell is reduced to a bivalue, a one-off search is made for a peer cell holding the same digits for a naked pair. For hidden pairs this translates to when a digit is reduced to being bilocal in a house, making a one-off search for a companion digit that is restricted to the same two cells.

Invalid grids result when a cell is made empty or is required to hold two clashing digits. The empty cell condition is checked when every elimination is made, and the digit clashes are quickly identified when the same cell is reserved twice in a first-in, first-out queue of forced assignments. In either case I have I have followed John Musgrave's lead – it is simpler and quicker to start again than to try to backtrack.

Following the exchange with StrmChkr, I also came to consider that when digits are selected to occupy box 5, it would be quicker to go on and assign their full templates. This only requires that the digit to be also assigned in boxes 6 and 8, as the braiding patterns will then force the rest of the template. The benefit of this approach is that it quickly limits the scope of the subsequent checks to be made, but I have yet to try to take full advantage of that.

rjamil, my earlier conjecture that a 100% success rate might be possible has proved to be far too optimistic even before I took box/line inferences out of the checks to be made. Checking some failures that were produced, I found that it required quite long AICs to prove that a randomly chosen assignment was false. It seems that all the new randomising sequence does is to reduce the chance of a wrong random selection being made, not eliminate it. Trials have shown a success rate of 89% for producing a valid grid at the first attempt.

David PB
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

Post Script
I have now developed a sister spreadsheet function for a brute force solver. It's nothing remarkable, but does have a feature that is perhaps worth mentioning.

For puzzles with multiple solutions, the solver continues searching for up to a total of 20 solutions. As these are found, it builds a PM grid showing the digits that can occupy the variable cells. In the 81-character result string, the fixed cells are then shown as digits and the variable ones are shown as dots. When run for a single puzzle, the PM grid is retained in memory and is available for display.

For puzzles with far more than 20 solutions some cells may be wrongly reported as fixed when they aren't, but I can live with that. By adding extra clues for the dot cells to reduce the solution count, these cells will soon be revealed.

I now have various half-baked ideas of what use I can make of the composing and solving tools ... I may be gone for some time.

David PB
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

Hi David,

You can make the baking funny by trying your sister spreadsheet function with this valid puzzle
Code: Select all
`1...2.3....41....5.6...7.8..4..1.2..2..7....4..3....7.4..8..5....8..6....9..7...8`

from here and see how the dots vary when removing each one of the givens.
Hidden Text: Show
Answer: All non-givens must be dots after each single given removal.
dobrichev
2016 Supporter

Posts: 1835
Joined: 24 May 2010

### Re: Composing a Solution Grid

Hi Mladen, thanks for your response and I'm pleased that my use of common English parlance keeps you amused!

As for your puzzle, I see what you mean. I took one given out and my solver found 20 solutions where 6 of the cells had the same digit so were misreported as solved. That was not such a surprise. However, where the real surprise was that for the unsolved cells 22 out of 55 failed to contain the true digit as a candidate!

Progressively adding digits for dot cells I eventually ended up with this solution grid, but it wasn't quick.
182659347734128695569437182847913256216785934953264871471892563328546719695371428

So just how common are puzzles that meet the criterion "every given constrains all non-givens"? Furthermore, how can this property be tested?

I conclude therefore that, as a general purpose solver, my latest effort should have this feature removed. Nevertheless, I'm hopeful that as a means of checking if a composed solution grid can be solved using a trial set of clues (which was behind my reasoning for adding it) it still could be helpful.

Thanks a lot for your input. On such matters, I'm still very much a hammer chewer. (You'll have to say that out loud to have a chance of understanding it.)

David
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

Hi David,

I posted this puzzle here to show you that I have something similar in mind but realized it only for extreme distributions.
Not to test your solver, where you can reach correct results only by sufficiently increasing the 20 solutions.

So just how common are puzzles that meet the criterion "every given constrains all non-givens"?

Maybe Eleven can tell you more. To my knowledge nobody tested random puzzles for this criterion.
One biased number that is still better than nothing, from 95,147,836 relatively hard puzzles sharing the same pattern (Patterns game #315) 1825 meet this criterion which is ~1:50,000.

Furthermore, how can this property be tested?

One method is yours but with sufficiently large number of solutions.
My method is focused only to this extreme and is computationally cheap (95,147,836 puzzles tested for 34,000 seconds = 10 million puzzles per hour).

Code: Select all
`for each g in givens {  take the original puzzle and its only solution  remove g from the givens  for each n in non-givens {    exclude from pencilmarks of n the value from the solution    solve up to the first solution    if no solution then return failure  }}return success`

I conclude therefore that, as a general purpose solver, my latest effort should have this feature removed.

I strongly disagree.
The extremes are probably beyond the interest of the solving using human-acceptable methods, even with computer assistance. This was demonstrated by the number of replies in the Puzzles section.
The common puzzles could be evaluated by your method and interactively playing with such a tool could suggest ways to achieve complex and interesting puzzle structures. Hopefully extremely interesting and not extremely complex.

Now you have a reserved term - the SSF method.
dobrichev
2016 Supporter

Posts: 1835
Joined: 24 May 2010

### Re: Composing a Solution Grid

David P Bird wrote:A short time ago, someone posted about his method of generating new puzzles, but I can no longer find it

Another common method of generating puzzles is based on a brute force solver, at each step the proposed puzzle is feed to the solver and if a solution is found the puzzle generation process is complete. The basic method is to initially place N digits at random (N selected for max speed, typically around 5), then at each subsequent step a single random digit is placed in a random cell and the proposed puzzle is again fed to the solver. After M failed attempts at adding random digits (again M tuned for generation speed) you give up and start over from scratch. I'm not sure this interests you, but it is an interesting contrast to the discussion to this point.

JasonLion
2017 Supporter

Posts: 642
Joined: 25 October 2007
Location: Silver Spring, MD, USA

### Re: Composing a Solution Grid

Hi Jason, thanks for your description of an alternative approach which sounds quite viable.

Clearly I'm a late arrival to this party and at the moment am concerned with getting a better understanding of the subject. I am still very uncertain where this will lead. When I started I had neither a grid composer nor a solver, and it seemed easier to start with the composer.

My current composer takes about 0.04 seconds per solution grid while my solver takes about 0.08 seconds per puzzle so at those speeds your alternative method would take longer. I've got no idea how much that ratio would change if they were translated from Visual Basic for Applications into C++. (As my previous programming experience was many years ago, I'm rusty and out of date now, and must rate myself at not much better than a novice.)

David PB
.
David P Bird
2010 Supporter

Posts: 1043
Joined: 16 September 2008
Location: Middle England

### Re: Composing a Solution Grid

An easy way to randomly generate a solution grid with same probability for each grid is used in my solver/generator I recently posted in the Software section.

The algorithm is extremely simple:
1. Fill all rows with numbers from 1..9. in random order.
2. Select a random row r and in this row two random cells (r,c1) and (r,c2).
3. Count the number n1 of "wrong" numbers in the two columns c1 and c2 and in the two blocks b1 and b2 the (r,c1) and (r,c2) are in.
4. Swap the content of (r,c1) and (r,c2).
5. Count again. If the number n2 of "wrong" numbers is now greater, swap back.
6. Goto 2 and repeat until the grid is valid. Validity can be computed by the initial value I for the number of wrong numbers and always updating this value: I <-- I - n1 + n2.
Surprisingly the algorithm does not seem to get stuck in a local minimum. Takes about 30 ms to generate a grid.

To generate a puzzle from this grid I randomly remove clues and test if the reamaining puzzle still is solvable. Solvability is sufficient if the solving procedure uses only basic methods like hidden/naked singels, block-line interaction etc. With the SAT-solver as method included I also have to test for uniqueness of the solution.

hkociemba1

Posts: 60
Joined: 08 August 2018

Next

Return to Software