## Generating large grids

For fans of Killer Sudoku, Samurai Sudoku and other variants

### Generating large grids

hkociemba wrote:Good progress concerning generation of a valid grid. Generation of 100x100 took 15 min though I did not try any optimization yet. For 225x255 though I do not think it will finish in a reasonable time yet.
The algorithm yet is extremely simple:
1. Fill all rows with numbers from 1..N in random order.
2. Select a random row r and in this row two random cells (r,c1) and (r,c2).
3. Count the number n1 of "wrong" numbers in the two columns c1 and c2 and in the two blocks b1 and b2 the (r,c1) and (r,c2) are in.
4. Swap the content of (r,c1) and (r,c2).
5. Count again. If the number n2 of "wrong" numbers is now greater, swap back.
6. Goto 2 and repeat until the grid is valid. Validity can be computed by the initial value I for the number of wrong numbers and always updating this value: I <-- I - n1 + n2.
Surprisingly the algorithm does not seem to get stuck in a local minimum (most of the time?) and this greedy algorithm works. For a standard 9x9 creation time is only about 30 ms on average, for a 49x49 20 s to 30 s.

m_b_metcalf wrote:Good to know that your algorithm works up to higher sizes. You prompted me to make some new timings on ancient code, the result being:
Code: Select all
` boxsize    number    time   average time----    ------    ----   ------------  3    1,000,000     28s      28µs  4      1,000     0.41s    0.41ms  5      1,000     6.28s    6.28ms  6         10    19.70s    1.97s    7          3    12.55s    4.18s   8          1    stallsOn an i5, with no output.`

For x-sudokus my method works only for 9x9 and 16x16. For larger sizes I offer a solver a pseudo puzzle, empty but with the diagonals already in place, and take the first solution it finds. That works for up to 36x36.

hkociemba wrote:For small blocksizes you algorithm works much faster than mine. For XSudoku mine does not seem to work at all. I have some idea how to manage this. I implemented a complete generator set of symmetry operations for non-XSudokus into my program now (permute_band, permute_stack, permute_bands, permute_stacks, reflect_diagonal,,relabel_nums). Only the last two also are symmetries of XSudoku too. So by applying the first 4 I will try to fix the XSudoku diagonals without destroying the Sudoku grid. But this will have to wait until next week. Btw., my program is written generally and works also for rectangular blocksizes. But I think XSudoku works best for quadradic blocksizes and I will restrict XSudoku to this kind of blocks.

I'm making a new thread for this topic.

Please note that all those permutations do not create what are regarded as essentially different grids. You can see more on this topic in this Mathematics thread, and more particularly in this paper. One of the authors, known as RedEd on that forum, had a way of determining biases in the grids. I submitted him a set of mine and it came off fairly well. Also, I submitted a large number of permutations of the Most Canonical grid and he discovered immediately that it was all the same basic grid (see here).

Regards,

Mike Metcalf

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Hi, Mike!
I can propose a method of large solution grids generation. This method works well (200000 nonequivalent grids/second for grid size 225 x 225 on my notebook), but I am not sure - is this algorithm acceptable for you. The idea is simple. First, MC grid of reqired size is constructed. Then, new nonequivalent grids are produced from it by permuting Two-row (Two-column) UA sets. Rows (columns) for UA search are selected randomly.

Serg
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Serg wrote:Hi, Mike!
I can propose a method of large solution grids generation. This method works well (200000 nonequivalent grids/second for grid size 225 x 225 on my notebook), but I am not sure - is this algorithm acceptable for you. The idea is simple. First, MC grid of reqired size is constructed. Then, new nonequivalent grids are produced from it by permuting Two-row (Two-column) UA sets. Rows (columns) for UA search are selected randomly.

Serg, I'm not the arbiter of what's acceptable or not! I wrote an algorithm to generate MC grids of arbitrary size many moons ago. By making all the various permutations of rows/columns/bands etc. it's easy to hide the fact from a human observer. Your suggestion seems much better, because RedEd would find it less biased.

I can modify my usual algorithm to generate semi-random grids of size 64x64 and 100x100 very fast by simply making each even row a reverse copy of the preceding odd row. It doesn't work for larger sizes.

Regards,

Mik

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Hi, Mike!
m_b_metcalf wrote:I can modify my usual algorithm to generate semi-random grids of size 64x64 and 100x100 very fast by simply making each even row a reverse copy of the preceding odd row. It doesn't work for larger sizes.

If I undestood you idea correctly, you want to decrease initial number of conflicts between random rows in your algorithm to speed it up. Right?

Some comments to my algorithm, proposed in previous post. When I wrote that I was not sure - is it acceptable, I mean that generated successive solution grids are very similar to each other (they differ by several cells only). It is drawback of the algorithm if generated solution grids will be used for manual puzzle construction. But this drawback can be compensated if generated solution grids will be randomized by random transposing, swapping bands/stacks, rows/columns. The key thing that all generated grids are nonequivalent.

Serg
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Hi, Mike!
Here are some timings of solution grids generation by my algorithm.
Code: Select all
`Box size   Grid size   Time of generation 1000000 essentially different solution grids     3x3         9x9     1 sec   15x15     225x225     5 sec   30x30     900x900    21 sec   60x60   3600x3600   120 sec`

Note: time of generation doesn't include time of results printing.

Serg
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Serg wrote:Here are some timings of solution grids generation by my algorithm.

Serg, Sorry for the delay in replying - I'm travelling at the moment. When I return home I may give your idea a try. In the meantime, let us not forget what RedEd wrote some years ago:

"> btw, I think the quest for fast unbiased grid generation is mainly
> academic: about the only use I can think of for such a perfectionist
> algorithm is in the estimation of the total number of proper puzzles. For
> other purposes, especially puzzle generation, anything with a scaled
> Z-score of about 100 (absolute value) or less seems absolutely fine. "

Regards,

Mike

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Serg,
I've dug out some eight-year old code and added an unavoidable sets routine. I then implemented your method.
For a box size of 14 it's slower than yours, 118s for 1M puzzles. I have two remarks:

1) I don't think it's applicable for odd box sizes as there are no UA4s.

2) Even if I randomize the MC grid and apply all the swaps possible, the underlying structure is still clear once you try to use the result to produce a symmetric puzzle. Such puzzles are very, very easy to solve, requiring only singles and, furthermore, have a visible, linear structure, as shown by the 36x36 below.

Regards,

Mike

Code: Select all
`  . 18 16 29 12  .  . 17  .  3 21 34 25 26  . 24 36  .  .  1 28  . 23 13 35 33 27  .  8  .  .  6  9  2 20  .  .  4  .  .  . 19 27  . 35  .  .  8  .  .  .  . 14  1 36 30  .  .  .  .  5  .  . 21  .  3 11  .  .  . 29  .  . 27  .  . 35 22  . 20  .  .  6  2 18 12 16 11  .  .  .  .  5 31 34  3 26 30  .  . 10  . 13 14  .  .  1  . 21  . 34 17  .  .  .  .  .  .  .  .  4  .  2 19  . 20  7  . 35 27  . 22  .  .  .  .  .  .  .  . 26 10  . 25 14  . 23  1  .  . 25  .  . 24 36 10 31  5 34  . 21 17 32 29  . 18 16 11  9 20  4  .  . 19  .  . 35  8  . 27  . 25  .  . 26 24 15  1 28 13 14  . 27 35  . 22  .  .  .  .  9  .  2 19  . 29 18 32 16 11  3 21  .  . 17  .  .  .  . 22  8 33  . 13  .  1  . 28  .  .  . 30  . 24 31  . 34  .  .  . 16  . 32  . 12  . 20  4  2  .  .  . 18  . 12 11  . 29  .  .  2 20  .  9  7  8 35  .  . 22 15  .  . 14 28  1 10  . 36 25  .  . 17  . 34  5  . 21 31  .  5  .  . 17 36  . 10  . 25  . 14  . 28  . 15 13 27 22  .  7  . 33  . 19  .  4  . 20 29  .  . 12  . 32  .  . 28  . 23  1  7 22  8 33 27  .  6  .  . 20  4  .  . 11 16  .  . 29  .  3 21 31  5 17 30 25  . 26  .  .  .  6  9 19  2  . 32 11 16 29 18  . 21 34  5  .  .  .  .  .  . 36 26 30  . 13 14 15 28  1  . 27  8 35 22  . 25 36  . 24  .  .  .  3 34  . 31  .  . 16  . 29 18 11  4 19  2  .  9  .  . 22  . 27 35  .  .  . 23  . 13 14 34  3  .  . 31  . 12  . 11 18  . 32  .  . 20  9  2  .  .  8 22 35  .  . 15  . 13 23  . 28  . 30  .  . 10 26  . 13  .  . 15 28  . 10 24  . 30 36  3 31  .  5 34 21 29 16 11  . 32 18  4  6  .  2 20  . 27 33  .  .  8  .  .  . 30  . 25  .  .  . 13  .  1 14  . 27 33  .  .  .  .  .  .  9  6  . 18 32  . 16  .  .  . 17  . 21  .  .  2 19  .  6  .  . 35  8  .  . 33  .  . 15  . 28 23 14 30 10 24  . 36  .  . 21  .  . 17  5  .  . 11  . 16 12  8  . 33  . 27  .  9  .  .  4  .  . 11 18  .  . 16 32 17 34  .  . 21 31  .  . 24  .  . 26  .  1  . 14  . 28  . 11 29 32  . 12  5  .  . 31 17  .  . 25 30  . 10 36  1 23  . 28 14  .  .  7 22  .  . 35  4  . 19  6  2  .  . 33 22 27  .  8 23  .  . 14 13  .  . 36 24  . 26 25  3  5  . 34 31  .  . 18 29  .  . 16  6  . 20  4  9  . 26  . 24  . 36  . 34  .  . 21  .  . 29 32  .  . 12 18 19  9  .  .  4  6  .  . 33  .  .  8  . 13  . 15  . 23 12 29  . 18  .  .  2  9  .  . 19  .  .  7  .  8 35 27 13 28  1  . 15  .  . 25  .  . 24 10  .  . 17  .  5 34  .  .  3  . 21  .  .  . 30  . 24 25  . 14 13  .  .  .  .  .  .  8 27  .  6  4  .  9  .  .  . 11  . 18  .  .  .  1  .  . 14 23  . 35 33  . 22 27 20  6  .  2  9  4 11 12 29  . 18 32 21 31  .  5  3  . 36 24  .  . 26  .  9 20  .  .  6  . 16  . 29 32  . 18  .  .  3 34  5  .  . 26 30 10  .  . 14  .  1 28  . 23  . 22  .  . 35  8 30 26  . 10  .  .  . 14 15  . 23  .  . 22  . 27 33  8  2  6  4  . 20  .  . 16  . 29 32  .  .  . 31  . 21  3  .  5 21 34  3  . 11 32 18 12 16  .  9 19  6  .  .  .  .  .  . 22 33 35  . 23 28  1 14 15  . 10 25 30 36  .  .  .  6  . 19  4 22  7 27 35  8  . 28  .  . 15  1  .  . 36 25  .  . 26  . 34  5 17 21 31 12 16  . 29  .  . 33  .  7  .  . 27 19  .  4  .  2  . 12  . 32  . 29 16 34 21  .  3  .  5  . 10  . 30  . 25 28  .  .  1  . 13 29  . 32 16  . 18  .  . 31  5  . 17 26 24 36  .  . 10 23  .  . 13  1 28 22  . 35 33  .  .  9  .  4 20  . 19  .  .  . 23 13 15  . 36  . 26  . 30  .  .  . 31  . 34 16  . 18  .  .  . 19  .  9  .  6  . 35  8 27  .  .  .  . 16  .  . 29 32 20  4  6  2  9  .  8 33  .  7  .  .  .  . 14  . 13 23  . 26 10 24 25 36 34  5  .  . 31  . 19  .  4  9  .  . 29  .  . 16 12 11 34 17 31  .  3  5 26 25  . 30 24 10  1 28 23  .  . 14  .  .  7 22  . 33  3  . 31  5  .  .  .  .  .  .  .  . 23  . 15 14  . 28 35  .  7 33  .  8  .  .  .  .  .  .  .  . 32 11  . 29  .  8  .  . 33  7  . 15  .  . 28 13 10 30 25 36  .  .  .  . 21 17  3 34 29 12  .  . 18  .  2  9  .  .  4  .  . 10  .  .  . 36 17  . 21  .  .  3  .  .  .  . 11 12  9  4  .  .  .  . 33  .  . 22  .  7 23  .  .  . 15  .  . 23 15 28  1  .  . 27  .  8 35 22  2 20  .  6 19  .  . 18 32  . 11 16 17  5 34  . 31  .  . 26 36 24 25  .  No. fixed: 720 `

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Hi, Mike!
m_b_metcalf wrote:1) I don't think it's applicable for odd box sizes as there are no UA4s.

It is known, that 9x9 MC grid has no U4 UA sets. I don't know - has U4 unavoidables MC grid of arbitrary size provided that box sizes are odd? But absence of U4 doesn't disturb me. I use more general class of unavoidable sets - 2-row unavoidable sets. This class of UA sets includes not only U4, but also some U6, etc. - up to U14. blue published excellent description of 2-row (2-column) unavoidable sets in the thread Investigation of one-crossing-free patterns. A row (column) may contain no 2-row UA set. (Such event is not so rare for 9x9 grids (measured probability - about 35%), but the larger grid's size, the smaller probability of 2-row UA absence for given row.) In such case we should choose another pair of rows (columns) for UA search.
m_b_metcalf wrote:2) Even if I randomize the MC grid and apply all the swaps possible, the underlying structure is still clear once you try to use the result to produce a symmetric puzzle. Such puzzles are very, very easy to solve, requiring only singles and, furthermore, have a visible, linear structure, as shown by the 36x36 below.

When I said about randomizing I meant random choice of permutations. It can be done in 9 steps (this example is applicable for 9x9 grids).
1. To transpose or not to transpose the grid? (One randomly choose 1 variant from 2.)
2. In what way to permute bands? (One randomly choose 1 permutation from 6.)
3. In what way to permute stacks? (One randomly choose 1 permutation from 6.)
4. In what way to permute rows in B123 band? (One randomly choose 1 permutation from 6.)
5. In what way to permute rows in B456 band? (One randomly choose 1 permutation from 6.)
6. In what way to permute rows in B789 band? (One randomly choose 1 permutation from 6.)
7. In what way to permute columns in B147 stack? (One randomly choose 1 permutation from 6.)
8. In what way to permute columns in B258 stack? (One randomly choose 1 permutation from 6.)
9. In what way to permute columns in B369 stack? (One randomly choose 1 permutation from 6.)

If one uses 36x36 grid (as in your example), one should choose possible bands permutation from 72 (6!) possible permutations, etc. It is not so simple task, I agree.

Serg
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Serg,
Well, I use only the basic UA4, so that's why we differ there. Also, I do apply all these random permutations as you can see in the example below. However, you can also see how mini-rows and mini-columns repeat within a band, in different orders, and the puzzle generation appears to be affected by that. Does your algorithm find any UAs in this grid?

Regards,

Mike

Code: Select all
`  8  2  3  9  1  7  5  6  4  6  5  4  3  2  8  1  7  9  7  1  9  4  5  6  2  8  3  5  9  7  6  4  2  3  1  8  2  4  6  8  3  1  9  5  7  1  3  8  7  9  5  4  2  6  3  6  2  1  8  9  7  4  5  4  7  5  2  6  3  8  9  1  9  8  1  5  7  4  6  3  2`

and a puzzle with only 20 givens, solvable with only singles:
Code: Select all
` . . . . . . . . . 6 5 4 . . . . . . . . . . . . 2 8 3 . . . 6 4 . 3 . . 2 . . . . . . . 7 . . . . 9 5 . . . 3 6 2 . . . . . . . . . . . 3 8 9 1 . . . . . . . . .`

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Hi, Mike!
m_b_metcalf wrote:Serg,
Well, I use only the basic UA4, so that's why we differ there. Also, I do apply all these random permutations as you can see in the example below. However, you can also see how mini-rows and mini-columns repeat within a band, in different orders, and the puzzle generation appears to be affected by that.

Mike, I think we should longer generate grids to go far from MC grid. Say, 10^9 grids. Hope, they will be more random themselves.
m_b_metcalf wrote:Does your algorithm find any UAs in this grid?

Code: Select all
`  8  2  3  9  1  7  5  6  4  6  5  4  3  2  8  1  7  9  7  1  9  4  5  6  2  8  3  5  9  7  6  4  2  3  1  8  2  4  6  8  3  1  9  5  7  1  3  8  7  9  5  4  2  6  3  6  2  1  8  9  7  4  5  4  7  5  2  6  3  8  9  1  9  8  1  5  7  4  6  3  2`

It's easy to find U6 in r1/r2 rows.
I start from r1c1 cell. 8 --> 6, 6 --> 7, 7 --> 8. Stop! We found U6, containing cells r1c1, r2c1, r1c8, r2c8, r1c6, r2c6.

Serg
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Serg wrote:It's easy to find U6 in r1/r2 rows.
I start from r1c1 cell. 8 --> 6, 6 --> 7, 7 --> 8. Stop! We found U6, containing cells r1c1, r2c1, r1c8, r2c8, r1c6, r2c6.

Ah, yes, it's replete with such U6s. If I make such a substitution just once in each band, we get sufficiently far from MC that the puzzle generator immediately produces much more interesting puzzles, such as the one below (now posted on the Puzzles thread). However, since my U6 code is written, foolishly, for 9x9 only, and I don't intend to upgrade it, I'll leave it at that for the moment, and wait for hkociemba to show up again.

Regards,

Mike
Code: Select all
` . . . 9 . . 5 7 . . . . . . . . . . . 1 9 4 . 6 . . . . 4 . . . 2 9 . . 2 . 6 8 . 1 3 . 7 . . . 7 . . . . . . . . 2 . 9 7 4 . . . . . . . . . . . 8 1 . . 4 . . .`

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Hi, Mike!
I generated a bunch of 9x9 sudoku solution grids and propose you to try 100000-th and 1000000-th grids as source for puzzles. They looks random. (At each step there was swapped 1 UA only.)
Code: Select all
`Grid number 100000     Grid number 10000008 1 7 2 6 4 5 9 3      6 1 5 2 8 4 3 9 75 6 2 9 3 1 4 7 8      3 4 9 7 1 5 2 6 89 3 4 5 8 7 2 6 1      7 8 2 6 3 9 5 4 11 2 3 7 4 9 8 5 6      1 9 6 3 2 7 8 5 47 9 8 1 5 6 3 2 4      5 7 4 9 6 8 1 3 26 4 5 3 2 8 9 1 7      8 2 3 5 4 1 6 7 93 7 1 4 9 5 6 8 2      2 3 1 4 7 6 9 8 52 5 6 8 7 3 1 4 9      4 5 8 1 9 3 7 2 64 8 9 6 1 2 7 3 5      9 6 7 8 5 2 4 1 3`

The same grids in the line form:
817264593562931478934587261123749856798156324645328917371495682256873149489612735
615284397349715268782639541196327854574968132823541679231476985458193726967852413

I have code for generation solutions grids of arbitrary size. Feel free to tell me about grids generation or code sharing.

Serg
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Serg,
I used the first grid and the bulk of the puzzles based on that are easy. An exception is shown below. If you look at the grid, it's possible to see repeating patterns (some shown), and the puzzle generator seems to 'feel' their presence.

HTH

Mike

P.S. I don't think you could make much use of my code, nor I of yours: I work exclusively in modern Fortran. Thanks anyway.
Code: Select all
` . . . . . . . . . . 6 2 . . 1 . . 8 . . 4 5 8 . 2 . 1 . . . . . 9 . 5 . . . 8 1 5 6 . . . . 4 . 3 . . . . . 3 . 1 . 9 5 . . . 2 . . 8 . . 1 4 . . . . . . . . . . 8 1 7 2 6 4 5 9 3    2,6 3,9 7,8 5 6 2 9 3 1 4 7 8       ditto 9 3 4 5 8 7 2 6 1       ditto 1 2 3 7 4 9 8 5 6    etc. 7 9 8 1 5 6 3 2 4 6 4 5 3 2 8 9 1 7 3 7 1 4 9 5 6 8 2 2 5 6 8 7 3 1 4 9 4 8 9 6 1 2 7 3 5`

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

### Re: Generating large grids

Hi, Mike!
I have some curious observation about repeating digit pairs described in your post.
m_b_metcalf wrote:... If you look at the grid, it's possible to see repeating patterns (some shown), and the puzzle generator seems to 'feel' their presence.
Code: Select all
` 8 1 7 2 6 4 5 9 3    2,6 3,9 7,8 5 6 2 9 3 1 4 7 8       ditto 9 3 4 5 8 7 2 6 1       ditto 1 2 3 7 4 9 8 5 6    etc. 7 9 8 1 5 6 3 2 4 6 4 5 3 2 8 9 1 7 3 7 1 4 9 5 6 8 2 2 5 6 8 7 3 1 4 9 4 8 9 6 1 2 7 3 5`

It's very curious (for me, at least), that every band (stack) in a 9x9 sudoku solution grid may be of 2 possible types only.

Type 1 ("pure matching" case)
Every minirow in a band (minicolumn in a stack) has two "brother" minirows in the same band, containing exactly the same set of digits. You can see an example in MC grid.

Type 2 ("half matching" case)
Every minirow in a band (minicolumn in a stack) has two "brother" minirows in the same band, containing the same pair of digits. All 9 minirows are separated into 3 sets of mutually "brother" minirows, connected by some pair of digits. Such minirows splitting into "brother" sets can be done by unique way. It can be easily proved mathematically. So, in this case 3 pairs of digits shares 3 minirows (each pair shares its own minirow set).

So, your observation above, that pairs {2,6}, {3,9} and {7,8} are repeated 3 times each in minirows of the band B123 implies that this band is just of very common ("half matching") type. It's absolutely normal and doesn't imply insufficiently random digit placement in a grid. To my observation, absolute majority of possible bands and stacks of filled solution grids are of type 2.

Other bands and stacks of this grid are of type 2 ("half matching") too. So overall grid is of "half matching" type. This is very common case.

There are filled grids with all bands and stacks of type 1 ("pure matching"), for example MC grid. In this case we can call such grid "pure matching". There are also grids of "mixed" type, with some bands of type 1 and other bands of type 2.

Maybe I am writing about well-known solution grids' property, but it is new for me.

It's interesting - how many are there essentially different solution grids of "pure matching" type (MC grid variations)?

Serg

[Edited. I corrected error statements and refined definitions.]
[Edited 2. I corrected typos and refined statements.]
Last edited by Serg on Tue Oct 10, 2017 9:22 am, edited 1 time in total.
Serg
2018 Supporter

Posts: 701
Joined: 01 June 2010
Location: Russia

### Re: Generating large grids

Serg,

Very astute points. Does this suggest another method of grid generation, by suitable generation of pairs?

Regards,

Mike

m_b_metcalf
2017 Supporter

Posts: 10762
Joined: 15 May 2006
Location: Berlin

Next