## SudokuP - Frequency Analysis

Everything about Sudoku that doesn't fit in one of the other sections

### SudokuP - Frequency Analysis

A SudokuP (aka P-Sudoku) is a normal Sudoku with an additional requirement, that each of the 9 P-sets has a different value in each cell.

A P-set is a set of 9 cells with the same relative Position within a 3x3 block. For example, this grid is SudokuP (the * cells indicate the P-set corresponding to the first position in each block):

Code: Select all
` *3 9 6 | *1 7 4 | *5 8 2  1 7 4 |  5 2 8 |  3 6 9  5 2 8 |  3 9 6 |  1 4 7 ------------------------ *9 6 3 | *7 4 1 | *2 5 8  4 1 7 |  8 5 2 |  6 9 3  8 5 2 |  6 3 9 |  4 7 1 ------------------------ *6 3 9 | *4 1 7 | *8 2 5  7 4 1 |  2 8 5 |  9 3 6  2 8 5 |  9 6 3 |  7 1 4`

Random Sudoku grid tests suggest that roughly 1 in 30,000 Sudoku grids have the P property.

But explicit enumeration of several sets of complete Sudoku solutions (with row 1 and column 1 fixed) shows a much higher frequency, more like 1 in 15! That's 3 orders of magnitude difference!

Code: Select all
` 1 2 3 4 5 6 7 8 9 4 . . . . . . . . 5 . . . . . . . . 2 . . . . . . . . 3 . . . . . . . . 7 . . . . . . . . 6 . . . . . . . . 8 . . . . . . . . 9 . . . . . . . .Mode = SudokuPSudP solutions =  50,861,142, solve time =    146sMode = SudokuSud  solutions = 744,603,884, SudP =  50,861,142, solve time = 2748s`

The solver is a standard DLX solver. In normal Sudoku mode it checks each solution and reports the number which have property P.

In SudokuP mode it uses DLX with the extra constraints to arrive at the same number of SudP solutions, but 20 times faster.

I have written the SudP solutions to a file and double-checked them, they all appear to be valid (and different).

The grid template is only biased towards SudokuP by virtue of the fact that R4C1 isn't 4, and R7C1 isn't 7, which hardly goes any way to explaining the anomaly.

WTF??
Last edited by Mathimagics on Mon Jan 08, 2018 3:22 pm, edited 3 times in total.

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Anomaly

Here is the result of another enumeration, this time with Band 1 fixed:

Code: Select all
` 1 2 3 4 5 6 7 8 9 4 5 6 7 8 9 1 2 3 7 8 9 1 2 3 4 5 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Mode = Std Sudoku05:04:34: Start Solve10:16:55: Solutions = 7,802,998,272 (SudP = 631,369,728)Solve time =  18742.761sMode = SudokuP04:31:27: Start Solve04:55:13: Solutions = 631,369,728Solve time =   1427.169s`

The total number of standard Sudoku solutions (divided by 72) matches the Felgenhauer/Jarvis figure. Again the SudP count is better than 1 in 15.

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Anomaly Explained?

Eureka!

I eventually unravelled the mysteries of the Felgenhauer & Jarvis enumeration system and made an astounding discovery - like any good rock album, it's all about the BAND!!

Code: Select all
` 1 2 3 | 4 5 6 | 7 8 9 4 5 6 | 7 8 9 | 1 2 3 7 8 9 | 1 2 3 | 4 5 6 Grids = 7,802,998,272    SudP = 631,369,728   ratio =       12.36`

Compare that ratio of normal grids to SudokuP grids with this slightly different band 1 setting:
Code: Select all
`  1 2 3 | 4 5 6 | 7 8 9 4 5 6 | 8 9 7 | 3 1 2 7 8 9 | 3 1 2 | 5 6 4 Grids = 6,831,977,472    SudP =      13,764   ratio =   496365.70`

That's a staggering difference!

Since the number of standard Sudoku grids is known for each of the 36,288 first band settings, and direct enumeration of SudokuP grids for a band can be done very efficiently with modified DLX (eg: the second band above only took 0.05s to enumerate), I should be able to come up with an exact value for the number of SudokuP grids within a reasonable time frame.

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

I wrote:I should be able to come up with an exact value for the number of SudokuP grids within a reasonable time frame

Well, using 3 parallel processes, I was able to obtain exact counts for each of 32,688 band representatives in a couple of hours. But each band representative actually corresponds to 72 different bands - for standard Sudoku these all have the same grid count so we just multiple by 72 to get the full grid count.

But for SudokuP this equivalence doesn't apply, so for an exact count we'd need to do all 72 * 32,688 = 2,612,736 band settings. That's feasible but would take a few days.

Using the 32,688 band results as a sample in any case does show a SudokuP frequency of 1 in 28,000, which agrees reasonably well with observed random grid results (1 in 33,000).

So I'm declaring this job effectively done ... until such time as we learn more about why the choice of fixed band has such wildly different outcomes.

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

You may try to compose grids by assembling 46656 possible digit templates after removal those having 2+ cells in the same box coordinates.
From the symmetries, all except the independent row/col swapping within a band/stack remain valid. This involves a factor of 36x36.
dobrichev
2016 Supporter

Posts: 1364
Joined: 24 May 2010

### Re: SudokuP - Frequency Analysis

dobrichev wrote:From the symmetries, all except the independent row/col swapping within a band/stack remain valid. This involves a factor of 36x36.

Yes, agreed and understood. But isn't that a factor of 6^6 = 46656?

dobrichev wrote:You may try to compose grids by assembling 46656 possible digit templates after removal those having 2+ cells in the same box coordinates.

Not sure what you mean, can you elaborate?

Are you suggesting an alternative method of enumeration, or a way to establish band equivalences?

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

A valid SudokuP grid is a union of 9 mutually disjoint "valid" single-digit templates.
Code: Select all
`........1 invalid template: r1c9 collides with r7c3...1......1............1..........1.1..........1............1......1.......1..... valid template (from the opening post)1..............1.......1....1...............1....1......1.............1.`

All templates count to 46656 and "valid" templates are a subset of them.

If we start from a grid in canonical form and want to test it for P-property, then we must expand it in 36x36 ways, namely
- swap r4,r5,r6 = 6 ways
- swap r8,r8,r9 = 6 ways
- swap c4,c5,c6 = 6 ways
- swap c7,c8,c9 = 6 ways
The rest transformations are preserving the P-property. (unless I am wrong, am I?)
dobrichev
2016 Supporter

Posts: 1364
Joined: 24 May 2010

### Re: SudokuP - Frequency Analysis

Ok,I think I have it.

A "digit template" is a transversal, ie a set of 9 cells (R, C) that hits every row, column and 3x3 block.

For SudokuP a valid template must also hit every cell position, and a SudokuP solution is a set of 9 disjoint valid templates.

dobrichev wrote:If we start from a grid in canonical form and want to test it for P-property, then we must expand it in 36x36 ways, namely
- swap r4,r5,r6 = 6 ways
- swap r8,r8,r9 = 6 ways
- swap c4,c5,c6 = 6 ways
- swap c7,c8,c9 = 6 ways

The rest transformations are preserving the P-property. (unless I am wrong, am I?)

That's quite right!

This is a solver method I have used before for regular Sudoku - I call it the transversal method.

I will need to explore this as an alternative SudokuP solving method. Thanks!

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

For blank grid there appear to be 8784 valid templates.

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

single digit template list in full there is
46656
different ways to fill in 1 digit on a blank grid.

0..80 cell listing

For blank grid there appear to be 8784 valid templates

this would be doing a Union between 9 different templates, so they all hold 1 Digit per Row/Col intersection {3x3} pattern, where the 9 digits do not repeat in these intersections.
Attachments
46656 templates.rar
46656 - template patters for 1 digit.
Some do, some teach, the rest look it up.

StrmCkr

Posts: 732
Joined: 05 September 2006

### Re: SudokuP - Frequency Analysis

It seems that the exact number of SudokuP (aka: Sudoku 4D, etc etc) grids is known, and was provided by Red Ed back in 2005 (see here).

The number given is 201,105,135,151,764,480 which makes the exact proportion of SudokuP to Sudoku equal to 1 in 33,173 (rounded). That correlates very closely with random grid sampling predictions (1 in 33,000).

For various reasons I think it's still a useful exercise to try and verify that result, so I will carry on ...

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

Yes ! The enumeration of all sudoku-P [ or sudoku DG] grids is tricky.

copied from the other thread .....

And yes - with a single row swap you can turn a sudoku-p compatible solution grid to an incompatible one

So maybe 1 in 33000 grids will come out as compatible on raw generation - and this is what Red Ed showed

But there are 6 row shops per band - If we say that we only need to swap 4 bands to keep box 1 canonical [band 2 and 3 and stack 2 and 3] - and I think dobrichev has already said this .... [ 6 ^4]

Lets ignore the 9! [3.6e5]
so 2e17/3.6e5 = 0.55 e12 = 5.5e11 "different" grids with box 1 canonical

5e9 ed grids - all with Box 1 canonical
6^4 = 1.2 e3
1200 isomorphs per ed grid = 6e12 grids with box 1 canonical

There are 5.5e11 sudoku-p grids with box 1 canonical

most grids wont have a sudoku-p grid
depending on the distribution , and some grids have more than one - this could explain the 1 in 15
coloin

Posts: 1662
Joined: 05 May 2005

### Re: SudokuP - Frequency Analysis

Ok, using dobrichev's suggestions above, and Red Ed Russel's figures for confirmation, we have reduced the SudokuP enumeration to 11 distinct sub-problems, as follows:

• we fix B1 = 123/456/789, then consider all digit templates for placement of 1's in the remaining blocks
• we need only consider templates that have a 1 in R2 of B2 (which forces R3 for B3), and a 1 in C2 of B4 (forcing C3 for B7). The other 3 possibilities are equivalent problems.

• This reduces the main problem by a factor of 4. and leaves us with 244 different templates, ie 244 sub-problems.
• These can be reduced to 11 by identifying P-isomorphisms, ie: isomorphisms using the restricted set of 2592 (3,359,232 / 36^2) different transformations that preserve property P). There are 11 equivalence classes.

The class counts and # of templates for each class are shown below, along with a representative template. Each is a distinct sub-problem whose solutions need to be counted.

Code: Select all
`Class #1 (size 1)=================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . 1 . . . . . . . . . . . 1 . . . 1 . . . . . . . . . . . 1 . . . . . . . . . . . 1`

Code: Select all
`Class 2 (size 18)================= 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . 1 . . . . . . . . . . . 1 . . . 1 . . . . . . . . . . . . . . 1 . . . . . 1 . . .`

Classes 3 to 11:
Hidden Text: Show
Code: Select all
`Class 3 (size 12)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . 1 . . . . . . . . . . . 1 . . . . . . . . . 1 . . 1 . . . . . . . . . . . 1 . . .Class 4 (size 36)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . 1 . . . . . . . . . . . . 1 . . . . . 1 . . . . . 1 . . . . . . . . . . . . . 1 .Class 5 (size 36)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . . . . 1 . . . . . 1 . . . . . . . . . 1 . . . . . 1 . . . . . . . . . . . . . . 1Class 6 (size 72)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . . . . . 1 . . . . 1 . . . . . . . . . 1 . . . . . . . . . . 1 . . . 1 . . . . . .Class 7 (size 9)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . 1 . . . . . . . . . . . . . . . 1 . . . . . 1 . . . . . 1 . . . . . . . . . . . . . 1 . . . . . 1 . . . .Class 8 (size 4)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . . . . . . . 1 . . 1 . . . . . . . . . . . 1 . . . . . . . . . 1 . . . . . . . . . . . 1 . . 1 . . . . . .Class 9 (size 36)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . 1 . . . . . . . . . . 1 . 1 . . . . . . . . . . . 1 . . . . . . . . . . . 1 . . . 1 . . . . . . . . . . . 1 . . .Class 10 (size 18)================== 1 2 3 . . . . . . 4 5 6 1 . . . . . 7 8 9 . . . . 1 . . . . . . 1 . . . . 1 . . . . . . . . . . . . . 1 . . . . . . 1 . . . . . . . . . . . . 1 . . 1 . . . . . .Class 11 (size 2)================== 1 2 3 . . . . . . 4 5 6 . 1 . . . . 7 8 9 . . . . . 1 . . . . . 1 . . . . . . . . . 1 . . . 1 . . . . . . . . . . . . . . 1 . . . 1 . . . . . . . . . 1 . . . . .`

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

Ok, job done!

Table of Class #, NE = number of equivalent classes, NG = number of grids, total = NE * NG:

Code: Select all
`   #   NE      SudokuP        Total SudokuP  ------------------------------------------   1    1    969,349,902         969,349,902   2   18    728,290,886      13,109,235,948   3   12    655,641,170       7,867,694,040   4   36    606,126,844      21,820,566,384   5   36    595,526,742      21,438,962,712   6   72    517,948,382      37,292,283,504   7    9    598,102,466       5,382,922,194   8    4    490,762,646       1,963,050,584   9   36    547,152,562      19,697,492,232  10   18    458,804,480       8,258,480,640  11    2    373,961,017         747,922,034  ------------------------------------------      244  6,541,667,097     138,547,960,174`

And 138,547,960,174 x 4 x 362,880 = 201,105,135,151,764,480
Last edited by Mathimagics on Wed Jan 10, 2018 9:23 am, edited 2 times in total.

Mathimagics
2017 Supporter

Posts: 447
Joined: 27 May 2015

### Re: SudokuP - Frequency Analysis

Hi, Mathimagics!
Mathimagics wrote:Ok, job done!

Code: Select all
`   #   NE      SudokuP        Total SudokuP  ------------------------------------------   1    1    969,349,902         969,349,902   2   18    728,290,886      13,109,235,948   3   12    655,641,170       7,867,694,040   4   36    606,126,844      21,820,566,384   5   36    595,526,742      21,438,962,712   6   72    517,948,382      37,292,283,504   7    9    598,102,466       5,382,922,194   8    4    490,762,646       1,963,050,584   9   36    547,152,562      19,697,492,232  10   18    458,804,480       8,258,480,640  11    2    373,961,017         747,922,034  ------------------------------------------      244  6,541,667,097     138,547,960,174`

And 138,547,960,174 x 4 x 362,880 = 201,105,135,151,764,480

Congratulations! Well (and very fast) done!
How much CPU time did you use?

Serg
Serg
2017 Supporter

Posts: 545
Joined: 01 June 2010
Location: Russia

Next