Sudoku Enumeration Problem

Everything about Sudoku that doesn't fit in one of the other sections

Sudoku Enumeration Problem

Postby Mathimagics » Thu Jan 04, 2018 9:40 am

From The Importance of Automorphic Puzzles
Code: Select all
 6,671,248,172,291,458,990,080  5,472,730,538 * 3,359,232 * 362,880
 6,670,903,752,021,072,936,960  exact number of Sudoku solution grids
 -----------------------------
       344,420,270,386,053,120  grid deficiency

                 5,472,730,538  exact number of unique Sudoku solution grids
                     3,359,232  possible geometric permutations of solution grid
                       362,880  number of ways to renumber a grid


Given row 1 = 123456789, then there are, I think, 150 possible settings for column 1 that produce canonical (minlex) solutions. The first setting is:
Code: Select all
 1 2 3 | 4 5 6 | 7 8 9
 4 . . | . . . | . . .
 5 . . | . . . | . . .
 ---------------------
 2 . . | . . . | . . .
 3 . . | . . . | . . .
 6 . . | . . . | . . .
 ---------------------
 7 . . | . . . | . . .
 8 . . | . . . | . . .
 9 . . | . . . | . . .


And the last (in lex order) is:
Code: Select all
 1 2 3 | 4 5 6 | 7 8 9
 8 . . | . . . | . . .
 9 . . | . . . | . . .
 ---------------------
 2 . . | . . . | . . .
 6 . . | . . . | . . .
 7 . . | . . . | . . .
 ---------------------
 3 . . | . . . | . . .
 4 . . | . . . | . . .
 5 . . | . . . | . . .


My question (and I sense the possibility of public humiliation here but what the heck!) is this - if I count all the solutions for each of these 150 grid patterns, shouldn't I get 5,472,730,538?

The first handful of col 1settings that I tested produced these numbers:

  • col 1 = 145237689, solutions = 744,603,884
  • col 1 = 145238679, solutions = 805,608,877
  • col 1 = 145239678, solutions = 872,751,263
  • col 1 = 145267389, solutions = 1,336,808,742
  • col 1 = 145268379, solutions = 1,358956,881

That's nearly 5 billion in just the first 5 grid patterns, so clearly there is something fundamentally wrong with my approach. WTF?? :?
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku Enumeration Problem

Postby David P Bird » Thu Jan 04, 2018 10:21 am

I think that what you are overlooking is that the solutions you find are not guaranteed to be canonical. If they were canonicalised, this would produce duplicates.
.
David P Bird
2010 Supporter
 
Posts: 1043
Joined: 16 September 2008
Location: Middle England

Re: Sudoku Enumeration Problem

Postby Mathimagics » Thu Jan 04, 2018 11:29 am

David, I suppose that merely specifying the first row and column to achieve a lex ordering does not preclude the existence of some transformation that has a lower lex value.

Can I use GridCheck (I have v1.15 which works on my CPU) to produce a minlex form?
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku Enumeration Problem

Postby champagne » Thu Jan 04, 2018 11:43 am

to reach the final count, you have to clear many redundancy cases, in your process stacks 2/3 perms would already clear many cases.
As far as I know, all ED solution generators start from the 416 ED possible band 1 having
123456789
45
as common start

But then, you still have to clear many cases.

The easiest way is to test morphs against a canonical form.

in the 17 clues sudoku search, part of the work is done with a code creating one canonical set of solution grids. (not the classical min lexical canonical form)

EDIT you can find the code here
champagne
2017 Supporter
 
Posts: 7455
Joined: 02 August 2007
Location: France Brittany

Re: Sudoku Enumeration Problem

Postby David P Bird » Thu Jan 04, 2018 12:14 pm

I mainly run GridChecker to check that the puzzles I attempt are valid, and have little appreciation of its other wierd and wonderful options. Dobrichev seems to be away from base at the moment - probably living it up somewhere warm -, but perhaps someone else can advise. If not, I suggest you post a query on the GridChecker thread.
David P Bird
2010 Supporter
 
Posts: 1043
Joined: 16 September 2008
Location: Middle England

Re: Sudoku Enumeration Problem

Postby Mathimagics » Thu Jan 04, 2018 1:09 pm

Like most good software tools, GridChecker help is obscure, and basic operation remains a mystery! 8-)

The thread is so long that it's hard to follow, too

But meanwhile I thought that you would be interested to know that while doing my enumeration above, I counted the number that have property P (ie different values in corresponding block positions), and it seems that 1 in 15 have this property.

So we have a rough estimate of the number of unique SudokuP grids, ie 5,472,730,538 / 15, roughly 365 million.

But this assumes that the geometrical transformations that work for Sudoku are all valid for SudokuP. This is not the case, eg swapping rows within bands is likely to disturb the position matching.
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku Enumeration Problem

Postby Serg » Thu Jan 04, 2018 2:15 pm

Hi, Mathimagics!
Mathimagics wrote:Like most good software tools, GridChecker help is obscure, and basic operation remains a mystery! 8-)
The thread is so long that it's hard to follow, too

GridChecker is multi-purpose tool, canonicalisation of grids/puzzles/patterns is just a little bit of its functionality. But these are 2 simple examples of GridChecker's usage.

1. GridChecker canonicalizes given grids/puzzles/patterns and writes canonicalized grids/puzzles/patterns to the output (order of output grids/puzzles/patterns corresponds to input order):

gridchecker --similar --subcanon <input_file >output_file

2. GridChecker canonicalizes given grids/puzzles/patterns, sorts canonicalized grids/puzzles/patterns (order of grids/puzzles/patterns may be changed), removes duplicates and writes resulting canonicalized grids/puzzles/patterns to the output. If you use GridChecker in this mode and get in the output file less grids/puzzles/patterns, than input file contains - you can be sure that input file contains several grids/puzzles/patterns being equivalent to each other. Here is an example of usage:

gridchecker --similar --subcanon --puzzles input_file --mspuzzles output_file

That's all you need to know about GridCheker. It is very fast, is capable of processing about million of the puzzles even in sorting mode. Reliable, very useful tool!

Mathimagics wrote:But meanwhile I thought that you would be interested to know that while doing my enumeration above, I counted the number that have property P (ie different values in corresponding block positions), and it seems that 1 in 15 have this property.

So we have a rough estimate of the number of unique SudokuP grids, ie 5,472,730,538 / 15, roughly 365 million.

It would be better to publish this message in the thread "Transversals in Sudoku Squares"...
Do you want to say that among 15 random grids one grid is SudokuP grid? I think SudokuP grids should be more rare than orthogonal grids. You reported than 1 grid only from 20000 random grids is orthogonal. Contradiction :?

Some words about this thread's theme. Your approach to enumerating essentially different sudoku solution grids is wrong. It looks like you understand it at the moment. If you need some clarification - feel free to post questions.

Serg
Serg
2018 Supporter
 
Posts: 890
Joined: 01 June 2010
Location: Russia

Re: Sudoku Enumeration Problem

Postby Mathimagics » Thu Jan 04, 2018 3:28 pm

Hi Serg,

Thanks very much for your "GridChecker for Dummies" guide! 8-)

I will definitely try it out.

serg wrote:Your approach to enumerating essentially different sudoku solution grids is wrong.

Amen to that! :?

serg wrote:Do you want to say that among 15 random grids one grid is SudokuP grid?

Yes, that's what I'm seeing in the enumeration I described above.
serg wrote:I think SudokuP grids should be more rare than orthogonal grids. You reported than 1 grid only from 20000 random grids is orthogonal. Contradiction :?

That was based on grids produced with my random Sudoku generator. Using the same method suggested 1 in 35000 are SudokuP, which may or may not support your conjecture.

I suspect my RSG is biased, based on the 1 in 15 SudokuP I'm seeing from the enumeration above. That would not surprise me greatly as it's basically a Pittenger/Jacobsen-Matthews hack.

I will need to check the orthogonality in the enumeration sample and see what that turns up.

Is there a more reliable source of certified random sudoku grid generation?

Cheers!
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku Enumeration Problem

Postby StrmCkr » Thu Jan 04, 2018 4:15 pm

Code: Select all
123|456|789
45.|...|...
...|...|...
-----------
2..|...|...
...|...|...
...|...|...
-----------
...|...|...
...|...|...
...|...|...


this is the starting position for generating every different possible Grid, in a NON random fashion.

Code: Select all
123|456|789
45*|***|***
***|...|...
-----------
2**|...|...
...|***|***
***|...|...
-----------
...|...|...
...|...|...
***|***|***


cycling all permutations of triples in each mini row on the "." positions is enough to generate every possible grid.

{remembering permutations counts per sub lvl of cycle is reduced by line of sight.}

all the * positions are filled in via Box line reduction and are singles,you don't need to generate these.

sequenced below. {approximately}
Code: Select all
-----------
...|...|...
...|...|...
...|111|222
-----------
...|333|444
777|...|...
...|555|666
-----------
121212|888|101010
131313|999|111111
...|...|...


hand built sequence followed from above... {not including the updated blr that my solver does, but the singles are pretty clear}
Code: Select all
.-----------.---------.---------.
| 1  2   3  | 4  5  6 | 7  8  9 |
| 4  5   68 | 8  9  7 | 2  3  1 |
| 9  7   8  | 3  1  2 | 6  4  5 |
:-----------+---------+---------:
| 2  13  1  | 5  6  4 | 8  9  7 |
| 8  9   7  | 1  2  3 | 5  6  4 |
| 5  6   46 | 9  7  8 | 3  1  2 |
:-----------+---------+---------:
| 7  8   9  | 2  3  1 | 4  5  6 |
| 6  4   5  | 7  8  9 | 1  2  3 |
| 3  13  12 | 6  4  5 | 9  7  8 |
'-----------'---------'---------'

Code: Select all
12345678945..........3126452..564897897.........978312789231456645789123.........'
Last edited by StrmCkr on Tue Jan 09, 2018 2:19 am, edited 6 times in total.
Some do, some teach, the rest look it up.
stormdoku
User avatar
StrmCkr
 
Posts: 1430
Joined: 05 September 2006

Re: Sudoku Enumeration Problem

Postby Serg » Thu Jan 04, 2018 4:22 pm

Hi, Mathimagics!
Mathimagics wrote:Is there a more reliable source of certified random sudoku grid generation?

I am not an expert in unbiased random sudoku grid generation. Years ago hard battles took place at this forum about unbiased grid/puzzle generation. You can find several threads concerning this question at this forum. (See, for example, Denis Berthier's book "Pattern-Based Constraint Satisfaction and Logic Puzzles" (Chapter 6), available by link at the page http://forum.enjoysudoku.com/pattern-based-constraint-satisfaction-and-logic-puzzles-t30774.html#p225832.) I suspect that well-known tools - GridChecker and "sudoku" tool by gsf have capability of (unbiased) grids/puzzles generation, but I don't known appropriate "key sequences". So, I use domestic random grid generator.

Serg
Serg
2018 Supporter
 
Posts: 890
Joined: 01 June 2010
Location: Russia

Re: Sudoku Enumeration Problem

Postby eleven » Thu Jan 04, 2018 6:32 pm

Mathimagics wrote:Is there a more reliable source of certified random sudoku grid generation?

Red Ed has published the code for an unbiased grid generator here. And i see, that you answered in this thread ...
For most purposes a primitive random grid generation is good enough (e.g. just add random numbers, check, if they still have a solution and if not take another one, until you get a unique solution grid).
The bias then can never explain a difference of 1:15 vs 1:35000.
eleven
 
Posts: 3151
Joined: 10 February 2008

Re: Sudoku Enumeration Problem

Postby Mathimagics » Fri Jan 05, 2018 10:23 pm

Thanks eleven for reminding me of that, I had totally forgotten! :?

Using Red Ed's generator seems to confirm that roughly 1 in 30,000 random grids have property P, and 1 in 20,000 have property O (orthogonal pair exists), so serg's conjecture seems to be confirmed.

I am looking into the cases where my enumeration process apparently produces 1 in 15 grids having property P. Something is going on there!
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku Enumeration Problem

Postby coloin » Mon Jan 08, 2018 2:12 pm

Mathimagics wrote:The first handful of col 1settings that I tested produced these numbers:

  • col 1 = 145237689, solutions = 744,603,884
  • col 1 = 145238679, solutions = 805,608,877
  • col 1 = 145239678, solutions = 872,751,263
  • col 1 = 145267389, solutions = 1,336,808,742
  • col 1 = 145268379, solutions = 1,358956,881

That's nearly 5 billion in just the first 5 grid patterns, so clearly there is something fundamentally wrong with my approach. WTF?? :?


A long time ago I was surprised that there were only 5 essentially different row and column combinations [not 150 !] - it looks like you found all 5
I found that 9 more clues are needed to complete to a puzzle [i could not find 8 clues]
Of course there are 81 of these combinations in each solution grid - so most solution grids should have at least one of these 5 - are your figures correct ?
coloin
 
Posts: 2494
Joined: 05 May 2005
Location: Devon

Re: Sudoku Enumeration Problem

Postby Mathimagics » Mon Jan 08, 2018 3:16 pm

Hi coloin,

My whole approach to counting SudokuP's by testing R1/C1 solutions turned out to be totally dodgy.

My figures above are dodgy also, eg: we now know that random Sudoku grids are SudokuP on average really is more like 1 in 33000, not the 1 in 15 I encountered above.

So my estimated number of SudokuP's is way out.

It took some time to work out how and why Felgenhauer & Jarvis band-based enumeration system works but I finally got it.

Counting band-based solutions rather than R1/C1 does, however, exhibit the same trend, namely that depending on the band being tested, the ratio of SudokuP vs Sudoku solutions for that band varies wildly, from (1 in 15) to (1 in 5 million) ...

My progress is now reported here
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku Enumeration Problem

Postby coloin » Mon Jan 08, 2018 3:52 pm

interestingly, here are the 5 different rowcols

the approx incidence in 2430 rowcols was
Code: Select all
000000001000000002000000003000000004000000005000000006000000007000000008123478569 - 80
000000001000000002000000003000000004000000005000000006000000007000000008124378569 -162
000000001000000002000000003000000004000000005000000006000000007000000008124578369 -980
000000001000000002000000003000000004000000005000000006000000007000000008127458369 -557
000000001000000002000000003000000004000000005000000006000000007000000008147258369 -651
                                                                                     
                                                                                  2430
coloin
 
Posts: 2494
Joined: 05 May 2005
Location: Devon

Next

Return to General