Sudoku: Given Isomorphisms.

Programs which generate, solve, and analyze Sudoku puzzles

Sudoku: Given Isomorphisms.

Postby MartinCHarvey » Sun Feb 28, 2021 6:44 pm

Sudoku: Given Isomorphisms.

Hi folks! Some investigation, and a couple of questions:

So, I have been a busy little bee, and have been trying to work out how many isomorphisms there are for a certain number of givens (pre-existing placements) – that is, initial configurations which are the same, modulo row, col, stack, band permutations, and reflection. This is just the given locations, not the contents of those cells.

This particular computational problem is not unlike that of calculating unavoidable sets.

I get the following numbers - first for a "trivial" 4x4 board, and then for a proper sudoku board.

Order2 board ( 4×4 cells):

There are 1 isomorphisms with 0 givens, order2
There are 1 isomorphisms with 1 givens, order2
There are 5 isomorphisms with 2 givens, order2
There are 10 isomorphisms with 3 givens, order2
There are 33 isomorphisms with 4 givens, order2
There are 53 isomorphisms with 5 givens, order2
There are 101 isomorphisms with 6 givens, order2
There are 122 isomorphisms with 7 givens, order2
There are 153 isomorphisms with 8 givens, order2
There are 122 isomorphisms with 9 givens, order2
There are 101 isomorphisms with 10 givens, order2
There are 53 isomorphisms with 11 givens, order2
There are 33 isomorphisms with 12 givens, order2
There are 10 isomorphisms with 13 givens, order2
There are 5 isomorphisms with 14 givens, order2
There are 1 isomorphisms with 15 givens, order2
There are 1 isomorphisms with 16 givens, order2

Order3 board ( 9×9 cells, “standard” sudoku board):

There are 1 isomorphisms with 0 givens, order3
There are 1 isomorphisms with 1 givens, order3
There are 5 isomorphisms with 2 givens, order3
There are 21 isomorphisms with 3 givens, order3
There are 109 isomorphisms with 4 givens, order3
There are 548 isomorphisms with 5 givens, order3
There are 3074 isomorphisms with 6 givens, order3
There are 16847 isomorphisms with 7 givens, order3
There are 92393 isomorphisms with 8 givens, order3

Now. That took me 12 CPU core hours. The magic number I want to get to with order3 boards is of course 17, the lowest number of givens in any standard sudoku puzzle. All of a sudden this looks like it will take some serious CPU grunt. I’m not exactly sure, but for small numbers of givens (less than 1/4 the cells on the board), it looks like adding one extra placement multiplies the number of isomorphisms by a factor of roughly 5.5.

With respect to computational cost, determining isomorphisms is reasonably fast (at least 200k per CPU core second on average), but since there’s no easy way of “sorting” that I know about, there’s no “log N” shortcut, and determining uniqueness requires an N^2 comparison cost, so the computational cost is proportional to the square of the number of isomorphisms.

Plugging these numbers into a spreadsheet, assuming (conservatively) 4 CPU cores per machine (your average laptop), I can then calculate the costs in thousand machine years:

Givens CPU Hours Thousand machine years
0 1.35899570417687E-09 3.87841239776503E-17
1 1.35899570417687E-09 3.87841239776503E-17
2 3.39748926044217E-08 9.69603099441258E-16
3 5.99317105541998E-07 1.71037986741438E-14
4 1.61462279613254E-05 4.60794176978463E-13
5 0.000408111845947 1.16470275669843E-11
6 0.012841797290722 3.66489648707831E-10
7 0.385712075584426 1.10077647141674E-08
8 11.601021233041 3.31079373089068E-07
9 350.930892299489 1.00151510359443E-05
10 10615.6594920595 0.000302958318837
11 321123.699634801 0.009164489144829
12 9713991.91395273 0.277225796631071
13 293848255.39707 8.3860803480899
14 8888909725.76137 253.67893052972
15 268889519204.281 7673.78764852402
16 8133907955929.51 232132.076367851
17 246050715666868 7021995.31012751

Now, as it turns out, distributed computing is my sort of thing, so I reckon getting up to about 12 givens (277 machine years) is pretty doable, but 17 looks somewhat out of the question, unless I can find a way to answer these two questions:

Currently I compare all boards against all other boards, with a couple of “tricks” for rapid comparison that removes the need to consider permutations in most cases. I could possibly re-apply some of those tricks to reduce the cost from N^2, but I’m not sure it’ll give anything more than a constant cost improvement. Probably won’t reduce the cost to N log N. (Details available after some discussion).
My math assumes a constant 5.5 increase for each extra given. Do we have any reasons to assume that that that number might decrease above 9 or 10 givens?
MartinCHarvey
 
Posts: 15
Joined: 22 February 2021

Re: Sudoku: Given Isomorphisms.

Postby Mathimagics » Mon Mar 01, 2021 7:35 am

Hello Martin!

If I understand correctly, you are looking to count the number of essentially different (ED) patterns for each number of givens. Two grids (puzzles, patterns) are ED if no isomorphism exists under the standard transformations (which you list).

A pattern is usually written in "dot-X" form, where the "X" denote givens. For example:

Code: Select all
..X...X...........X...X....................................X...X...........X...X.


You indicated that you have a problem with sorting patterns?

This problem goes away, I think, using canonical forms. Efficient algorithms exist for getting the minlex CF for patterns (puzzles, grids). This gives you both a comparison function (is pattern A isomorphic to pattern B?) and a basis for sorting your table of known patterns.

Using this function, and brute force, I enumerated all 3-given patterns, sorted the results, removed duplicates, and got 21 ED patterns (confirming your result):

3-given patterns: Show
Code: Select all
..............................................................................XXX
.............................................................................X.XX
..........................................................................X..X..X
.......................................................................X.......XX
.......................................................................X......XX.
.......................................................................X.....X..X
.......................................................................X.....X.X.
.......................................................................X....XX...
.......................................................................X..X..X...
..............................................................X........X.....X...
..............................................................X.......X.......X..
..............................................................X.......X......X...
..............................................................X.....X.....X......
.....................................................X.......................X..X
.....................................................X.......................X.X.
.....................................................X......................XX...
.....................................................X....................X..X...
.....................................................X................X......X...
.....................................................X..............X.......X....
.....................................................X..............X.....X......
..........................X.......................X.......................X......


Now patterns aren't my area of expertise, but others here have done extensive work with them, and can probably tell us whether any work has been done previously on counting the (order 3) patterns.
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku: Given Isomorphisms.

Postby MartinCHarvey » Mon Mar 01, 2021 8:52 pm

Hi!

Yes, you understand correctly, I'm looking for essentially different patterns.

Yes, I think canonical forms will do the trick. Given the representation you suggest, this appears to be essentially "cells right-most" - i.e. Sort by band, then stack, then row, then col, such that the cells or in the "lowest" or "highest" possible position they will be.

For the purposes of fast comparison, whilst I'm doing that it's also easy to calculate what I shall call a "signature", which is a set of cellcounts for the bands / stacks / rows / cols. This then makes the initial comparison just a 27 bit compare, which very rapidly eliminates most of the cases.

Reversing a bit to the problem I originally want to solve (17 given sudokus), it's clear that looking for isomorphisms is probably the hard way. The (minimal) set of all grids differs only by number permutation, and the set of all essentially different (17 given) configurations differs only by transposition, but the product of those two sets does not contain all possible 17-clue sudokus. So, I have the problem the wrong way round.

I'm guessing I would be better off building a copy of checker into a distributed app, and then running it on all possible solution grids. I notice it's all fancily SSE/AVX optimised, and distributed infrastructure is my kind of thing.

I assume it's trivial to get checker to find all 17 clue puzzles in any given grid?

MH.
MartinCHarvey
 
Posts: 15
Joined: 22 February 2021

Re: Sudoku: Given Isomorphisms.

Postby Mathimagics » Tue Mar 02, 2021 7:08 am

Reversing a bit to the problem I originally want to solve (17 given sudokus), it's clear that looking for isomorphisms is probably the hard way.

Ok, let's talk about that then! 8-)

I'm guessing I would be better off building a copy of checker into a distributed app, and then running it on all possible solution grids. I notice it's all fancily SSE/AVX optimised, and distributed infrastructure is my kind of thing.

I assume it's trivial to get checker to find all 17 clue puzzles in any given grid?

I assume that checker refers to the original program used by Red Ed Russell in the project that established "There is No 16-clue Sudoku"?

Sadly, it is by no means trivial to use the same approach to find all the 17-clue puzzles. The methodology of checker is "hitting sets + UA's" (hence the SSE/AVX bitmap stuff). The cost per grid rises dramatically as the clue-count is raised. So what took say, 100 core-years, for the 16-clue case, would take X times 100 years. Not exactly sure what the factor X is here, but I suspect that it's at least 10 ... :(

You will pleased to know the following, however:

  • an alternate methodology has been developed which can solve the 17-clue problem in time roughly X = 2. This methodology is different, instead of hitting-sets, it uses patterns !!
  • there is a co-ordinated, ongoing search project to complete the 17-clue search. This is roughly 50% complete. The project manager is user champagne. Currently the only other contributor (cores) is myself.

The relevant thread can be found HERE.

The main point here is that, given the current core contribution levels (8 for champagne, 16 for myself), we are perhaps 4 years away from completion (a rough estimate), and so we are very, very interested in further assistance, to reduce that time. ;)
Last edited by Mathimagics on Tue Mar 02, 2021 10:16 am, edited 2 times in total.
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku: Given Isomorphisms.

Postby JPF » Tue Mar 02, 2021 8:24 am

Mathimagics wrote:Hello Martin!
If I understand correctly, you are looking to count the number of essentially different (ED) patterns for each number of givens. Two grids (puzzles, patterns) are ED if no isomorphism exists under the standard transformations (which you list).
...
Now patterns aren't my area of expertise, but others here have done extensive work with them, and can probably tell us whether any work has been done previously on counting the (order 3) patterns.

Here is an answer to your question:
http://forum.enjoysudoku.com/number-of-non-equivalent-patterns-having-n-clues-t30540.html

There are 21 3-digit ED-patterns and... 55,113,078,988 17-digit ED-patterns.

JPF
JPF
2017 Supporter
 
Posts: 6127
Joined: 06 December 2005
Location: Paris, France

Re: Sudoku: Given Isomorphisms.

Postby MartinCHarvey » Tue Mar 02, 2021 7:20 pm

Mathimagics wrote:
I assume that checker refers to the original program used by Red Ed Russell in the project that established "There is No 16-clue Sudoku"?


Yes, that was the one, but as you say, no use to us.

The main point here is that, given the current core contribution levels (8 for champagne, 16 for myself), we are perhaps 4 years away from completion (a rough estimate), and so we are very, very interested in further assistance, to reduce that time. ;)


Well, if you're very confident in the code, and it requires minimal / none manual intervention, and you need a lot more CPU grunt, then Distributed.net are always looking for new projects. If all you need is 100 odd CPU years, they could probably finish it as an alpha test. Someone would need to write the client code for them tho.

I'll rewrite my code with canonical forms, and run it for a few days, because I'm interested in how many isomorphisms I can enumerate, but after that, yes I could probably donate another four cores or so to your search.
MartinCHarvey
 
Posts: 15
Joined: 22 February 2021

Re: Sudoku: Given Isomorphisms.

Postby MartinCHarvey » Tue Mar 02, 2021 7:25 pm

JPF wrote:Here is an answer to your question:
http://forum.enjoysudoku.com/number-of-non-equivalent-patterns-having-n-clues-t30540.html

There are 21 3-digit ED-patterns and... 55,113,078,988 17-digit ED-patterns.

JPF


Thank you - my discrete maths is a little rusty (The MA was 20 years ago, and I transferred to CompSci) - so I'll read up on conjugacy classes. I'm assuming no-one's actually come close to enumerating all those patterns!

MH.
MartinCHarvey
 
Posts: 15
Joined: 22 February 2021

Re: Sudoku: Given Isomorphisms.

Postby JPF » Tue Mar 02, 2021 9:39 pm

Mathimagics wrote:I assume that checker refers to the original program used by Red Ed Russell in the project that established "There is No 16-clue Sudoku"?

actually the study was made by Gary McGuire* et al., see here.

JPF

* Moschopulus on this forum
JPF
2017 Supporter
 
Posts: 6127
Joined: 06 December 2005
Location: Paris, France

Re: Sudoku: Given Isomorphisms.

Postby MartinCHarvey » Wed Mar 10, 2021 12:39 am

Mathimagics wrote:Hello Martin!
This problem goes away, I think, using canonical forms.


Many thanks for you help - I have found a nice way to compute a canonical form which is not minlex, but is very quick and easy to compute, and effectively detects isomorphisms. Once I can detect isomorphisms, putting them in minlex order is trivial: whenever I find two grids of the same isomorphism, I just pick the one with the lowest representation.

MH.
MartinCHarvey
 
Posts: 15
Joined: 22 February 2021

Re: Sudoku: Given Isomorphisms.

Postby Mathimagics » Wed Mar 10, 2021 4:53 am

JPF wrote:actually the study was made by Gary McGuire* et al.


Hi JPF!

Oops, I have made that same mistake before, I am sure! :?

MartinCHarvey wrote:Many thanks for you help - I have found a nice way to compute a canonical form which is not minlex, but is very quick and easy to compute, and effectively detects isomorphisms.


You're welcome!

Can you describe your canonical form algorithm?

The current gold standard for obtaining minlex grid canonical forms clocks in at roughly 750,000 grids/second (4.2Ghz CPU, method = "optimised Deverin") ...

Cheers
MM
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku: Given Isomorphisms.

Postby MartinCHarvey » Wed Mar 10, 2021 10:56 pm

Mathimagics wrote:You're welcome!

Can you describe your canonical form algorithm?



Yes, it's quite simple.
- For each row / col / band / stack count the filled cells in that item.
- Sort the counts appropriately high to low.
- Find the permutations that give you those ordered counts.
- (Remember to remove duplicates that swap empty or full rows etc).
- Use the combinations to generate a lex form for each (and the reflection) - typically there aren't many.
- Pick the smallest.

It always spots isomorphisms because the results of the sorting are the same for isomorphic grids, and you end up checking permutations where two pairs of cells are swapped row/col.


The current gold standard for obtaining minlex grid canonical forms clocks in at roughly 750,000 grids/second (4.2Ghz CPU, method = "optimised Deverin") ...

Cheers
MM


I haven't optimised it yet, it's prob not *that* quick at the moment.

Now that I've done all that, the algo that would seem to give a minlex form that comes to mind is:

1. Take each block (2x2), (3x3)
2. Set all possible permutations of row/col stack/block as possible (set).
3. Find perm combinations which give minlex for each block (rows / cols).
(rows sorted by cellcount, col perms set to shift bits left in bottom row,
then next row, etc - subject to constraints.
4. Put that block highest, and constrain permutations appropriately.
4a) If ambigous which block, need to consider all possible blocks you could put
highest, so you might need to recurse and backtrack...
5. Repeat procedure for remaining blocks, subject to sets of row/col perms still
allowed .... And subject to block orderings still allowed.

but I have a day job, and have not got round to coding that yet!
MartinCHarvey
 
Posts: 15
Joined: 22 February 2021

Re: Sudoku: Given Isomorphisms.

Postby Mathimagics » Thu Mar 11, 2021 4:59 am

I wrote:The current gold standard for obtaining minlex grid canonical forms clocks in at roughly 750,000 grids/second (4.2Ghz CPU, method = "optimised Deverin") ...

This refers to CF's for solution (complete) grids, so is totally irrelevant to what you are doing! :roll:

Sorry about that! For puzzles (and patterns), the CF algorithms are quite different, and I don't have any benchmark figures for these :?

Anyway, you seem to be having fun rolling your own code, so carry on! 8-)
User avatar
Mathimagics
2017 Supporter
 
Posts: 1926
Joined: 27 May 2015
Location: Canberra

Re: Sudoku: Given Isomorphisms.

Postby dobrichev » Thu Mar 11, 2021 10:48 am

Let mention here that nothing is done on pencilmarks canonicalization :)
dobrichev
2016 Supporter
 
Posts: 1850
Joined: 24 May 2010


Return to Software