Sudoku: Given Isomorphisms.
Hi folks! Some investigation, and a couple of questions:
So, I have been a busy little bee, and have been trying to work out how many isomorphisms there are for a certain number of givens (pre-existing placements) – that is, initial configurations which are the same, modulo row, col, stack, band permutations, and reflection. This is just the given locations, not the contents of those cells.
This particular computational problem is not unlike that of calculating unavoidable sets.
I get the following numbers - first for a "trivial" 4x4 board, and then for a proper sudoku board.
Order2 board ( 4×4 cells):
There are 1 isomorphisms with 0 givens, order2
There are 1 isomorphisms with 1 givens, order2
There are 5 isomorphisms with 2 givens, order2
There are 10 isomorphisms with 3 givens, order2
There are 33 isomorphisms with 4 givens, order2
There are 53 isomorphisms with 5 givens, order2
There are 101 isomorphisms with 6 givens, order2
There are 122 isomorphisms with 7 givens, order2
There are 153 isomorphisms with 8 givens, order2
There are 122 isomorphisms with 9 givens, order2
There are 101 isomorphisms with 10 givens, order2
There are 53 isomorphisms with 11 givens, order2
There are 33 isomorphisms with 12 givens, order2
There are 10 isomorphisms with 13 givens, order2
There are 5 isomorphisms with 14 givens, order2
There are 1 isomorphisms with 15 givens, order2
There are 1 isomorphisms with 16 givens, order2
Order3 board ( 9×9 cells, “standard” sudoku board):
There are 1 isomorphisms with 0 givens, order3
There are 1 isomorphisms with 1 givens, order3
There are 5 isomorphisms with 2 givens, order3
There are 21 isomorphisms with 3 givens, order3
There are 109 isomorphisms with 4 givens, order3
There are 548 isomorphisms with 5 givens, order3
There are 3074 isomorphisms with 6 givens, order3
There are 16847 isomorphisms with 7 givens, order3
There are 92393 isomorphisms with 8 givens, order3
Now. That took me 12 CPU core hours. The magic number I want to get to with order3 boards is of course 17, the lowest number of givens in any standard sudoku puzzle. All of a sudden this looks like it will take some serious CPU grunt. I’m not exactly sure, but for small numbers of givens (less than 1/4 the cells on the board), it looks like adding one extra placement multiplies the number of isomorphisms by a factor of roughly 5.5.
With respect to computational cost, determining isomorphisms is reasonably fast (at least 200k per CPU core second on average), but since there’s no easy way of “sorting” that I know about, there’s no “log N” shortcut, and determining uniqueness requires an N^2 comparison cost, so the computational cost is proportional to the square of the number of isomorphisms.
Plugging these numbers into a spreadsheet, assuming (conservatively) 4 CPU cores per machine (your average laptop), I can then calculate the costs in thousand machine years:
Givens CPU Hours Thousand machine years
0 1.35899570417687E-09 3.87841239776503E-17
1 1.35899570417687E-09 3.87841239776503E-17
2 3.39748926044217E-08 9.69603099441258E-16
3 5.99317105541998E-07 1.71037986741438E-14
4 1.61462279613254E-05 4.60794176978463E-13
5 0.000408111845947 1.16470275669843E-11
6 0.012841797290722 3.66489648707831E-10
7 0.385712075584426 1.10077647141674E-08
8 11.601021233041 3.31079373089068E-07
9 350.930892299489 1.00151510359443E-05
10 10615.6594920595 0.000302958318837
11 321123.699634801 0.009164489144829
12 9713991.91395273 0.277225796631071
13 293848255.39707 8.3860803480899
14 8888909725.76137 253.67893052972
15 268889519204.281 7673.78764852402
16 8133907955929.51 232132.076367851
17 246050715666868 7021995.31012751
Now, as it turns out, distributed computing is my sort of thing, so I reckon getting up to about 12 givens (277 machine years) is pretty doable, but 17 looks somewhat out of the question, unless I can find a way to answer these two questions:
Currently I compare all boards against all other boards, with a couple of “tricks” for rapid comparison that removes the need to consider permutations in most cases. I could possibly re-apply some of those tricks to reduce the cost from N^2, but I’m not sure it’ll give anything more than a constant cost improvement. Probably won’t reduce the cost to N log N. (Details available after some discussion).
My math assumes a constant 5.5 increase for each extra given. Do we have any reasons to assume that that that number might decrease above 9 or 10 givens?