The New Sudoku Players' Forum

by **m_b_metcalf** » Fri Mar 23, 2007 2:01 am

I'm leaving for a two-week holiday tomorrow, so thought to leave some questions I've wondered about and that the mathematicians among you might have an opinion about (like: "He's nuts!").

Imagine a hyperspace of 81 dimensions. Each axis extends from 0 to 9. Take any minimal, valid sudoku puzzle in its canonical form. It can be represented as a point in this sudoku space.

I have programs that fiddle around with puzzles: changing values, deleting cells, binding in a new cell so that the result becomes minimal. When I observe these programs working, I notice that some puzzles can be readily and quickly transformed into a multitude of others. These new variations (once isomorphs have been deleted), are neighbours of the original in sudoku space. Other puzzles are difficult, if not impossible, to vary, and those with few variations often take long to process in this fashion. Such puzzles appear isolated in sudoku space.

Now to the questions. Our universe consists of galaxies, clusters of galaxies etc. Does the set of valid sudoku puzzles also form shapes, hyperplanes or branes in sudoku space? If so, could these be identified using, say, cluster analysis or principal-components analysis? And how, if at all, might that be linked to 'difficulty'?

Any comments are most welcome.

Regards,

Mike Metcalf

by **JPF** » Fri Apr 13, 2007 3:07 pm

m_b_metcalf wrote:
coloin wrote:It is an intriqing concept "the inner and outer regions of "sudoku space"" !
Yes, but nobody seemed to take it up (or even rubbish it). I'd be perfectly happy to share the Fields Medal [edit] with a collaborator

So let's go for the Fields Medal.

coloin wrote:Im not particularly good at these rare patterns to order........the best I got was 64 solutions !
Code: Select all
5...9.....1.8.6.....3...5...8.....4.9.......3.5.....9...6...1.....3.2.7.....7...9
Are there really no more unique puzzles with this pattern ?

Here are 3 stars of the sudoku space.
They are part of the valid galaxy.
In this galaxy, they are in the same set (same pattern)

Star #1

Code: Select all: 1 . . . 2 . . . . . 3 . 4 . 1 . . . . . 5 . . . 6 . . . 1 . . . . . 2 . 2 . . . . . . . 7 . 4 . . . . . 5 . . . 6 . . . 1 . . . . . 7 . 8 . 9 . . . . . 9 . . . 2

Star #2

Code: Select all: 1 . . . 2 . . . . . 3 . 4 . 5 . . . . . 5 . . . 6 . . . 2 . . . . . 4 . 6 . . . . . . . 7 . 1 . . . . . 3 . . . 7 . . . 2 . . . . . 8 . 6 . 5 . . . . . 1 . . . 3

Star #3

Code: Select all: 1 . . . 2 . . . . . 3 . 4 . 5 . . . . . 6 . . . 7 . . . 7 . . . . . 3 . 5 . . . . . . . 2 . 8 . . . . . 5 . . . 2 . . . 6 . . . . . 3 . 7 . 9 . . . . . 6 . . . 1

What are the distances between the 3 stars ?
Which is the closest to #1 ? etc...

JPF

by **m_b_metcalf** » Sat Apr 14, 2007 4:08 pm

JPF wrote:What are the distances between the 3 stars ?
Which is the closest to #1 ? etc...

JPF

Hypothesis: first you must put them into canonical form. Then count the number of differences. That is the distance between them (the values themselves, being only symbols, having no numereical relevance).

Regards,

Mike Metcalf

by **coloin** » Thu Apr 19, 2007 12:25 am

Indeed the analogy with space is quite a good one, and the ideas of many dimensions needed to visualize a point representing a puzzle is a fine concept ! I have been ruminating on this for a bit - and waiting for others to comment !

Many have commented on puzzles which are in the same "region" within a specific grid - eg in the search for 17 clue puzzles they only were found in one specific "region" within a grid.
I have often wondered what a region was......
Is it puzzles which share the same clues ?
Laterly I surmised that it was puzzles which share many non-clues [empty cells]
And most probably it is a combination of the two.....

However when we consider "sudoku space" it is slightly more reassuring that the number of minimal puzzles is finite [although at present incalcuable!][and thankfully not expanding]!

So how do we theoretically group our [minimal] puzzles ?

Number of given clues
Puzzle pattern
Puzzles from one essentially different grid
Puzzles from a specific grid
Isomorphic puzzles *

[* I dont think you can ignore this fact by only considering canonicalized puzzles - Two completly "distant" puzzles in sudoku space could have an isomorphic puzzle next door to one another !]

Now when comparing two random puzzles if they both have 27 clues each, with 54 empty spaces each,on average 9 of these clues will coincide, 18 clues from each puzzle will not coincide and 36 empty spaces or holes will coincide. Perhaps only one 1 of these 9 given clues will have the same clue number !

I generated 16 random 27 clue puzzles and compared pairs [manually] and this is probably true !

Code: Select all: 5..2....4.4....972..7...5.......2..34.135.86....6.....1.3.4..2.9....5..88.....7.. same cell identical clue .7.6..4.....7....2..3.9817..16.7.5.3..5.......8.56..1.6...542.88..2.............. 10 2 ..8........9..3.....671829.5........6....98.1.94.....7.234..1.58.....3.61....7... 12 3 ..7.........6.3...1.8..52.6.....28...5.3.7.4.......5176.3......2..93...57..2.49.. 59....4..8.64.5..1...6.....4...2195...8...1.3...8...2..8.........2.1...73...8.2.5 10 1 ...43.5...9.1.57...6..9..4..76...8..2.986.......5..3....2..4..7.37......9...7.6.. .7.4..53......9..63..1....969.........3.7.9....2..5..3..4.263..7...1..9...8....45 7 1 .3.....18.6214.3....8.3.4.....692.3..59.1......6.....2....2..6.........5.7.5.12.. .......6.......5.35.3.1..4.39..4...617..9..5..458....9.....4...9...2.....523.71.. 9 0 1....6........3.1..4...2.59.2.81...3.1...928.6........26......1....3..97.7..9.5.8 ...1.268.1...3742....6...5...2......6.8.....7..496......62..53.8...96.7.3........ 10 2 .....4..2..8......27.68..4...612..9.1..4....83.....16....9...7.8.3.....976..5..2. ..5..67...169...5..8..51....3...24....4.....6.....3.12.....4.9.15.87....9.....13. 7 1 .1....9.47..32.......4.....5..2............26..3.951..8.6...4.....96.2.525.7.46.. ........81....5..7.4.716.9...457...92....4....7..9.4...2....95.....6..71..31.9... 8 1 ..1..69...5..7..3...2...4562.6...1.8.1.46........9......7......1..7...24.63..85..

But this small experiment hides the fact that you have 10^12 equivalent permutations of each individual puzzle so that there is a high likelihood that there will be a much greater coincidence of clues in any two random puzzles .

I wonder what the extent of this would be - Possibly at least 8 clues could be mapped exactly !

I note JPF has found valid puzzles of our pattern - canonicalized - on average there are 6 common clues..... it might be possible to get even better convergence by iiterating through the the 9! possibilitities times the 2^3 ? symmetries.

m_b_metcalf wrote:Now to the questions. Our universe consists of galaxies, clusters of galaxies etc. Does the set of valid sudoku puzzles also form shapes, hyperplanes or branes in sudoku space? If so, could these be identified using, say, cluster analysis or principal-components analysis? And how, if at all, might that be linked to 'difficulty'?

It would be interesting to hear a reasonable response to this question

C

by **JPF** » Thu Apr 19, 2007 1:56 pm

m_b_metcalf wrote:
JPF wrote:What are the distances between the 3 stars ?
Which is the closest to #1 ? etc...

JPF

Hypothesis: first you must put them into canonical form. Then count the number of differences. That is the distance between them (the values themselves, being only symbols, having no numereical relevance).

Mike,

The number of differences of 2 puzzles depends on their canonical forms.

Saying differently if d(A,B) is the number of differences of 2 puzzles A,B and if A'~A (isomorphic) and B'~B then d(A',B') can be different of d(A,B).

[Edit in blue]

JPF

by **ronk** » Thu Apr 19, 2007 2:21 pm

JPF wrote:The number of differences of 2 puzzles depend on their canonical forms.

Saying differently if d(A,B) is the number of differences of 2 puzzles A,B and if A'~A (isomorphic) and B'~B then d(A',B') can be different of d(A,B).

Would you please give an example of d(A,B) <> d(A',B')?

by **JPF** » Thu Apr 19, 2007 5:06 pm

ronk wrote:
JPF wrote:The number of coincidences of 2 puzzles depend on their canonical forms.

Saying differently if d(A,B) is the number of coincidences of 2 puzzles A,B and if A'~A (isomorphic) and B'~B then d(A',B') can be different of d(A,B).

Would you please give an example of d(A,B) <> d(A',B')?

Oops! there is a typo in my last post.

Replace coincidences by differences :
Saying differently if d(A,B) is the number of differences of 2 puzzles A,B and if A'~A (isomorphic) and B'~B then d(A',B') can be different of d(A,B).

Example :

Code: Select all: A : 500000009020100070008000300040600000000050000000207010003000800060004020900000005 m_b_metcalf B : 500000009020100070008000300040002000000050000000706010003000800060004020900000005 StrmCkr * * * *

d(A,B)= 4

using the gsf's canonical forms for these puzzles :

Code: Select all: A' : 000006000007180000000020005008500001300000900060000040002070008040000060900000300 B' : 100050000000089200000700000070000003006000040500002900004000060800010500030000007 * ** ** ** ** * *** ** * ** ** *** * * **** * ** ** * *

d(A',B')= 36

JPF

by **ravel** » Thu Apr 19, 2007 5:52 pm

A possible definition for the distance between 2 puzzles A and B could be the minimum number of operations to get from A to a puzzle, that is equivalent to B.
In a simple form the operations can be restricted to remove and add givens. But in this case a pair of puzzles with the same pattern can have the same distance as a pair with different (non equivalent) patterns.

[Added:]E.g. the 18 17-clues with 14 common givens would be a family of 17-clues with maximum (pairwise) distance 6 then.

by **ronk** » Thu Apr 19, 2007 6:49 pm

JPF wrote:Saying differently if d(A,B) is the number of differences of 2 puzzles A,B and if A'~A (isomorphic) and B'~B then d(A',B') can be different of d(A,B).

So ... d(A,B) = d(A',B'), if A and B are from the same grid

... otherwise, it is indeterminate :?:

by **coloin** » Fri Apr 20, 2007 7:24 pm

JPF wrote:A : 500000009020100070008000300040600000000050000000207010003000800060004020900000005 m_b_metcalf
B : 500000009020100070008000300040002000000050000000706010003000800060004020900000005 StrmCkr
* * * *
d(A,B)= 4

using the gsf's canonical forms for these puzzles :
Code:

A' : 000006000007180000000020005008500001300000900060000040002070008040000060900000300
B' : 100050000000089200000700000070000003006000040500002900004000060800010500030000007
* ** ** ** ** * *** ** * ** ** *** * * **** * ** ** * *
d(A',B')= 36

I am not sure that canonicalizing the puzzles is helpful at all !

The two original puzzles surely are near "neighbours" !

Code: Select all: 20 clue puzzle 17 common clues 2 incorrect clues 1+1 mismatched clues 60 common empty cells

In sudoku space things are possibly not as bad as they seem.............

Take the first pair of random 27 clue minimal puzzles

Code: Select all: puzzle1# 5..2....4.4....972..7...5.......2..34.135.86....6.....1.3.4..2.9....5..88.....7.. puzzle2# .7.6..4.....7....2..3.9817..16.7.5.3..5.......8.56..1.6...542.88..2..............

With clue substitution and row swapping and a 90 rotation I was able to make an isomorph of puzzle 1#

Code: Select all: .x...............x....x.x....x..................x..........x..xx................. Puzzle2 # .7.6..4.....7....2..3.9817..16.7.5.3..5.......8.56..1.6...542.88..2.............. Puzzle1'# 973.2..6.........24...9.1....6..8...2...6.......54..8...7..4.288...5...1.1..7.6.3

Code: Select all: 27 clue puzzle 9 common clues 2 incorrect clues 16+16 mismatched clues 38 common empty cells

I dont think this degree of conformaton will be unususal between two random puzzles - there possibly are transformations which match up even better than this [manual} demonstration.

Two puzzles with the same pattern will have

Code: Select all: z clue puzzle x common clues y incorrext clues 0 mismatched clues [81-z] common empty cells

The sudoku space contains all puzzles from all grids doesnt it ?

C

by **JPF** » Mon Apr 23, 2007 9:59 pm

Let S be a set of puzzles.

Let A and B be 2 puzzles of S :
A={a1,a2,...,a81} ; ai=0,1,2,...,9
B={b1,b2,...,b81} ; bi=0,1,2,...,9

The distance (number of differences) between A and B is :

d(A,B)= sum[f(ai,bi)] ; i=1,...,81

The function f(x,y) is defined as :
f(x,y)=0 if x=y
f(x,y)=1 if x<>y.

It's easy to prove that for any numbers a, b, c :
f(a,b)<=f(a,c)+f(c,b)

This property of f gives S the property of a metric space
with the distance d :
d(A,B)>=0
d(A,B)=0 iff A=B
d(A,B)=d(B,A)
d(A,B)<=d(A,C)+d(C,B)

As mentionned before, if A~A' and B~B' it can happen that d(A,B)<>d(A',B') but it doesn't matter.

Mutable cells and untouchables.

When S is the set of the puzzles with a given pattern,
The definitions of mutable cells and untouchable puzzles are given here .
If a puzzle A has a mutable cell, then it exists a puzzle B (of S) such that d(A,B)=1.
B is a neighbour of A.

a chameleon is a puzzle having n clues and n neighbours.

If a puzzle is untouchable, then d(A,B)>1 for every puzzle B<>A.
The geometric interpretation is that A is "isolated" and has no neighbour.

The distance d gives a formal definition of untouchables level p.
A puzzle A is untouchable level p if d(A,B)>p for every B of S.

JPF

by **coloin** » Tue Apr 24, 2007 7:48 pm

Excellent reference to the chameleon puzzle !

We are dealing with puzzles of the same pattern ...many more distant neighbours are found when you start substituting clues from other positions in the grid - as in the 1off-1on search by gsf when finding more 17 clue sudokus.

This one from the rare patterns found by Ocean with 17 clues

Code: Select all: B:532211111 000100800025000000000060000000702000100000400000500000800090070000000603000000052 +---+---+---+ |...|1..|8..| |.25|...|...| |...|.6.|...| +---+---+---+ |...|7.2|...| |1..|...|4..| |...|5..|...| +---+---+---+ |8..|.9.|.7.| |...|...|6.3| |...|...|.52| +---+---+---+

- there appparently is only one puzzle of this pattern therefore pretty much untouchable.

I cant see this happening very easily with puzzles with "normal" clue counts.

C

by **JPF** » Tue Apr 24, 2007 10:00 pm

coloin wrote:We are dealing with puzzles of the same pattern

no, not necessarily.

coloin wrote:...many more distant neighbours are found when you start substituting clues from other positions in the grid - as in the 1off-1on search by gsf when finding more 17 clue sudokus.

In that case, d=2 ; S is the set of puzzles with the same number of clues (17)

JPF

by **re'born** » Tue Apr 24, 2007 10:48 pm

JPF wrote:As mentionned before, if A~A' and B~B' it can happen that d(A,B)<>d(A',B') but it doesn't matter.
JPF

Let f be an isomorphism of the grid sending A to A'. So we may write A' = f(A). Is it the case that d(A,B) = d(f(A), f(B))?

by **Mauricio** » Wed Apr 25, 2007 2:52 am

JPF wrote:Let A and B be 2 puzzles of S :
A={a1,a2,...,a81} ; ai=0,1,2,...,9
B={b1,b2,...,b81} ; bi=0,1,2,...,9

The distance (number of differences) between A and B is :

d(A,B)= sum[f(ai,bi)] ; i=1,...,81

The function f(x,y) is defined as :
f(x,y)=0 if x=y
f(x,y)=1 if x<>y.

It's easy to prove that for any numbers a, b, c :
f(a,b)<=f(a,c)+f(c,b)

This property of f gives S the property of a metric space
with the distance d :
d(A,B)>=0
d(A,B)=0 iff A=B
d(A,B)=d(B,A)
d(A,B)<=d(A,C)+d(C,B)

As mentionned before, if A~A' and B~B' it can happen that d(A,B)<>d(A',B') but it doesn't matter.

I would think that a better definition of distance is this:

D(A,B)=min{d(A',B'):A~A', B~B'}, using the d function defined by JPF.

In this new metric it is trivial that D(A,B)=D(A',B') if A~A' and B~B'. We would want this, as properties of sudokus besides aesthetics should be independent of isomorphism.

The New Sudoku Players' Forum

Sudoku space

Sudoku space