The New Sudoku Players' Forum

by **DrZ** » Tue Jan 24, 2006 4:09 pm

I just discovered these fantastic puzzles last week, they are truly amazing. I have a doctorate in Communications Theory, and the similarity to erasure correcting codes is striking. I was floored by how few clues are required to solve the puzzles, due to the high level of mutual information encoded in the permuations. In particular, it struck me that that the hardest puzzles probably approach the Singleton bound (more on this in a minute).

This afternoon I started analyzing the mathematics, in particular, trying to figure out how to create the puzzles, and determine the cardinality.
I coded up a recursive puzzle solver, and quickly realized that determining the number of Sudoku 3x3 grids is not exactly trivial.

Naturally, I looked up the Wikipedia article to learn more, and was pleasantly surprised to find the in-depth analysis posted there (which apparently is gleaned from this forum). The analysis of the cardinality is impressive, and it's great to hear that the results have been confirmed independently. However, I was very surprised to read that the minimum number of required clues hadn't been solved yet.

I think I've got a solution, and it's remarkably simple and satisfying. But since I'm new to this, I may have missed something. I initially conjectured when I was figuring out how to encode the puzzles that a reasonable strategy is to start with the 3 matricies on the diagonal (or the anti-diagonal), with 9!^3 possible permuations, which would seem to be an independent basis for forming (in some way) a valid grid. Does anyone know of a counter-example?

It is not surprising that very few of these permuations would have a unique solution without filling in more of the squares. There are probably at least a few permutations that are uniquely determined by *only* these 3*9=27 squares, but this isn't that important, so long as
there is at least one solution.

Here's where the theory of Erasure-Correcting Codes comes in.
It boils down to "intrinsic" vs. "extrinsic" information. These terms come from Turbo codes, which in fact share a lot of parallels to these Sudoku puzzles, with one of the key differences being that the measurements are typically assumed to be corrupted by additive Gaussian noise rather than being erased. The transmitter sends an uncoded bit-stream along with two convolutionally-coded bit-streams, where the second bit-stream first passes through an interleaver to scramble it up. The decoding is accomplished iteratively using two MAP (Maximum Aposteriori Probability) decoders that operate in parallel, given an estimate of the noise variance, to make "soft-decisions" (typically a log-likelihood ratio) about how likely it is that each bit is a 0 or a 1. Each decoder ideally has an independent sequence of noise samples due to the way that the interleavers work. The details are somewhat complicated, but in essence, you can think of the two coded bit streams as providing an indepdendent "row" and "column". The trick to making the iterative feedback work is that the two MAP decoders (the "rows" and "columns") need to separate out the "intrinsic" information and only exchange "extrinsic" information.

In Sudoku, the only "intrinsic" information seems to be contained in
the 27 entries along either the diagonal (or anti-diagonal). In other words, these are the only cells where you have complete freedom to choose any of the 9! permutations without affecting eachother.
Making any further entries in any of the other 81-27=54 squares creates mutual information between the rows and columns, and is thus "extrinsic".

You can think of Sudoku as a non-binary rate (N,K) Erasure-Correcting Block code that uses a base B=9 alphabet. There are N=81 total grid positions (9^81 codewords, most are *not* Sudoku grids), and of them, only K=27 convey the intrinsic information. The "intrinsic information" consists of the 9!^3 possible permuations along either the diagonal or anti-diagonal, since these are the only permutations that can be chosen independently. Since we have a base-9 alphabet, the total amount of intrinisic information is K = ceil[Log_9(9!^3)]=18. You can think of the remaining 54 grid positions as containing parity symbols with "extrinsic" mutual information.

The theory of Error (or Erasure) Correcting codes tends to be rather arcane. But the most fundamental bounds (most originally derived by Shannon) are extremely simple and elegant, somewhat akin to Maxwell's equations. For *any* rate (N,K) linear block code, the minimum distance is *upper* bounded by the Singleton bound:

Dmin <= N-K+1

For a binary code, B=2 and the "distance" between any pair of the 2^K codewords is measured by the Hamming weight, which is the total number of bits that differ. The minimum distance is the smallest possible distance between any pair of 2^K codewords. And the Singleton bound is the largest possible value of the minimum distance. Codes that yield equality, i.e. Dmin = N-K+1, are called MDS (Minimum-Distance Seperable). Reed-Solomon codes are MDS, and in fact also have some interesting connections to Latin squares. In Sudoku, we aren't that lucky, otherwise every puzzle would be soluble given only 17 clues, which is clearly an exceptional circumstance.

The minimum distance corresponds to the number of "Erasures" that a code can correct. The number of "Errors" that can be corrected is *half* the minimum distance, because some of the information is unreliable. Imagine how much harder Sudoku would be if some of the clues were wrong, and we had to figure out which ones had to be fixed! We are much more lucky in Sudoku, because the missing numbers are simply erased, and we have complete confidence that every clue is correct.

Of the N=81 total positions, we wish to find Cmin, the minimum possible number of clues that allow us to still solve the puzzle. Since up to Dmin "Erasures" can be corrected, this means that

Dmin=N-Cmin,

i.e., we can fix up to 81-Cmin erasures in the best case.
Substituting this into the left-hand side of the Singleton bound,

N-Cmin <= N-K+1
-Cmin <=-K+1
+Cmin >= K-1,

where we multiplied by -1 and thus had to reverse the inequality.
Substituting for K,

Cmin >= Ceil[ Log_9(9!^3) ]-1
Cmin >= Ceil[ 17.479 ] -1

Therefore, Cmin >= 17.
Q.E.D.

Dr. Brandon Zeidler

by **frazer** » Tue Jan 24, 2006 6:01 pm

What an excellent argument. The idea of using information-theory had vaguely crossed my mind too, but I didn't really think about it hard at all. Your argument seems nice.

Just a few points (asked by a number theorist with just an undergraduate knowledge of coding theory, not a specialist, so I hope that they aren't too embarrassing!):

>I initially conjectured when I was figuring out how to encode the puzzles that a reasonable strategy is to start with the 3 matricies on the diagonal (or the anti-diagonal), with 9!^3 possible permuations, which would seem to be an independent basis for forming (in some way) a valid grid. Does anyone know of a counter-example?

Clearly they aren't completely independent in the probabilistic sense, otherwise the total number of grids would be divisible by (9!)^3, and it isn't. But that probably isn't quite what you meant, nor is it really what you need, if I've understood the argument. But it is surely true that any choice of the 27 elements of the leading block can be completed to a sudoku in many ways (I'd be extremely surprised if there were any sudoku grids uniquely determined by their diagonal blocks). In any case, it would be easy to check computationally (not by me -- I'm rubbish at computing, as my earlier posts proved). Without loss of generality, the top left diagonal block is 123/456/789, and the other two (after permuting rows and columns) are 1ab/cde/fgh, with a<b and c<f, only (8!/4)^2 possibilities for each; relabelling and permuting blocks if necessary should reduce this by a factor of 6. So only around 20,000,000 to check.

> In Sudoku, the only "intrinsic" information seems to be contained in
the 27 entries along either the diagonal (or anti-diagonal). In other words, these are the only cells where you have complete freedom to choose any of the 9! permutations without affecting eachother.

(Or shifts of the diagonal, like blocks 2,6, 7 etc.)

> For *any* rate (N,K) linear block code, the minimum distance is *upper* bounded by the Singleton bound:

In what sense is a sudoku grid a *linear* code? The proof of the Singleton bound I know uses the parity check matrix, so the linearity condition is presumably essential to that proof. But this would require some underlying field (Z/9Z is not a field, although there is of course a field with 9 elements), and some vector space structure on the sudoku grids...

If this last point demolishes your argument, I apologise! It is, as you say, very simple and satisfying, so I hope that your argument will be fine! I like the idea of using information theory very much.

Best wishes,
Frazer

by **Moschopulus** » Tue Jan 24, 2006 8:30 pm

Yes, the information theory approach has been in the air. Well done on the nice argument. I'm not yet convinced, but I'm sure you can fill in the details. A summary of the proof might help us understand.

DrZ wrote:It is not surprising that very few of these permuations would have a unique solution without filling in more of the squares. There are probably at least a few permutations that are uniquely determined by *only* these 3*9=27 squares, but this isn't that important, so long as
there is at least one solution.

None are uniquely determined by these 3 boxes. Such a puzzle has 6 empty boxes, and you can't even have 5 empty boxes.
But this isn't important to your argument.

DrZ wrote:Since we have a base-9 alphabet, the total amount of intrinisic information is K = ceil[Log_9(9!^3)]=18. You can think of the remaining 54 grid positions as containing parity symbols with "extrinsic" mutual information.

Unfortunately I don't know anything about turbo codes. You are fixing a completed sudoku grid here, am I correct? You are claiming that in any completed sudoku grid, the amount of intrinsic information (whatever that is) is 18.
I'm not fully convinced, but leave that for now.

DrZ wrote:The minimum distance corresponds to the number of "Erasures" that a code can correct.

Do you mean the distance between two different completed sudoku grids?
Could you define this "distance" please. Do you mean the number of places that the two grids differ? It is possible for two grids to differ in 4 places.

DrZ wrote:Of the N=81 total positions, we wish to find Cmin, the minimum possible number of clues that allow us to still solve the puzzle. Since up to Dmin "Erasures" can be corrected, this means that

Dmin=N-Cmin,

as I said above, please could you explain what Dmin is.

An error-correcting code with minimum distance d can correct d-1 erasures.

I understand that Cmin is the minimum number of clues in our given completed grid (which is arbitrary) that uniquely determine the grid.
At least I think you are fixing a grid but I'm not sure.

Thank you very much.

[Frazer, the Singleton bound does hold for nonlinear codes. In an (n,M,d) code you can puncture (remove) d-1 coordinates and still have distinct codewords left. So M <= q^{n-(d-1)}. But I think there's something else bothering you...]

by **Red Ed** » Tue Jan 24, 2006 10:14 pm

frazer wrote:Without loss of generality, the top left diagonal block is 123/456/789, and the other two (after permuting rows and columns) are 1ab/cde/fgh, with a<b and c<f, only (8!/4)^2 possibilities for each; relabelling and permuting blocks if necessary should reduce this by a factor of 6. So only around 20,000,000 to check.

The code is running now: will take a couple of hours to finish. I'll be very surprised if it finds a contradiction to DrZ's assertion that any B1/5/9 arrangement is contained in at least one solution grid (or, as Moschopulus points out, many).

EDIT: okay, finished now, with the B1/5/9 claim confirmed as expected.

by **Wolfgang** » Tue Jan 24, 2006 10:21 pm

Sorry, but for me it looks like nice rubbish, with a "result", we all expected.

by **Red Ed** » Tue Jan 24, 2006 10:59 pm

Wolfgang wrote:Sorry, but for me it looks like nice rubbish, with a "result", we all expected.

I agree. I think this is a red herring dressed up in (or disguised by) communications theory jargon. If you delete all the mumbo-jumbo about intrinsic vs. extrinsic information and reduce what's left down to a bare-bones counting argument then it just doesn't hold together.

by **gfroyle** » Wed Jan 25, 2006 3:53 am

I think the answer is 17, but this argument does not convince me.

The main issue I have is that I cannot see where the argument uses the defining conditions for being a Sudoku.. we already know that by adding an minor extra condition (like Sudoku X) we can find 13-clue uniquely completable puzzles.

As far as I can see, the argument given here would proceed unchanged for any variant of Sudoku that allows the three main diagonal boxes to be freely chosen. In particular, we can impose a condition on B2/B6/B7 that converts the puzzle to one equivalent to Sudoku X.

Cheers

Gordon

by **coloin** » Wed Jan 25, 2006 1:23 pm

Without knowing anything about Communication theory or erasure correcting theory, if the theory exists and the similarities to sudoku grids exist then this maybe valid.

The diagonal boxes with 27 clues

Code: Select all: xxx...... xxx...... xxx...... ...xxx... ...xxx... ...xxx... ......xxx ......xxx ......xxx

To go along with Gordon's counter

The diagonal boxes with "free" clues in Sudoku X

Code: Select all: .xx...... x.x...... xx....... ....x.... ...x.x... ....x.... .......xx ......x.x ......xx.

number of ways to fill...is this what we count ?.....? approx.[9!/3!]*[6!/3!]*[4!] [not sure about this][probably it needs to be crunched]

Log_9 and subtract 1
Is this less than 13 ?...... probably !

We could even have a provisional answer for the 4*4
Log_16 [1.9*e^53] - 1

by **Ocean** » Wed Jan 25, 2006 1:35 pm

Interesting argument, and interesting discussion. One crucial point is how the sudoku rules apply...

Are the sudoku rules simply 'covered' (implicitly) by this condition:

DrZ wrote:... with 9!^3 possible permuations, which would seem to be an independent basis for forming (in some way) a valid grid. Does anyone know of a counter-example?
... so long as there is at least one solution...

Conjecture 1: For a normal 3x3 sudoku: All 9!^3 choices of clues in B1/B5/B9 result in valid sudoku grids.

Proof - as shown by

Red Ed wrote:"EDIT: okay, finished now, with the B1/5/9 claim confirmed as expected.

While for sudoku-x:

gfroyle wrote:"As far as I can see, the argument given here would proceed unchanged for any variant of Sudoku that allows the three main diagonal boxes to be freely chosen. In particular, we can impose a condition on B2/B6/B7 that converts the puzzle to one equivalent to Sudoku X.

Conjecture 2: If we impose a suduoku-x type condition on B2/B6/B7, then the the B1/B5/B9-clues cannot be picked 'freely' in 9!^3 ways - some choices do not result in a valid 'sudoku-x' grid.

A similar argument for 2x2-grids:

Conjecture 3: For for a 2x2-sudoku: The 4!^2 choices of clues in B1/B4 do not always result in a valid grid.

Proof by counter-example: 1200340000130024 is not a valid choice, no grid can have this combination.

by **coloin** » Wed Jan 25, 2006 5:29 pm

I was beginning to think Dr. Z was spoofing...
Why the Log ? Why -1.........

If Dr Z comes back he will explain.......

I think it works for the 2*2 ?

Code: Select all: xx.. xx.. ...x ..x.

4!*4!/2! = 288 ways.......i've not seen this number [number of grids]worked out this way either !

What is the minimum number of clues for a 2*2 ?

I havnt seen it anywhere but

Code: Select all: 1... ..2. .3.. ...4 4 clues gives 1243 3421 4312 2134

and log_4 of 288 .......im getting lost here...[log_10 10000 is 4 ? right]

if its 4.1 then, 4.1 - 1 is 3.1

therefore Cmin >= 4

"Q.E.D." again

by **Wolfgang** » Wed Jan 25, 2006 9:00 pm

Since the discussion is continued (what i did not expect), just one point, why i think its rubbish, without understanding anything about communication theory: DrZ says we can think of the "extrinsic" 54 cells as containing parity symbols, ie redundant information. But this would only be the case, if the diagonal boxes would define a grid, which we all know does not. So before this (and some other points) is clarified, i will not waste another thought on this "proof".

by **frazer** » Wed Jan 25, 2006 9:24 pm

> I was beginning to think Dr. Z was spoofing...

I think that you are doing him an injustice! Recall that the total number of grids is about 6.7x10^21. Imagine a blank grid, but with just one square filled in. You would think that the number of sudoku grids with that square correct is 1/9 of the total number. Now imagine filling in 2 squares. Again, you would imagine that the total number of sudoku grids with those 2 squares would be 1/9^2 of the total... Now imagine filling in 23 squares. Then you would imagine that the number of solutions is 1/9^23 of 6.7x10^21, which is just under 1. That is, if you fill in 23 squares, you expect that to be enough information for there to be (about) 1 solution, on average. This is because 9^23 is of the same order as 6.7x10^21. Equivalently, log_9(6.7x10^21) is about 23. If you like, the "information" contained in a complete sudoku grid is, on average, (roughly) the same as that got by specifying 23 squares.

DrZ has specified the three diagonal blocks, with (9!)^3 possible arrangements, and has pointed out that the information contained in specifying these is log_9((9!)^3) for the same reasons; this number is about 17.5. So the information contained in the diagonal blocks is equivalent to specifying 17.5 random squares on a grid -- just because (9!)^3 is about 9^(17.5).

It's a good idea to check the method on the 2x2 case!

by **kjellfp** » Wed Jan 25, 2006 10:12 pm

The math used here is not my field, so I will be careful about judging whether this holds or not. I'd like someone who can follow DrZ's post, to proofread it. If it holds, it's a remarkable result as I up to now was sure that the only way to prove it was by finding all 17s and prove the list to be complete.

Btw, can this theory be used to prove the non-existence of symmetrical 17s? I doubt it, but would like to here any thought on this.

by **Red Ed** » Wed Jan 25, 2006 10:53 pm

I'll only be happy when the argument is redone in basic jargon-free terms. The Singleton bound, for example, is pretty trivial and can be derived rather than quoted (which will lose some readers) at the cost of very little space. And I am sure that the "intrinsic" vs. "extrinsic" stuff is just context -- not to mention the digression into Turbo coding! -- that adds nothing to the business of actually proving the claim. If DrZ's aim is to prove something, he should get on with doing just that and stop peppering the discussion with tidbits from his doctorate that frankly I think we (or at least I) could do without.

Don't get me wrong -- I do like the spirit of his argument and appreciate the effort to settle a longstanding question!

by **Moschopulus** » Thu Jan 26, 2006 1:23 am

frazer wrote:It's a good idea to check the method on the 2x2 case!

Dr Brandon Zeidler's argument, if correct, would work for the nxn case. His argument would give a lower bound for the minimum number of clues.

[Edited]

In the 2x2 case we need log_4 (4!^2)=4.585, so the formula gives 4 for the lower bound, which is the correct answer.

In the 3x3 case we need log_9 (9!^3)=17.48, so the formula gives 17 for the lower bound, which we all believe is correct.

In the 4x4 case we need log_16 (16!^4)=44.25, so the formula gives 44 for the lower bound. It's hard to believe this is the correct answer. There were posts a while back looking for the minimum number of clues in 16x16 sudoku, and a puzzle with 85 clues was already hard enough to find. I don't believe one was found with fewer than 80 clues. (What is the lowest known?)

I hope Dr Brandon Zeidler replies to our questions about his proof.

The New Sudoku Players' Forum

Proof of Minimum Clue >= 17

Proof of Minimum Clue >= 17

Re: Proof of Minimum Clue >= 17

Re: Proof of Minimum Clue >= 17