## Su-Doku's maths

Everything about Sudoku that doesn't fit in one of the other sections
To recap:

[steal shamelessly from tbb, with alterations]
How many possible grids are there?

Here's my attempt.

First we consider block 5; then blocks 4 and 6; then blocks 2 and 8; then blocks 1, 3, 5 and 7.

1 - Block 5
In filling in block 5, there are no constraints, and clearly we can fill it in any of 9! ways. For the rest of this post, I will assume this block is filled out as below, as we can go from any grid to one looking like this by relabelling:

123
456
789

How many ways are there of filling blocks 4 and 6? Well, let's consider the rows and columns of blocks 4 and 6 separately - rows first.

2 - Rows of Blocks 4 and 6
In block 4, all numbers must be shifted up a row or down a row (ordering the rows cyclicly, up from row 4 is row 6, and down from row 6 is row 4) as numbers can't occupy a row in block 4 if they occupy that row in block 5. And whatever shift applies in block 4, the opposite shift applies in block 6 (if number 5 is shifted up to row 4 in block 4, it must be shifted down to 6 row in block 6) by applying the same constraint to blocks 4 and 6.

Let's consider then what happens to numbers 4, 5, and 6. We only need count what happens in block 4 as we know the opposite happens in block 6. For each, there are two choices - shift up or shift down - so 2x2x2=8 choices in total. In 2 of these choices, 4, 5 and 6 are all shifted in the same direction (up or down); in the other 6, one is shifted one way and two the other.

Suppose 4, 5 and 6 are all shifted to row 4; where can 7, 8 and 9 go? Well, they can't remain in row 6, and because row 4 is occupied, they can't go there. So they are forced up into row 5, and 1, 2 and 3 are shifted up into row 6. So all the row choices are forced. By symmetry a similar argument applies if 4, 5 and 6 are shifted down.

Now suppose 4 and 6 are all shifted up and 5 is shifted down. Where can 7, 8 and 9 go? Well, they can't remain in row 6, and 2 cells in row 4 are occupied. One could go to row 4, but the other two have to go to row 5. What about 1, 2 and 3? They can't remain in row 4, and two cells in row 5 are occupied, so two of them must go to row 6, which along with number 5 is now fully occupied. So of 7, 8 and 9, exactly two shift up and one down, and of 1, 2 and 3, exactly two shift up and one down. So we need to choose one of 1, 2 and 3, and one of 7, 8 and 9 to move down, and the rest of those numbers move up.

To recap - with block 5 fixed, there are 2+(6x9)=56 ways of assigning numbers to rows in block 4 - just work out what happens to numbers 4, 5 and 6 (up or down, for 8 choices) - in 2 of those 8, they are all shifted the same way and indeed all 9 numbers are shifted that way. In the other 6 choices, one of 4, 5 and 6 move opposite to the others, and we must also pick an opposite mover out of each of {1,2,3} and {7,8,9}. And having made these choices, the assignment of numbers to rows in block 6 are forced.

3 - Columns of Blocks 4 and 6
In block 4, we have assigned 3 numbers to row 4, 3 to row 5, and 3 to row 6. Within each row, there is no constraint - we can assign those numbers to columns arbitrarily. So for each row there are 3! choices to be made, ie (3!)^3 in total. The same applies to block 6.

Taking steps 1, 2 and 3 together, there are 9! x 56 x (3!)^6 ways of populating blocks 4, 5 and 6.

4 - Blocks 2 and 8
By the same argument, all numbers in block 2 are shifted left or right in block 2, relative to their position in block 5, and the opposite applies to block 8 - we simply consider what happens to numbers 2, 5 and 8, and what that forces on each of {1,4,7} and {3,6,9} and come up with 56 ways of assigning numbers to columns in block 2 (which forces assignments to columns in block 8). Numbers can be assigned to rows freely.

So we have 9! x 56^2 x (3!)^12 ways of assigning numbers to blocks 2, 4, 5, 6 and 8.

5 - Blocks 1, 3, 7 and 9
In block 1, all numbers are shifted up or down relative to their position in block 2, and the opposite shift applies to block 3. We have 56 choices as before assuming that none of these choices are prevented by one or more column constraints.

In block 7, all numbers are shifted up or down relative to block 8, and the opposite shift applies to 9 - 56 choices again assuming that none of these choices are prevented by one or more column constraints.

In block 1, all numbers are shifted left or right relative to their position in block 4, and the opposite now applies to 7 - 56 choicesassuming that none of these choices are prevented by one or more row constraints.

Finally, in block 3, all numbers are shifted left or right relative to block 6, and opposite applies to 9 - 56 choices assuming that none of these choices are prevented by one or more row constraints.

I think we have determined that the 56*56 choices are reduced to (at best) 8*56 = 448 choices when the mutual constraints are considered when filling block1 and the row/columns of blocks 3 and 7. There are at most then 8 ways to fill the column/rows of blocks 3 and 7 and block 9

So we now have solutions <= 9! * 56^2 * 3!^12 * 8*56 *8 ways.
factoring:
<= 2^7*3^4*5*7 * 2^6*7^2 * 2^12*3^12 * 2^6*7 * 2^3
<= 2 ^34 * 3^16 * 5 * 7^4
<= 8.878*10^21
less the number of invalid solutions caused by my 'fundamental problem'.

Is that about where we are?
Guest

Nearly right. Steps 1-4 were right. Then to fill in block 1, given that blocks 2 and 4 are already filled in, has at most 448 solutions. Now we try block 3; blocks 1 and 2 are already filled in, but there are also column constraints from block 6 -- it turns out that there are at most 8 possible ways to fill in block 3. Similarly for block 7: we have column constraints from blocks 1 and 4, and row constraints from block 8 -- again we have 8 possibilities. Then everything is filled in except block 9, which then has at most 1 solution. So your recap omits one factor of 8 (since blocks 3 and 7 EACH have at most 8 possibilities).

Frazer
Guest

afj wrote:Nearly right. Steps 1-4 were right. Then to fill in block 1, given that blocks 2 and 4 are already filled in, has at most 448 solutions. Now we try block 3; blocks 1 and 2 are already filled in, but there are also column constraints from block 6 -- it turns out that there are at most 8 possible ways to fill in block 3. Similarly for block 7: we have column constraints from blocks 1 and 4, and row constraints from block 8 -- again we have 8 possibilities. Then everything is filled in except block 9, which then has at most 1 solution. So your recap omits one factor of 8 (since blocks 3 and 7 EACH have at most 8 possibilities).

Frazer

The extra 8 in your answer seemd a bit excessive to me , so I did some trial and error:

Conclusion: When blocks 1,2,3,4,5,6 and 8 are filled via tbb's block sequence, then there are at most 2 (not 8) ways left to fill blocks 7 and 9. As soon as any one of the two remaining viable numbers are chosen in any of these 18 cells, it creates a 'cascade' where all 17 other choices become forced. The only two values that seems to apply are 0 or two: either two choices are available, or there is no way to populate these two blocks.

So my answer needs to be doubled, not multiplied by 8.

For an examples of this:

Exactly two valid solutions:
456 312 789
789 645 123
123 978 456

645 123 978
978 456 312
312 789 645

*** 231 ***
*** 564 ***
*** 897 ***

Fill in the 'pencilmarks' and it becomes obvious that there only two solutions.

No valid solutions:
978 312 645
312 645 897
645 978 231

564 123 759
789 456 123
123 789 468

*** 531 ***
*** 294 ***
*** 867 ***

In this sample, r9c3 must be a 1, and r9c8 must be a 1, but they cannot BOTH be a 1
Guest

No, it's definitely 8 (i.e. 2 ^ 3), because mini-col (3 cells within a box) can be resolved 2 ways, independently- here are three to be going with...

5 6 4 2 3 1 8 9 7
8 9 7 5 6 4 2 3 1
2 3 1 8 9 7 5 6 4

5 9 4 2 3 1 8 6 7
8 3 7 5 6 4 2 9 1
2 6 1 8 9 7 5 3 4

5 6 7 2 3 1 8 9 4
8 9 1 5 6 4 2 3 7
2 3 4 8 9 7 5 6 1

Also, in the last example, you have two 5s on row 4, and two 8s on row 6, so it's not surprising it can't be resolved!
Guest

I was thinking briefly about the multiplication by 8 or 2. In the worst case, which is very rare, there may be 8 ways to fill in the final two blocks. This only happens if the columns of the final two blocks are the same; for example, if, say, columns 1 and 7 both consist of the numbers 1, 4 and 7 in some order, similarly for columns 2 and 8 and columns 3 and 9:

123 *** 456
456 *** 789
789 *** 123

I haven't expressed this very clearly! Usually, there will be only 0 or 2 ways, as tinfoil suggests. 4 is also possible; for example, take the sudoku grid on the front page of this site, and copy out blocks 1-7. Then blocks 8 and 9 can be filled in in 4 ways (columns 5 and 7 have the same numbers,
and columns {4,6} and {8,9} also do).

In summary, the idea of filling in the final two blocks simultaneously stands a good chance of cutting down on computations - thanks tinfoil!

Frazer
Guest

Okay. yet another approach, involving conditional probability.

Step A) We have already determined that there are C1=9!*56*3!^6 ways to populate the first chute.

For the sake of further discussion, let's assume that the chutes are vertical (e.g. blocks 1,4,7).

If we ignored row constraints, then there would be C1^3 ways to fill the three vertical chutes. Obviously, we are going to run into row issues, but I'll get to those in a bit. However, these possible arrangements ALL satisfy the block and column constraints.

Step B) Unproved assertion: Given three independently filled chutes that satisfy step A, the likelihood of a particular value 1-9 being in a given box is 1 in 9. That is to say that if you were to lay out all of the possible combinations, then exactly 1/9th of them would have a 1 in cell (1,1), and exactly 1/9th of them would have a 5, say, in cell (2,7), etc.

Now all we have to do is figure out what proportion of the Step A arrangements satisfy the row constraints.

Step C) Lets assume that the first block contains
123
456
789
It doesn't matter what it is, as we all know we can substitute 9! ways.

Each of my candidate chutes in step A have the 1,2, and 3 in block 2 SOMEWHERE. There is a 6 in 9 chance that the 1 did not wind up in the top row. When it did not, there is a 5 in 8 chance that the 2 did not either, and of those, there is a 4 in 7 change that the 3 also missed the top row, for a 6/9 * 5/8 * 4/7 = 120/504 = 0.238095 probability that the 1, 2, and 3 in block 2 happened to miss the top row (or 23.8%). In all other cases, at least one of (123) is in the top row, making that set of three chutes an invalid Soduku solution.

Step D) {The algrebra continues to get worse!} When step C is satisfied, there are two possible cases: digits (123) all happen to be in the same row (either 2 or 3), or one is in one row and the other two are in the other.

Well, 1 is in SOME row. The probability of 2 being in the same row is 2 in 5. The probability of 3 being in that same row is 1 in 4 (that is, there are only 4 places left that satisfy step C). So (123) is in the same row 2/5 * 1/4 = 1/10 = 0.1000 of the time. Therefore (123) is divided between the two rows with a proability of .90000, (given step C).

We now have two resolutions of step C, with respective probabilities of .1 and .9.

Step E(same row) If (123) are in the same row (either 2 or 3), then one of the two remaining rows from block 1 MUST ALL lie in row 1 for this to be a valid Soduku. For example, if (123) is in row 2, then (789) must lie in row 1. This will happen 3/6 * 2/5 * 1/4 = 6/120 = 1/20 = 0.05 = 5% of the time.

Step E(different row) If (123) lies in two different rows, then one of rows 2 or 3 has two of these numbers in it. It doesn't matter which one. For the sake of discussion, row 3 contains (23) and row 2 contains (1). {For those of you who think I'm being careless or cavalier about placing these digits, remember that, in every case, I'm only talking about those layouts that satify the '3 correct chutes' of step A}.

For this to be a valid grid, the remaining cell of row 3 cannot contain any of (789). Given that (123) is now placed, then 3 of the 6 remaining values are good, and 3 bad, for a probability of 0.50 that none of (789) is in row 3. {alternatively, 7 has a 5/6 chance of missing the row; 8 has a 4/5 chance (given that 7 missed), and 9 has a 3/4 chance (given that 7 and 8 missed, for a 5/6 * 4/5 * 3/4 = 60/120 = 1/2 = 0.50 probability}.

In addition, row 2's two vacant spots MUST be filled with two of (789), since none of (456) is allowed to be there. We have already eliminated the ones where one of (789) went to row 3, so there are 5 spots for (789) to go (row 1 is 'empty', row 2 has one spot taken). There are 10 possible ways to arrange three digits in 5 spaces, and exactly 3 of them will have two particular spots filled. Therefore, there is a 0.30 chance that (789) happen to be in good spots, given that they are not in row3.

If you're still with me, then we have a cumulative probability of 0.5 * 0.3 = 0.15 of a good block, for the 90% of the cases where (123) were divided into two separate rows, given that none of (123) is in the top row.

Step F We're getting close...
So to consolidate this, there's a .238095 chance that none of (123) of block2 is not in the top row. Of these, there's a (.100 * .050) = 0.005 likelihood that (123) are all in the same row leading to a valid grid, and a (.900 * .15) = 0.135 likehood of a good block otherwise. We ADD these conditional probabilities to get a 0.140 chance given a valid placement of (123), arriving at .0140*.0238095 = 0.03333333 overall. In other words, for random pairing of valid chutes, there is only a .033333 chance of block2 and block1 NOT violating the 'rows constaint'. This happens to be a nice round 1 in 30 chance (reciprical of .033333).

Step G There's a 1 in 30 chance that, given block1, that block2 satisfies the row constraint. There's similarly a 1 in 30 chance that block3, given block1, satisfies the constraint, AND a 1 in 30 chance that block3, GIVEN BLOCK2, satisfies the row constraint. Overall, there's only a 1 in 30^3 probability that blocks 1,2 and 3 happen to satisfy the row constraints, given three independantly arranged valid vertical chutes.

Step H Finally, there's only a 1 in (30^3)^3 chance that ALL THREE of the horizontal chutes turn out to be valid. Therefore, overall, there must be C1^3 / (30^9) valid solutions to the Soduku puzzle.

or [(9!)*56*3!^6]^3/(30^9) ways.

This evaluates to 4.3299 * 10^22 valid ways.
Guest

tinfoil's new method looks promising. Once again, though, the final answer isn't an integer, so it can't actually be exactly correct, although it may turn out to be a good approximation. So where does the argument go wrong?

I believe Step B should be true; I think the problem is with Step G. The probability of filling in block 3 given blocks 1 and 2, (probably) isn't the product of the probability of filling in block 3 given block 1 and the probability of filling in block 3 given block 2.

Indeed, since you are essentially asking in Steps C-G for the probability of filling in blocks 1, 2 and 3 by inputting random digits from 1-9, we can work this out as the number of valid ways of filling blocks 1, 2 and 3 (9!*56*6^6) divided by the number of ways of filling in the blocks with 1-9 (9!^3). This value is 1/50400, whereas 1/30^3 is 1/27000. (This is exactly your earlier probabilistic method.) In Step H, we would then get C1^3/50400^3, which works out at 832135577110689072807936/125 in lowest terms, or 6657084616885512582463.488, approximately 6.65x10^21, as you said before. (I really like your earlier method!)

Frazer
Guest

### sudoku maths

Keep going with this challenging problem.

One thought

The rotational element will mean that the same grid can be viewed 4 ways.

coloin

Posts: 1737
Joined: 05 May 2005

afj wrote:tinfoil's new method looks promising. Once again, though, the final answer isn't an integer, so it can't actually be exactly correct, although it may turn out to be a good approximation. So where does the argument go wrong?

I believe Step B should be true; I think the problem is with Step G. The probability of filling in block 3 given blocks 1 and 2, (probably) isn't the product of the probability of filling in block 3 given block 1 and the probability of filling in block 3 given block 2.

Indeed, since you are essentially asking in Steps C-G for the probability of filling in blocks 1, 2 and 3 by inputting random digits from 1-9, we can work this out as the number of valid ways of filling blocks 1, 2 and 3 (9!*56*6^6) divided by the number of ways of filling in the blocks with 1-9 (9!^3). This value is 1/50400, whereas 1/30^3 is 1/27000. (This is exactly your earlier probabilistic method.) In Step H, we would then get C1^3/50400^3, which works out at 832135577110689072807936/125 in lowest terms, or 6657084616885512582463.488, approximately 6.65x10^21, as you said before. (I really like your earlier method!)

Frazer

Given your comments, I returned to my probability tree and continued with filling block 3, given that block 2 was filled one of the two valid ways (123 on same row in block2 vs. 123 on two different rows in block 2).

Probability of filling block3 given (block2, 123 same row) is 0.000019132653061
Probability of filling block3 given (block2, 123 same row) is 0.000000708616780
Sum of these two probabilities is 0.000019841269841

The inverse of this number is (drum roll): 50400 (please pretend to be surprised).

This means that afj's answer is now the same as mine.

I cannot find a flaw in my analysis, but I am disturbed because:
1) The answer obviously has to be an integer
2) Furthermore, the answer must be evenly divisible by 9!, since every valid solution has to have this many analogs available through simple digit substitution.

To make these extremely large numbers slightly more manageable, I am going to modify my attempts to try to find the number of solutions, GIVEN that one arbitrary block contains
123
456
789
and then just multiply my final answer by 9! when I'm done to find all of the analogs.
Guest

Oops, bad copy and paste error in previous post:

Probability of filling block3 given (block2, 123 same row) is 0.000019132653061
Probability of filling block3 given (block2, 123 different row) is 0.000000708616780
Sum of these two probabilities is 0.000019841269841
Guest

When tinfoil writes "this means that afj's answer is now the same as mine", I'd love to claim that this was my answer - but it's actually the answer from tinfoil's first attempt at a probabilistic analysis from a couple of weeks ago! But I'm glad that I can help explain that your second method gives the same answer as your first attempt.

(1) I don't think you can expect that the events in Step H are completely independent, so that you can't multiply them to get the overall probability. But I'd be surprised if the actual answer was far different. (As I said at the time, I'd be surprised if the actual answer were more than a few percent different; my feeling is that they should be pretty close to being independent).

(2) Combining the choice of first block with generator methods, the final answer must be divisible by 9!*72^2=1881169920. Indeed, fixing the first block, you can permute columns 4, 5 and 6 in 6 ways, and similarly columns 7, 8 and 9, rows 4, 5 and 6, and rows 7, 8 and 9. You can also exchange the middle three columns 4-6 with the three rightmost columns 7-9 (and similarly for rows). All this gives 6^4*2^2=72^2 ways in which you can adapt a given sudoku grid with prescribed first block into another with the same first block.

Incidentally, it forms the basis of the approach I suggested a week or so ago with an outer loop of about 35000^2; fixing the first block, there are 56*6^6 ways to fill in blocks 2 and 3 - but we can divide this by 72 to account for the symmetries in this situation. (56*6^6/72=36288.) I was going to run through all ways to fill in blocks 2 and 3; 4 and 7, go through all the ways (at most 448) to fill in block 5, and try to extend each to a full grid. It would be easy to write a program to do this, but it's not worth running it unless the program is fully optimised - it could easily take months (and probably years) to run, and I'm not good enough at programming!

Frazer
Guest

### Combinations possible

afj wrote:A quick calculation suggests that the number of ways of filling the top 3 rows is 948019639680.
<snip>
All this gives a total of
9! * (6^6) * (2 + 18*3) = 948019639680 possibilities,
if I've got this right.

Frazer

Once an agreed result is arrived at, another question raises its quizzical head. What constitutes a unique solution?

For example, is a solution different from the same solution rotated through 90 degrees, 180 degrees, 270 degrees?

Is it different from the same solution rotated 180 degrees about the middle column 5, or rotated about the middle row 5?

Is it different from the same solution with a one-for-one transposition of any or all of the digits (e.g. swap 1 and 9; e.g. increment each value by 1, with the 9 swapping to 1 etc etc)

If we accept that such variants are really just the same solution, then we would need to divide the above result by a non-uniqueness factor.

New puzzle - what is that non-uniqueness factor

Terry NZ
Guest

### All the calculation

All the calculation make me faint...anyway, may I ask where can I find the steps to generate/create a new Sudoku puzzle using hand? (without using software)

Thank you!
Guest

IJ wrote:Well, it certainly can't be both, or others, it has to be something!

Thinking a bit harder about it, the possibilities are that the first three cells on row 2 include 0,1,2 or 3 of the top 3 in box 2. This gives four possibilities for the matrix on the second line:

a) 6 5 4 3 2 1
b) 5 4 3 4 3 2
c) 4 3 2 5 4 3
d) 3 2 1 6 5 4

the middle two multiply out to 1440, and the other two to 720, so to get the highest number of prossibilities, I think the first three boxes looks like this

9 8 7 6 5 4 3 2 1
5 4 3 4 3 2 3 2 1
3 2 1 3 2 1 3 2 1

I still think I'm missing a trick though. I'm half wondering if you actually need to add together the possibilities from four boxes, each with the second line starting with one of the above options. This would also need to be done for the other points of conflict (e.g. where I put 6 6 6 on row 4).

However you do it - it's a big number!

I don't think this method can work (at least not trivially). Given B1 filled there are 12096 ways to fill B2 and 2612736 ways to fill B2 and B3 (done by exhaustive search)

(As a second check, given B1 and B2 filled there must be 6*6*6=216 ways to fill B3 - 2612736 / 216 = 12096)

Now the prime factorization of 12096 is 2^6*3^3*7 - note there is no factor of 5 in this but there is a factor of 7.

We can use these figures to put an upper bound on the number though.
N(B1)*N(B2)*N(B5)*N(B6)*N(B9)*N(B4)*N(B8)*N(B3)*N(B7)=
9! * 12096 * 12096 * 12096 * 12096 * 216 * 216 * 1 * 1 = 362441273585989018378567680
(3.6*10^26)

I _think_ a lower bound is:
N(B1)*N(B2)*N(B3)*N(B4)*N(B7)=
9!*12096*216*12096*216=2477160187538964480 (2.5*10^18)

Assuming this lower bound is correct (which I haven't yet proved), we can improve it a bit by realizing that B1, B4 and B9 can be arbitrarily filled. Given that each of these have 9! ways of being filled, the final number must be a multiple of 5^3. But the lower bound above is only a multiple of 5 so we can guarantee that there is another factor of 25 in the remaining boxes somewhere pushing the lower bound to 6.2*10^19.

Tim.
Guest

Posts: 312
Joined: 25 November 2005

afj wrote:A quick calculation suggests that the number of ways of filling the top 3 rows is 948019639680. Here's how I've got this (I hope it's right!):

<snip>

9! * (6^6) * (2 + 18*3) = 948019639680 possibilities,

if I've got this right.

Frazer

Apart from a typo, I agree with this:
948019639680 should be 948109639680 and this agrees with my exhaustive search to fill b2 and b3.
Guest

Posts: 312
Joined: 25 November 2005

PreviousNext