The idea is to break a band into two subbands (by distributing the boxes) with specifications on which symbols to place in which rows within a subband, and count the parts separately. And so move on by breaking into smaller subbands etc. until we are down to one-box "subbands".

Some definitions:

- N is the set of all non-negative integers, i.e. {0,1,2,..}.

- For any R>=1, N_R is the set {1,2,3,..,R} (a bit inconsistent as 0 lies in N but not N_R, but we can handle that. It's not my fault that the majority of this world believes numbering starts at 1 and not 0).

- For any 1<=L<=R, we define E(L,R) to be the family of all L-size subsets of N_R.

- For any 1<=L<=R and C>=1, we define A(L,R,C) to be the set of functions

a:E(L,R) -> N

with the property that for any x in N_R, Sum_U a(U)=C*L, where the sum is over all U such that x lies in U.

We define an RxC L-subband configuration to be an RxL matrix where the entries are elements of E(C,R*C) with the property that any two entries on the same row or same column are disjoint.

This is connected to RxC L-subbands (bands with L boxes of size RxC following the rules of Sudoku): For an RxC L-subband, we can create its RxC L-subband configuration by letting entry number (i,j) in the RxL-matrix be the set of symbols in row i and box j. So the number of RxC L-subbands is the number of RxC L-subband configurations times C!^(RL), and finally the number of RxC bands is the number of RxC R-subband configurations times C!^(R^2).

If B is an RxC L-subband, we can define a mapping F(B):N_(R*C) -> E(L,R) by mapping a symbol to the set of rows in which it occurs (each row is represented by a number in N_R). For such a map F, this again defines a map a(F):E(L,R)->N by sending a set U to the number of elements x in N_(R*C) such that F(x)=U. Then a(F(B)) lies in A(L,R,C) because every row of B contains C*L different symbols.

For any mapping F:N_(R*C) -> E(L,R), we define Z(L,R,C,F) to be the family of RxC L-subbands B such that F(B)=F. For any a in A(L,R,C), Z(L,R,C,F) is the same for every F such that a(F)=a, and so we define Z(L,R,C,a)=Z(L,R,C,F) for any F such that a(F)=a. Finally we define N(L,R,C,a) to be the size of Z(L,R,C,a).

For the case L=R, there is only one element of A(R,R,C): E(R,R) contains one element, the set N_R, and it has to be mapped to R*C. We call this mapping a_0, and so we have

- Code: Select all
`(1) The number of RxC bands is (C!)^(R^2) * N(R,R,C,a_0)`

(Notice that this does not imply the stronger trivial fact that the number is divisible by (C!)^(R*(R-1)) * (RC)! * (R-1)!, so this division has to be carried out at the end of the algorithm. The (R-1)! factor hasn't been mentioned by others, but comes from the permutations of box 2,..,R)

We now need an algorithm to express N(L,R,C,a) in terms of lower values for L (that corresponds to breaking up an RxC L-subband configuration into smaller subband configurations), and in the other end of the algorithm, to find N(1,R,C,a).

The last part is trivial:

- Code: Select all
`(2) N(1,R,C,a)=1.`

That is because given an RxC 1-subband (i.e. box) configuration B, the mapping F(B) uniquely describes B, and so Z(1,R,C,F(B))={B}.

For the breaking-up part, we need more notations:

Given M such that 1<=M<L. Define E(M,L,R) to be the subset of E(M,R) x E(L,R) of pairs (V,U) such that V is a subset of U. Given a mapping b:E(M,L,R) -> N, we can define a new mapping ^b:E(L-M,L,R)->N by ^b(V,U)=b(U-V,U), and we can define a mapping a(b):E(M,R) -> N by

a(b)(V) = Sum_U b(V,U)

where the sum is over all U in E(L,R) such that U contains V.

Given a mapping a in A(L,R,C), we define the set A(M,L,R,C,a) to be the set of mappings

b:E(M,L,R)->N

such that

1. For every U in E(L,R), Sum_V b(V,U)=a(U) where the sum is over all V in E(M,R) such that U contains V

2. a(b) lies in A(M,R,C)

It can now be proven (details left out) that a mapping b:E(M,L,R)->N lies in A(M,L,R,C,a) if and only if ^b lies in A(L-M,L,R,C,a).

b describes how many symbols go where, when we break up the RxC L-band into an RxC M-band and an RxC (L-M)-band.

The result for N(L,R,C,a) is now done by summing up for N(M,R,C,a(b)) and N(L-M,R,C,a(^b)), and the number of ways to choose which symbols go where as described by the b-mapping. More formally:

- Code: Select all
`[ Prod a(U)! ]`

(3) N(L,R,C,a) = Sum [ N(M,R,C,a(b)) * N(L-M,R,C,a(^b)) * ------------ ]

[ Prod b(V,U)! ]

where

- The sum is over all b in A(M,L,R,C,a)

- The first product is over all U in E(L,R)

- The second product is over all (V,U) in E(M,L,R)

For each step in the algorithm, the result when breaking up should be independent from the choice of M, but I'm pretty sure the fastest result is achieved by letting M be as close to L/2 as possible, i.e. M=L/2 for L even and (L-1)/2 for L odd.

For L=R, a simplification: Symmetries allow us to group the mappings of the first splitting into equivalence classes being orbits of the effect of permuting the elements in N_R. Two elements in the same equivalence class give the same counting result.

[Edit: I have removed another argument saying that for R even and L=R/2, there was a simplification through N(M,R,C,a(b))=N(M,R,C,a(^b)) for any b in A(L,R,R,C,a_0). It only holds for R=2,4, not for R>=6]

A good way to dig into the algorithm is to show that it leads to the 2xC, 3xC and 4xC formulas. (Eh... and also the C |-> C! formula for R=1)