(*: perfect = no bias, correlations etc.)
I have coded up one such generator as follows (using the Mersenne Twister as my not-quite-perfect RNG; thanks Mike Metcalf for the tip-off).
One-off work
- Define a 'stub' to be a grid with nine 1s, nine 2s and nine 3s filled in, but no other digits. There are T(3) = 5196557037312 stubs.
- Collect the stubs into isomorphism classes; two stubs are in the same class iff they are isomorphic (in the usual sense). It turns out that there are just 259272 classes, which we tabulate as follows:
- stub(k) = a representative grid;
- csz(k) = the class size, i.e. the number of equivalent stubs (~20 million on average);
- ext(k) = the number of ways of extending to a full solution grid (~1.3 billion on average).
- Write the stub(.), csz(.) and ext(.) tables to a file for later reference.
- Select a random class with Pr(class k) = csz(k) * ext(k) / 6670903752021072936960.
- Select a random solution number, n in {1,2,...,ext(k)}.
- Enumerate solutions to stub(k), stopping at the nth (call this solution G).
- Apply a random digit/band/stack/row/column/transpose permutation to G, and report the result.
- Let Solns(s) be the solutions to stub s. Then for each stub s in class k, Solns(s) is isomorphic to Solns(stub(k)). All Solns(.) are distinct, so there are csz(k) * ext(k) solution grids represented by class k. (Thus, class k should be picked with probability proportional to that.)
- Since the permutation (group op) in the last step is chosen flat randomly, it can be preceded by any other group ops (chosen randomly or not) without changing the average effect. In particular, we could apply a random permutation of {1,2,3} and bands/stacks/rows/cols/transpo on stub(k) prior to enumerating solutions. This has the effect of choosing a random stub in class k without bias which, coupled with the solution-picking step, results in unbiased selection of a solution grid from the class as a whole.
Finally, an application. We'd like to know how many proper (solvable, not necessarily minimal) sudoku puzzles exist. To answer this, we generate grids in an unbiased manner and, for each one, try to solve a random subgrid (that is, clear each cell independently with probability .5 then ask if what's left has a unique solution). Let P be the probability that a puzzle generated this way is proper; then the number of proper puzzles in total is P * 6.67e21 * 2^81. In 178100 trials, I found 65325 proper puzzles, so a 95% confidence interval for the total number of proper puzzles is [ 5.88e45, 5.95e45 ] to 3 s.f.