6x2 counting

Everything about Sudoku that doesn't fit in one of the other sections

6x2 counting

Postby kjellfp » Thu Sep 14, 2006 7:14 am

From the endless thread last year :

kjellfp wrote:Counting the 4x2-sudokus took about a second. Counting the 5x2-sudokus took ~10 hours.

6x2, anybody?:)

Seems like I have become Mr anybody.

The 5x2 code was hopeless and slow, and can be rewritten to do the job in seconds maybe.

However, I have now started the 6x2-job. Or at least started analyzing it and writing the code, but time and progression is small.

The 6x2 counting, like the previous Nx2 countings, is done by identifying equivalence classes of 6x2 band configurations, then sum up E(C)*B(C)^2 for each equivalence class C, where E(C) is size of class and B(C) is number of bands for one representative of each class.

Finding classes and their sizes seams trivial (but havent been done yet), probably done in some hours. The hard job is doing the band count, but by splitting up the bands, I have estimates saying that the counting can be done in 4-5 weeks (running time). That's faster than 4x3 counting, which surprises me a bit.

I'll update this thread when I do some prgress.
Posts: 140
Joined: 04 October 2005

Postby kjellfp » Thu Sep 21, 2006 9:08 pm

Probably very few, if any, care about this, but still I try to give an overview on the algorithm used.

As said, finding classes of band configurations, and counting their sizes isn't the hardest part. It's already done, and took about one hour, giving 63199 classes. (A band configuration is a specification of the 6 elements in each minicolumn, but not their internal positions)

Counting the number of bands for each configuration is the real job. Doing it the direct way, by going through all permutations of the elements in each minicolumn, and count when they all fit with the row constraint, is to slow. Instead I look at all ways to split up a band configuration in two subbands of 4 and 2 rows each, where every symbol is twice in the 2-row part, and 4 times in the 4-row part, and then multiply the number of bands for each 2-row and 4-row configuration. An example:

The configuration
Code: Select all

can be split in many ways, like
Code: Select all
+--+--+--+--+--+--+              +--+--+--+--+--+--+
|23|32|15|12|13|14|              |13|12|35|12|13|62|
|95|74|36|45|26|65|              |95|35|48|49|26|74|
|B8|9B|78|67|4B|7A|              |B7|68|79|6A|4A|8B|
|CA|AC|AB|89|9C|8C|      or      |CA|7C|AC|8B|5B|9C|
+--+--+--+--+--+--+              +--+--+--+--+--+--+
|14|15|29|3A|58|32|              |24|94|16|35|78|15|
|67|68|4C|CB|7A|9B|              |68|AB|2B|C7|9C|3A|
+--+--+--+--+--+--+              +--+--+--+--+--+--+

The number of 2-row subbands from a given configuration is simply 2^C where C is the number of cycles: Two numbers are in the same cycle if they are in the same minicolumn in the 2-band. In the first case, the cycles are

1-6-1 and 4-7-5-8-A-B-2-4 and 9-C-3-9, giving 2^3=8 bands,

and in the second they are

2-6-B-4-8-C-3-1-2 and 9-A-5-7-9 giving 2^2=4 bands

Counting the 4-row subbands is best done by again splitting up into subbands of size 2+2.

And instead of starting with a 6-row configuration and split it into 2+4, I start with a 4-row and glue it together with a 2-row to get a 6-row band. That is because looking up 6-row bands (61399 classes) is much easier than looking up 4-row bands (>900.000.000 classes). It's also much more direct and less time consuming to find the ways to glue than the ways to split.

Currently about 35% of the count jobs for the 4-row bands are done (estimated total of 2 weeks), while I havn't started the gluing of 4-row into 6-row (estimated total of 2-3 weeks).

The 6x2 count is the last proper NxM-sudoku count that is within reach today (proper means non-latin squares, i.e. N,M >= 2) as I see it. All proper counts on boards of size up to 12x12 are then done, the next is 7x2, which at least is beyond what I can do.
Posts: 140
Joined: 04 October 2005

Re: 6x2 counting

Postby kjellfp » Tue Oct 31, 2006 11:16 pm

I'm a bit behind schedule, as the last time has been devoted to testing against known cases, for possible bug detections.

Currently, my methods have been rewritten to count (and correctly reconfirm) the 5x2 number. As the numbers counted in the first part can be used to find the number of 4-row subbands of the 6x2 bands, this number has also been calculated more directly to confirm that my numbers so far are right. By number of 4-row subbands, I mean the number of ways to fill in the first four rows of a 6x2 band.

The same direct method can also be used to get the number of 6x2 Sudoku bands. This will also be possible to get from the final count, and will be a very argument for the correctness of the final result.

The number of 6x2 bands is 4876139207527966044188061990912000 or 4.9e+33.

From the KSP-formula (se Wikipedia), we then get the expected number of 6x2 Sudoku grids to be


The final count will complete in about two weeks.
Posts: 140
Joined: 04 October 2005

Re: 6x2 counting

Postby kjellfp » Tue Nov 14, 2006 12:31 am

I'm still a bit uncertain about whether anybody cares about this now, but anyway, the 6x2-grid counting is now complete, which is a great moment for me at least.

I ended up with the number of 6x2 Sudoku grids to be

which is 3.8e+49. For factorization, I end up with

(12! * 6! * 2^5 * 5!) * 2^24 * 3^4 * 5 * 4255797680395107022515709

From the symmetries, the number was known to be divisible by 12! (choices for first box) * 6! (operations on rows in second band) * 2^5 * 5! (column+stack operations leaving the first stack untouched), therefore I felt it was right to sort out this as a factor on its own.

I don't know for certain about the last factor, but 100000 Miller-Rabin probabilistic primality tests did not expose it as a composite, so it's most certainly a prime.

The estimate in the previous posting on 3.820501e+49 had a relativistic error of -0.238% compared to the correct answer.

I'm very sure about the number being correct. The calculated values used to get the number of grids were also used for an exact reconfirmation of the number of 6x2 Sudoku bands mentioned in the previous posting.
Posts: 140
Joined: 04 October 2005

Postby Havard » Tue Nov 14, 2006 1:00 pm

That is really fantastic work Kjell!:)
Posts: 377
Joined: 25 December 2005

Re: 6x2 counting

Postby Red Ed » Sat Nov 18, 2006 1:24 pm

kjellfp wrote:I'm still a bit uncertain about whether anybody cares about this now, but anyway, the 6x2-grid counting is now complete, which is a great moment for me at least.
Very good! I wonder if you'll be tempted to go after the as-yet-unknown number of order 12 Latin squares next ...?:)

Can you tell us how long it took? I'd like to update the Wikipedia article with that information (or you could do it yourself).

The estimate in the previous posting on 3.820501e+49 had a relativistic error of -0.238% compared to the correct answer.
Hmm, once again we have an underestimate from the KSP formula. I would guess that this is always the case, with the relative error tending to zero asymptotically (for regions of size R by floor(kR), R -> infinity, k fixed), but have no idea how to prove that. It would be a nice piece of mathematics to have tidied up if anyone out there is feeling inspired.
Red Ed
Posts: 633
Joined: 06 June 2005

Re: 6x2 counting

Postby kjellfp » Mon Nov 20, 2006 8:02 pm

Let KSP(N,M) be the expected number of NxM Sudoku grids, derived from the KSP formula, and let G(N,M) be the actual number of grids. You ask whether KSP(N,M) < G(N,M) for all N,M > 1. I see reasons for why it should hold in general, and I have an explicit proof for the case M=2 (or N=2).

For general N,M, a vague idea: Given two (NM)x(NM) grids divided into boxes of size NxM, the first obeys the row and box constraints, the other the column and box constraints. Let C(b) be the condition that the two grids are equal in a box b. The idea behind the KSP-formula is the assumption that the C(b) are independent for the different b. This is not entirely true; if C(b) holds for a (large) set of b, it seems reasonable to assume that the elements of the grids are more likely to be "forced" into the same positions for the remaining b. So it seems that the two grids are equal a little bit more than 1 out of (NM)!^(NM) times, making KSP an underestimate.

For M=2

Let C_1, ..., C_T be the column configurations of Nx2 Sudoku bands. There are Ch(2N,N) ways to choose each Nx2 box configuration (Ch=Binomial), with N boxes we have T=Ch(2N,N)^N.

Each configuration C_i is subject to B(C_i) bands. Let B_0 be the average value of the B(C_i), and define e_i such that B(C_i) = B_0+e_i. Then the sum of all e_i is 0.

The number of Nx2 bands is B_(N,2) = sum_i B(C_i) = T * B_0.
The number of 2xN bands is known to be B_(2,N) = (2N)! * N!^2. Then

Code: Select all
           B_(N,2)^2 * B_(2,N)^N
KSP(N,2) = --------------------- = T * B_0^2

For the actual number, remember that when the configuration of the upper band in a grid is given, then so is the configuration of the lower band (swap columns in each box), and they each have the same number of bands. Therefore

Code: Select all
G(N,2) = sum_i B(C_i)^2 = sum_i B_0^2 + 2*sum_i B_0*e_i + sum_i e_i^2

       = T * B_0^2 + 0 + sum_i e_i^2 = KSP(N,2) + sum_i e_i^2.

So G(N,2) >= KSP(N,2) with equality if and only if all B(C_i) are egual. If C_0 is a configuration where all boxes are equal, then B(C_0) = L^2 where L is the number of NxN Latin squares. On the other hand, let C_1 be a configuration where N-1 boxes are equal, while the last differs from the others by swapping a and b, one element from each column. Then you can show that the bands of C_1 correspond to those of C_0 where a and b are on the same row in the last box. Therefore B(C_1) = L^2/N < B(C_0), and so G(N,2) > KSP(N,2).

I have the time logs for 6x2, I'll come back to that later.
Posts: 140
Joined: 04 October 2005

Return to General