- Every row has 12 different symbols
- Every column has 12 different symbols
- Every base 4x3-box has 12 different symbols
A base 4x3-box is one of the natural disjoint 12 boxes of size 4x3 we can divide the grid into.
(The number of 3x4 sudoku grids is of course the same problem)
Below follows a brief summary on my contribution so far, together with some thoughts from dukuso and PatmaxDaddy.
In principle, the 4x3-counting is identical to the 3x3-counting. The process has two parts:
1. Precalcualtion on the 4x3-gangsters (bands of size 4x12). We need to find the size of each equivalence class, the numbers of bands for each gangster, and how this can be used to find the number of bands from any column configuration.
2. Counting. The number of loops is (#gangsters) * 346/2 * 346 * 346 * 346 (346 is the number of 4x3 box column configuration not in conflict with a fix configuration, just like the number 56 for 3x3-counting).
I dont know about the first part, but I hope it is possible in a few hours. The second will take much more time, maybe a year or two (at least 83 days on my PC if the inner loop had as few CPU clock ticks as when doing the 3x3-counting, but for this purpose, we can not store everything on the machine stack). The results from the first part should be stored to files, while the final counting should read in these files and produce regular output files, so it can pick up the thread on a rerun after the program stops. The final counting should of course also involve a task list on which gangsters to do the counting for, if we are splitting the work into several computers.
Memory is a problem here, so we must use reductions that we didnt need for 3x3-counting. First, notice that given a pair (B1,B2) of 4x3 boxes, they are configuration equivalent to a pair (Id,B2) where Id is the configuration {1,2,3,4},{5,6,7,8},{9,10,11,12} and B2 can be one out of 9 base boxes. We can recognize which of the base boxes to be used out of a 3x3-matrix where position (i,j) says how many elements column i in B1 and column j in B2 have in common. The 9 matrices (up to row permutation and column permutation) are
- Code: Select all
#1: 400 #2: 400 #3: 400
040 031 022
004 013 022
#4: 310 #5: 310 #6: 310
103 121 112
031 013 022
#7: 220 #8: 211 #9: 112
202 121 112
022 112 220
So given a band, to use it in lookup tables to find number of solutions etc, we transform it into a band configuration Id|B2|B3|B4 where B2 is one among the 9 base box configurations. There are then 5775 choices for B3 and B4 (just as there were 280 choices in the 3x3-counting). Totally we have 9x5775x5775 different band configurations to handle, this is about 300.000.000. That requires 300 MB for every byte we want to assign to each band.
So what do we want to assign? What we actually need it for, is to find the number of solutions for a configuration. We could store the band gangster index, but that requires at least 17 bits of memory (depends on the number of gangsters, which I only know must be more than 96752). We could store the number of solutions directly, but that will be 7962624 in the worst case, and 65719 in average. General cases might be trapped between a higher and lower limit whose difference is less than 2^16, and so the special cases could be handled differently. This is the most direct method with respect to looking up the values.
PatmaxDaddy had a nice remark to my thoughts. Though there are about 100.000 gangsters, the different unique numbers of solutions for a gangster could be much smaller (for 3x3-counting, there are 44 gangsters, but only 16 different numbers occur as the number solutions for a configuration), and so probably much less than 16 bits of information is required for each configuration.
dukoso has suggested to do part 1 first, and then think. I agree.
To lower the risk of ending up with a wrong number, we should rewrite the program we end up with to do the 3x3-counting first (but we introduce 'problems' on the numbers used, to have the difficulties on memory as for 4x3-counting).