Su-Doku's maths

Everything about Sudoku that doesn't fit in one of the other sections

b(n,m) reversed?

Postby LarryLACa » Fri Dec 16, 2005 6:31 am

In the 12/12 post Re: the N(n,m) usage for the 4x3 grid estimate:
kjellfp wrote: The numbers given in October stem from the discussion in http://forum.enjoysudoku.com/viewtopic.php?t=44&start=412

For 3xN-sudoku, the formula just suggested should give

N(Nx3) = b(N,3)^3 * b(3,N)^N / (3N)!^(3N)

We can express b(3,N) as follows: Let M be the number of Nx3 box column configurations that "fits" with a fixed one (fits in the sence that column k of the two boxes have no elements in common for every k). For a stack of Nx3 boxes

the description talks about stacks. 3 columns, 3 boxes in the stack.. which applies to b(n,3) not b(3,n). Of course this has no impact on the validity.. A little latter we have:
kjellfp wrote:which is what I used in October.
When I first read this, I thought which meant the b(4,3) band N value was used, but the algorithm was different, and maybe the estimate would be (slightly) different. Amazingly, although the October 10/10/05 derivation was based only on 4x3 boxes, used a different algorithm and different rationale, the result is algebraiclly equivalent to the N(nxm) expression. Definitely a right method.:idea:
Code: Select all
The N 10/10 extimate with Nx3 boxes was:
  N(Nx3)= R^3 * M^N * N!^(6N) / (3N)!^(2N)

where
R = b(4,3), N=4, M= sum k=0..N (N ch k)^3

Using the b(3,C) band N expression for C=N=4 gives
b(3,4) = 3N! N!^6 sum k=0..N (N ch k)^3
  = 3N! N!^6 M, so

N(Nx3) = b(4,3)^3 * M^N N!^6^N / 3N!^2N
     = b(4,3)^3 * b(3,4)^4 / (3N!^2N * 3N!^N)
     = b(4,3)^3 * b(3,4)^4 / (3*4)!^(3*4)
     = N(n,m)

So, coming from 2 different approaches/rationales (tinfoil) and kjell 10/10, we have estimate expessions equivalent to N(nxm). This looks very solid.

There may be a third parallel to be drawn using the Chi^2 statistic based on the cross product of marginal distributions, but this is (more than) enough hand waving for now.

Larry.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

Re: b(n,m) reversed?

Postby kjellfp » Fri Dec 16, 2005 9:39 am

LarryLACa wrote:
kjellfp wrote:We can express b(3,N) as follows: Let M be the number of Nx3 box column configurations that "fits" with a fixed one (fits in the sence that column k of the two boxes have no elements in common for every k). For a stack of Nx3 boxes

the description talks about stacks. 3 columns, 3 boxes in the stack.. which applies to b(n,3) not b(3,n).

b(3,n) is the number of bands with 3 rows, n columns in a box and 3 boxes in the band. It is by symmetry identical to the number of stacks with 3 columns, n rows in a box and 3 boxes in the stack.

b(n,3) is the number of bands with n rows, 3 columns in a box and n boxes in the band. It is by symmetry identical to the number of stacks with n columns, 3 rows in a box and n boxes in the stack.
kjellfp
 
Posts: 140
Joined: 04 October 2005

Col Config Completion equivalence : Counting Symmetry

Postby LarryLACa » Sun Dec 18, 2005 11:30 pm

(Excuse the length. This started as question, but I later (thought I had) found a (partial) answer. Ooops, there are still problems in the Counting Equivalence usage here.:( See later comments. I'll clean this up later.)

While working through the rationale for the T-Class derivation of the 6.7e21 number of solutions, I didn’t see how to bridge from permutation symmetry to counting symmetry. In our (me, Ed, Frazer) 11/26/0 posts on permutation/counting symmetry we describe how minicol configuration, i.e. counting symmetry, is sufficient for transmitting constaints between bands. E.g. this example band config constrains the lower band by the its minicol configuration, as used to establish the counting symmetry equivalence class.
Code: Select all
Example 1             minicol config1
123 479 568  --->     123 136 134 = gang-  124 357 689 125 367 489     
456 238 179  counting 456 258 268   of-44
789 156 234  symmetry 789 479 579   #1
The following example belongs to the same equivalence class as the above examples (using "gangster" within box minirow permutations), but maps to a different minicolumn config, where only the box and column permutations for boxes 2,3 are are applied
Code: Select all
Example 2             box2,3 perms       minicol config2
123 749 586   ---->   123 479 568  ----> 123 123 421
456 283 917           456 823 971        456 456 563
789 516 432           789 156 423        789 879 978
Why do the two minicol configurations each have the same number of band2,3 completions (CCC=Column Configuration Completions)? In the 11/26 discussion this was still an open question. Why do the box2,3 minirow permutations in rows 2&3, needed to map example 2 to example 1, not effect the number of completions?

That the two minicol configurations have the same number of completions is well demonstrated by Ed’s gang-of-44 confirmation of the 6.7e21 count. (and the T-Class confirmation). This can’t be a coincidence.

It could however be a ‘special’ property of the 3x3 grid or possibly the Nx3 grid. Our estimating and enumeration techniques rely heavily on CCC equivalence. If the CCC equivalence only holds for the 3x3 case, then ...

Formally (extending Guenter's T-Class usage):
Code: Select all
Let S be the collection of the subsets of {1,2,3,..,(R*C)} of size R.
Let T be the collection of (R*C)-tupels (S1,..,SRC) from S such that
 S1 u S2 u.. SC=SC+1 u SC+2 u.. S2C =...=S(R-1)*C+1 u.. SRC =
  {1,2,3,4,5,6,.. (R*C)} ,  where "u" is set-union.
Let Si represent the minicolumns of an RxC box band configuration (with R boxes)
Let 2 members from T be equivalent if one can be obtained
from the other by one of the C!^(R+1) * (R*C)!  minicol/box/label operations.

Let bi be a valid Sudoku (partial) solution for ti in T.
Let bj be another valid (partial) solution obtained from bi by permuting the minirow values within each box.
Let tj be the column configuration for bj.
How can we show that number of full grid solutions for ti: N(ti) = N(tj), for all i,j for Sudoku with RxC boxes? This may be sufficient:

Let B be the set of all solutions compatible with a given band configuration.
For a solution Bi, given a top box in a stack, consider the effect of exchanging two values (AB) in a (arbitrary size) box minirow:
Code: Select all
from     to
ABxx..   BAxx..
xxxx..   xxxx..
......   ......

To preserve the OneRule, propagate impact down the two respective columns: exchange the AB in each column in a lower box. For each value, in its lower box, also exchange other value in the box, to preserve the OneRule for the box. E.g. here for col 1 A lower box B->A, in some lower box
Code: Select all
from     to
xxxA..  xxxB..
Bxxx..  Axxx..
......  ......
Repeat the propagation until either all value pairs (in each box/row/col unit) are exchanged or until the propagation chain ends.
Thus every solution in B can be mapped to a solutin in B’, the set of solutions for exchanged values.
|B| = |B’|, since |B| is invariant under relabeling. That the process can be completed without conflict is an application of the ‘conjugate chain coloring rule’ described by Angus Johonson - Solving with colors (and elsewhere under other names...)

(This deriviation is not complete.:( It does not limit the initial pair-wise exchange to a box minirow, which is essential. The row-wise limitation constrains the (initial) impacts to the box and it's stack columns', thus preserving the completion count invariance down the stack(s). )
Last edited by LarryLACa on Tue Dec 20, 2005 1:49 pm, edited 1 time in total.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

Re: Col Config Completion equivalence : Counting Symmetry

Postby Red Ed » Mon Dec 19, 2005 8:51 pm

LarryLACa wrote:Why do the two minicol configurations each have the same number of band2,3 completions (CCC=Column Configuration Completions)? In the 11/26 discussion this was still an open question. Why do the box2,3 minirow permutations in rows 2&3, needed to map example 2 to example 1, not effect the number of completions?
It's not always true that swapping a pair of values within a minirow of a band preserves the full grid completion count. For example, any single swap in this band will always mess up the full grid completion count:
Code: Select all
1 2 3   4 5 6   7 8 9
4 5 6   7 8 9   1 2 3
7 8 9   1 2 3   4 5 6
Your argument breaks down when the propagation chains you were considering spread through the grid and back up into the top band, thereby messing up more than just the initial A<->B swap in that band. For example, if you swap 8,9 in the first box above, it'll always trigger a propagation chain that flips the 8,9 pairs in boxes 2 and 3 as well.

btw, your use of counting vs. permutation equivalence is inconsistent with that which I originally proposed. Two bands are counting equivalent iff they have the same completion count: the definition is simply about counts, not transformations. On the other hand, two bands are permutation equivalent iff you can map from one to the other by the "usual" set of manipulations. The fact that P.E. => C.E. is clear; whereas the fact that, for 3x3 Sudoku, C.E. => P.E. is just as you would expect for large 'random' numbers (as Frazer pointed out).
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: Col Config Completion equivalence : Counting Symmetry

Postby LarryLACa » Tue Dec 20, 2005 7:48 am

Red Ed wrote:btw, your use of counting vs. permutation equivalence is inconsistent with that which I originally proposed. Two bands are counting equivalent iff they have the same completion count: the definition is simply about counts, not transformations. On the other hand, two bands are permutation equivalent iff you can map from one to the other by the "usual" set of manipulations. The fact that P.E. => C.E. is clear; whereas the fact that, for 3x3 Sudoku, C.E. => P.E. is just as you would expect for large 'random' numbers (as Frazer pointed out).


Ed is correct. My usage here is inconsistent, to say the least. I thought the mapping from box column configuration to the minimal gang of 44 column configurations could be accomplished using the transformations like the ones I described. As Ed points out, you have to work from the permutations that preserve the completion counts, which is NOT the set I described.

Thanks for the more than kind reply.

Larry

-edit-
P.S. I still have issues with the CE usage (where the emphasis is on the I have) and will reformulate the above after closer consideration.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

Column Configuration Counting Equivalence

Postby LarryLACa » Thu Dec 22, 2005 10:16 am

The following reveals nothing new, but tries to clarify what I wanted to nail down last time. It highlights some useful manipulations, properties of column configuration counting equivalence, and how-to trap gangsters (as others have done before). I began by asking.. (Most edits in green or () )

What band permutations (perms within a band) preserve the number of full grid completions ( NC(band) )? How does a particular band column configuration (CC) relate to it's most general (gangster) equivalence class? [By CC for a particular band I mean the canonical CC associated with that band using a canonical box B1 numbering and permutations applied to the (remaining) column-cell, column, and box (2,3) ordering to achieve the minimal CC ID in our usual fashion.]

The short answer is (hindsight is wonderful:) ) Two (canonical) band CCs will have the same NC(band) if one can be permuted to the other using: relabeling (9!) and permutations for column-cells (6^9), box-columns (6^3) and boxes (6), i.e. 9! 3!^13 alternatives. (numbers reflect a 3x3 box grid, the 6 row permutations are subsumed under the column-cell permutations.) The rest of the post explains why this works and how to apply it to gangster hunting.


When will two (different) band CCs have the same NC(band) completions? Obviously, any Sudoku (solution) consistent permutations (of the two bands) also preserve the column configuration (CC) constraints (w.r.t completion counts) on the (lower) band boxes and preserve NC(band) (which is (part of ) the rationale for the CC ID construction). For the 3x3 box grid (band), that’s (at least) the obvious 6^5 column, row and box permutations. (No news here, already applied in constructing the CC ID) How do the 9! relabeling permutations apply (when used with a canonical B1)? The following logic (deduced from one Ed’s first 6.7e21 confirmation programs) shows how (to match CCs with equal NC(band) ) :

Given two band column configurations with respect to a canonical box 1 (B1) numbering:
Code: Select all
Example#1             minicol config1
123 479 568  --->     123 136 134 = gang-  124 357 689 125 367 489
456 238 179  counting 456 258 268   of-44
789 156 234  symmetry 789 479 579   #1

Example#2             minicol config2
some         --->     123 193 194   CCID   148 236 579 158 246 379
 other       counting 456 852 832
  band ...   symmetry 789 476 576
Begin by looking for permutations of box B1 that preserve the B1 CC constraints on the lower bands. These include the within column, cell permutations and the column permutations. (Rotating the columns in stack1, rotates the columns in each of the stack boxes, but the net effect on other boxes in each band zero: the stack1 minrow configuration constraints stay the same.)

For each such box B1 CC consistent permutation, relabel band1 to use this numbering scheme as a ‘new’ canonical numbering. Identify the original configuration as B1, B2, B3 and the band CC as DCC (D for banD), and the relabeled ones as B1’, DCC’… Thus for the N full grid completions:
N(DCC) w.r.t. B1 = N(DCC’) w.r.t. B1’

However since the B1’ relabeling was chosen to preserve N w.r.t. B1, it must also be the case that
N(DCC) w.r.t B1 = N(DCC) w.r.t B1’ = N(DCC’) w.r.t. B1’

So, any relabeling of the original B2, B3 that preserves the B1 CC constraints, will generate a band CC (DCC’) with the same number of completions. To demonstrate, Example#1 can be mapped to Example#2 using this renumbering:
Code: Select all
renum with  Example#1   minicol config2
123   189      --->     189 193 194   DCCID  148 236 579 158 246 379
456   453   Example#2   453 852 832   w.r.t
789   726               726 476 576   B1'
Hence any Band1 solution (using canonical B1 numbering) with the Example2 CCID (DCC’), will have the same number of grid completions as Example#1 and belongs in the same counting equivalence class (gang-of-44 #1, which is what Ed did in June 05)

The same logic for CC permuting and relabeling applies similarly to B2, B3 as well. Searching for all matches (of a given CC) over B1..B3 (relabelings) generates a gangster equivalence class.

We’ve noted before that band (solution) permutation equivalence (P.E.) ==> counting equivalence (C.E.).
{Wrong -ignore: Using the above logic, we can now state the inverse relationship: C.E. ==> P.E. iff a column configuration (CC) consistent relabeling for one of the boxes can be found, as described above.

I.e. two band column configurations (DCC), will have the same number of grid completions, and their two sets of band row configurations (the band rows for the DCC) will be permutation equivalent iff a box CC consistent relabeling can be found. Unfortunately, since the ‘subject to’ condition is so large, the statement really doesn’t say much, other than to identify where the limitation lies. --end ignore}

[Further edit: conform to Ed's P.E. definition which includes the within minicol cell permutations used to construct a CC ID. I had origininally (mis)used P.E. to designate only solution preserving permutations. Conform to Ed's C.E. definition which includes 'coincidental' same completion cunts. I had orginally C.E. as band configurations which must have the same completion count, i.e. synonomous with P.E.]


Thus, each reduced counting symmetry equivalence class (gangster) ia a superset of the (36288 band solution) equivalence classes that share (via P.E.) a common CC (constraint on the lower bands) and have the same NC(band). Two band solutions from a gangster class are solution equivalent iff they belong to the same (36288 band solution) based subset equivalence class.

We can also state another P.E. relationship: Two band CCs, with canonical B1, will have the same NC(band) if a B1 counting equivalent relabeling (B1 column-cell and column permutations) can be found, that maps the CC for the remaining B2,B3 to each other. This can be generalized to allow any box to be chosen as the canonical labeling box. (Yet another statement of the initial short answer.)


Larry[/i]
Last edited by LarryLACa on Tue Jan 03, 2006 9:09 pm, edited 3 times in total.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

Re: Column Configuration Counting Equivalence

Postby Red Ed » Thu Dec 22, 2005 10:38 pm

You're doing it again, Larry! Your use of permutation equivalence and counting equivalence is inconsistent with that originally proposed.

Here are the concepts being discussed:
    (A) Band X is equivalent to band Y iff they give rise to the same number of grid completions. This results in 44 equivalence classes.
    (B) Band X is equivalent to band Y iff you can get from one to t'other by relabelling and/or the 3!^5 ops of permuting rows (3!), boxes (3!) and columns (3!^3). This results in 416 equivalence classes.
    (C) Band X is equivalent to band Y iff you can get from one to t'other by relabelling and/or the 3!^14 ops of permuting rows (3!), boxes (3!), columns (3!^3) and cells within columns (3!^9). This results in the same 44 equivalence classes as (A) above.
I'm calling (A) Counting Equivalence and (C) Permutation Equivalence. You appear to be calling (C) Counting Equivalence and (B) Permutation Equivalence, which obviously means we're talking at cross purposes most of the time. Can't we just standardise on my original definitions? If you want a name for (B) then why not call it Frazer equivalence, as it was his idea to take advantage of symmetries in the first place.

The equivalences are all related to each other as follows:
    (A) <=> (C), i.e. Counting equiv is the same as Permutation equiv. The <= part is obvious; the => part just happens to be true for 3x3 sudoku "at random" (and entirely unsurprisingly, as Frazer pointed out).
    (B) => (A), i.e. Frazer equiv implies Counting equiv. Obvious.
    (C) sometimes=> (B), i.e. Permutation equiv sometimes implies Frazer equiv. As you explained in your post. Just because two bands are related by the big set of 9! x 3!^14 ops doesn't mean they are necessarily related by the smaller set of 9! x 3!^5, except in special circumstances.
I hope this clears things up.
Red Ed
 
Posts: 633
Joined: 06 June 2005

P.E. C.E. clarity

Postby LarryLACa » Fri Dec 23, 2005 11:18 am

Ed wrote:I hope this clears things up.
[I still didn't have it right.
Edit: Replace /P.E./B.S.E/, where B.S.E.:: band solution equivalent w.r.t. the usual 9! 6^5 column, row and box band ops]


It did. I (think) I was only off in the conclusion. See my edits above. The body stayed (mostly) the same. I've noted the superset relationship between the gangster solutions and the (was P.E.) B.S.E. equivalence class solutions.

I've dropped the P.E. permutation # down to 9! 3!^13, since row perms are redundant w.r.t to the 6^9 column-cell perms.

I think we could have avoided some of this ping-pong by calling P.E. 'counting permutation equivalence' or some such. Both C.E., P.E. and B.S.E. rely on permutations and the term overuse (P for perms) makes it messy.

Of course, if we regard the pattern-based pair-swaps that reduced the number of solution equivalence classes from 416 (was P.E.) B.S.E. classes to 71 C.E. classes, as a type of pattern-based solution permutation, then the distinction between (was P.E.) B.S.E. and C.E. becomes blurred again. With the addition of pattern-based solution symmetry permutations, all solutions within a gangster class may be related to each other, even across (was P.E.) B.S.E. class boundaries. The fact that two (was P.E.) B.S.E. classes within a P.E. class have the same number of NC(band) completions must come from an inherent type of equivalence, but the pattern relating the equivalence is very hard to see or work with, so instead we work with the C.E. permutations.

Larry
Last edited by LarryLACa on Tue Jan 03, 2006 9:52 pm, edited 1 time in total.
LarryLACa
 
Posts: 32
Joined: 24 August 2005

Re: P.E. C.E. clarity

Postby Red Ed » Wed Dec 28, 2005 9:26 pm

LarryLACa wrote:I think we could have avoided some of this ping-pong by calling P.E. 'solution equivalence' or some such. Both C.E. and (S.)P.E rely on permutations and the term overuse (for perms) makes it messy.
Aaaaaaaaarrrrrrrggggghhhh! How can I make this any plainer? The definition of Counting Equivalence has absolutely nothing whatsoever to do with permutations: it is simply about the number of grid completions. Repeat after me. Nothing to do with permutations. I said NOTHING TO DO WITH PERMUTATIONS. Sorry, but this is really frustrating.

Computer programs never work directly with counting equivalence. Instead, they use what you might call structural equivalences, based on various completion-count-preserving transformations such as box permutations etc. The structural equivalences range from the trivial (fixing B1, taking you from 948109639680 bands to 2612736 equivalence classes) to the more complicated definition of permutation equivalence that I have advocated which results in 44 equiv classes. In between these two extremes, people have used equivalences resulting in 36288, 2051, 416, 306, 174 or 71 classes. You can try to give these all different names if you want, but I recommend reserving plain "permutation equivalence" for the comprehensive set of ops that gets you down to 44 classes.

Now ... let's take this further by defining permutation equivalence of RxC bands (R adjacent RxC boxes). In that case, I'd define permutation equivalence using the basic set of (RC)! x C!^R x R!^(RC+1) ops consisting of symbol relabelling, box switching, within-box column switching and within-column cell switching. Two bands are then permutation equivalent iff you can get from one to the other by a sequence of these ops. With this definition, permutation equivalence implies counting equivalence. It seems likely that for any particular (R,C), the converse is also true ... but no-one knows how to prove that without actually grinding through the whole grid-counting process.
Red Ed
 
Posts: 633
Joined: 06 June 2005

5x4 Band 1

Postby PatmaxDaddy » Mon Jan 02, 2006 1:10 pm

I'm running a program to count band 1 for a 5x4 grid. Combine this with Red Ed's 4x5 count (which I've also done and confirmed) using Kjell's generalization of my formula and we'll get a good estimate for the whole 5x4 grid.

I expect my 5x4 program to complete in about a week, maybe sooner. I'm sure it could be made more efficient, but for now I'm just letting it run. I'll post details when I get a chance.

I wonder if 5x5 band 1 can be done in a reasonable amount of time. I started trying to do that, but gave up and settled for 5x4. But it may be that more improvements in the 5x methods might bring 5x5 within reach.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Re: 5x4 Band 1

Postby Red Ed » Mon Jan 02, 2006 7:02 pm

PatmaxDaddy wrote:I'm running a program to count band 1 for a 5x4 grid.
Great! How about 5x3 while you're at it, which is another gap in the list (on Wikipedia).

PatmaxDaddy wrote:Combine this ... using Kjell's generalization of my formula
On Wikipedia, I've credited the estimation formula to Kevin Kilfoil, who proposed it originally for the 3x3 case. IMHO the leap was to conjecture, accurately, that row and column constraints are very nearly conditionally independent given box constraints; rather than to then apply this heuristic to cases other than 3x3.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: 5x4 Band 1

Postby kjellfp » Mon Jan 02, 2006 9:03 pm

PatmaxDaddy wrote:I'm running a program to count band 1 for a 5x4 grid.


Good! I think I have an algorithm for 5xC I want to implement. I thought it could be used for several values of C, and in theory it can, but after the Sudoku-experience I've gained, I know that things can go fast for N but take eternities for N+1. (For example, compare 3x3-grid counting to 4x3).

The algorithm works in theory for 6xC, 7xC etc. But it will quickly start getting to slow...
kjellfp
 
Posts: 140
Joined: 04 October 2005

5xC band 1 results

Postby PatmaxDaddy » Thu Jan 05, 2006 4:57 am

I have some 5xC band 1 results.

Code: Select all
5x3: 15! * (3!)^20 * 324,408,987,992,064
   = 1,551,020,353,525,523,622,687,719,797,507,334,602,752,000
   (1.5510e42)
   running time 92 seconds
5x4: 20! * (4!)^20 * 518,910,423,730,214,314,176
   = 5,075,067,768,812,480,461,469,472,014,698,208,829,322,375,
       837,511,588,659,215,728,640,000
   (5.0751e66)
   running time 68 hours

[Edited by request to add line breaks for easier display on smaller screens]

(Regarding the grid estimation formula, I agree that Kevin Kilfoil had the primary insight, but perhaps we might call it the KSP formula (Kilfoil/Silver/Pettersen) after those that have contributed to its development?)

Plugging the 5x4 result and the previously determined 4x5 result (which I've also calculated and confirmed) into the KSP formula gives an estimate for a 5x4 grid of 3.1764e175.

I don't have the 3x5 band 1 count handy, so I haven't calculated the KSP estimate for a 5x3 gird. I'll do that at some point, but anyone should feel free to do it.

I have some confidence in these results, but getting the 5x4 running time down to only 68 hours requred several algorithmic and programming tricks, so I would call this tentative for now. Confirmation from an independent method or implementation would be ideal, but I also have a variety of checks that I intend to perform in the near future (and I've already run several kinds of checks).

I'll post details on the methods I used as soon as I can.
Last edited by PatmaxDaddy on Fri Jan 06, 2006 12:22 am, edited 1 time in total.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

5x3 grid estimate

Postby PatmaxDaddy » Thu Jan 05, 2006 12:29 pm

If I've done Kjell's 3xC formula right, 3x5 band 1 is 15! * (5!)^6 * 2252, so the KSP estimate for a full 5x3 grid is 3.5086e84.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

Band 1 counting methods

Postby PatmaxDaddy » Thu Jan 05, 2006 10:46 pm

Band 1 counting for 3xC grids is very easy, because box 1 is fixed in canonical form, and box 3 is uniquely determined (up to row permutations) by boxes 1 and 2. The count reduces to the simple formula that Kjell posted, which specifies a computation that is simple enough to be done by hand.

RxC for R >= 4 is much harder. Kjell's 4xC formula specifies the count but not an efficient way to compute it. The key, as usual, is to find and exploit equivalence classes. For my 4xC and 5xC counts, I find a set of equivalence classes (gangsters) for box 2 where box 2 permutations are in the same class if the "structure" of the mapping from box 1 rows to box 2 rows is the same. This structure is represented by a directed, labelled graph (DLG) containing R nodes.

This can best be explained by example. Here are representative members of two gangsters from the 4x4 case:
Code: Select all
 Box 1         Box 2                           Box 2
0 1 2 3       C D E F                         C D E F
4 5 6 7       8 9 A B                         8 9 A B
8 9 A B       3 5 6 7                         1 2 3 7
C D E F       0 1 2 4                         0 4 5 6

         ___     4     ___               ___     4     ___
        /   \-------->/   \             /   \-------->/   \
        | 1 |         | 4 |             | 1 |         | 4 |
        \___/<--------\___/             \___/<--------\___/
          ^      3      |                 ^      1      |
          |             |                 |             |
          |1            |1                |3            |3
         _|_     3     _V_               _|_     1     _V_
        /   \-------->/   \             /   \-------->/   \
        | 3 |         | 2 |             | 3 |         | 2 |
        \___/<--------\___/             \___/<--------\___/
                 4                               4

Node "4" of the left DLG, for example, has an arc labelled "3" directed to node "1" that specifies that 3 symbols in box 2 row 4 came from box 1 row 1. Box 2 permutations are in the same class if their DLGs are isomorphic (identical topology and arc labels, but ignoring node numbers). Note that the above two DLGs are not isomorphic. The topology is the same, but the arcs labels are different, and indeed they yield different box 3 counts.

I have a lot of confidence in this method, and in the specific implementation I've coded, because it gives the right results for the 4x4 and 4x5 cases, which have been confirmed independently (I also ran the 4x4 case without the equivalence classes and got the same result).

There may be a simpler way of defining the box 2 equivalence classes, but if so I don't know what it would be.

Here are the number of box 2 gangsters for various cases:

Code: Select all
4x4:   28
4x5:   53
5x3:   89
5x4:  618
5x5: 3087


Using the gangsters, my 4x4 running time is 32 ms, and my 4x5 time is 830 ms. These could be improved significantly using methods I developed for the 5xC cases, but I didn't see any point in doing so.

The 5x3 count can be done quickly without further equivalence classes, but 5x4 was reduced from what would have been about one month to under three days by finding some simple equivalences for box 3. For any box 2 gangster, any arc in the DLG with a label >= 2 represents multiple symbols that moved together from one row of box 1 to another row of box 2. These symbols are interchangable in box 3 without affecting the box 4 count. Those that happen to be in the same row in box 3 do not yield a different case and so do not offer any help, but those that end up in different rows of box 3 can be counted just for a canonical configuration.

For example, consider the symbols 0, 1, and 2 for the left DLG above. These symbols can be interchanged in box 3. In permutations where they are all in the same box 3 row, there is no benefit. If two are in one row and one in another, there are three equivalent permutations. If all three are in different rows, there are six equivalent permutations. These equivalences can be used independently for all of the DLG arcs with labels >= 2, resulting in a huge advantage in many cases. For the 5x4 count, the one gangster with only two arcs whose labels are "2" took 4015 seconds, but the best case, with many independent equivalences, took only 7 seconds. There is also one gangster where all arcs have label "1", and in that case I use the box 3 / box 4 swapping equivalence to reduce the time by a factor of 2. On average each gangster took about 400 seconds.

I ran the 5x3 case with and without the box 3 equivalences and got the same result, so I trust the method. I also ran one 5x4 gangster without the box 3 equivalences and got the same result. Since the same source code is used for all of the 5xC band 1 counting, the 5x3 checks provide some confidence that the 5x4 results are correct. There are more such checks that I need to run, however.

I am convinced that much better box 3 equivalences can be found, which might reduce the 5x4 case to hours instead of days. But I suspect that even such improvements would not bring the 5x5 band 1 count within reach. We'd need a qualitatively different method.

I also used quite a few programming tricks to get the 5x4 count within reach. The first version of the code I wrote (no box 3 equivalences, very simple loops) would have needed 60 hours to do one gangster if I had let it run all the way. I sped it up to 3.6 hours for one gangster before letting it run all the way, and then made sure I got the same result as I sped it up further. I used permutation tables, some assembly language that takes advantage of the Pentium's bit scan instruction, and any other tricks I could think of.
PatmaxDaddy
 
Posts: 63
Joined: 18 October 2005

PreviousNext

Return to General