Four-Row Cover

Everything about Sudoku that doesn't fit in one of the other sections

Four-Row Cover

Postby holdout » Wed Apr 03, 2013 8:46 pm

To construct a 3-row cover, one needs 416 3-row representatives such that any "spin" and "relabel" operation on a representative is different from any "spin" and "relabel" operation on any other representative.

Any such set with this quality is a "cover". Still, it's nice to construct such sets with some order in mind. My set of 416 representatives are kept in row-minimal-form. That is, no spin-and-relabel operation will produce a smaller 27-digit number (after row concatenation).

The 416 cover representatives fall into 44 equivalence classes, where the class is defined by the number of ways the grid can be completed to form a legal finished Soduko matrix. Reference: "Enumerating possible Sudoku grids", by Felgenhauer and Jarvis; June 20, 2005.

I am attempting to construct a 4-row cover, and have reached the stage where the veracity of the results are in doubt. I would like to describe my methodology and present some results, hoping that some of you smart people
can either confirm or refute the findings. Here it goes ...

For counting purposes, there are ordering restrictions placed on digits in column-1 of rows four thru nine. They are r4 < r5 < r6 and r7 < r8 < r9 and r4 < r7.

For each 3-row representative, I tacked-on all possible fourth rows and then counted (semi-brute force) the number of completions to a full grid. There were 834148 such representatives. Regarding the number of completions, each
representative fell into one of 7739 equivalence classes.

Code: Select all
      AAAAAA BBBB    A :: 4-row representative count
           4  432    B :: Number of configurations
           6  864
          18 1296
         288 2592
         504 3888
      833328 7776
      ======
      834148


I then considered that the 834148 representatives might not be mutually exclusive, regarding spin and relabel operations. So I "spun" each one into its row-minimal-form and found that their were indeed many duplicates.
After removing duplicates, my representative set was reduced to 790171 4-row entries. There were still 7739 equivalence classes.

Code: Select all
      AAAAAA BBBB    A :: 4-row representative count
           4  432    B :: Number of configurations
           4  864
          12 1296
         185 2592
         387 3888
      789579 7776
      ======
      790171


All seemed to be going well until my final check. My experience has been that the cross product sum of configurations times completions for each representative should be a constant.

Code: Select all
   Constant = Sum { configurations[i] * completions[i] }
       = 255,322,533,620,736


However, for my set of 790171 4-row representatives, the cross product sum is 299,540,420,207,424 (about 17% bigger).

Here follows a small table comparing 1-row, 2-row, 3-row, and my probably faulted 4-row cover.

Code: Select all
Legend:
  A = Sum{ configurations[] }
  B = Sum{ completions[] }
  C = Sum{ configurations[] * completions[] }

  Set  AAAAAAAAA  BBBBBBBBBBBBBBB  CCCCCCCCCCCCCCC
   1           1  255322533620736  255322533620736
   2       12096     317247725328  255322533620736
   3     2612736      40715927744  255322533620736
   4  6141771216      38540330792  299540420207424  Wrong!

Total Number of Grids
  = 6670903752021072936960
  = (9!) * 72 * 255322533620736


I would like a little help. Can some of you smart people either confirm or refute these findings?
holdout
 
Posts: 35
Joined: 30 August 2005
Location: Bowie, Maryland USA

Re: Four-Row Cover

Postby blue » Fri Apr 05, 2013 9:59 pm

Hi holdout,

I can confirm your numbers 834148 and 790171.
I get a different breakdown by "configuration counts", though, for the 834148 number.

Where you get this:

Code: Select all
      AAAAAA BBBB    A :: 4-row representative count
           4  432    B :: Number of configurations
           6  864
          18 1296
         288 2592
         504 3888
      833328 7776
      ======
      834148

      sum of products: 6141771216     ** wrong ** -- it's 6482694816

    Edit: I see I got the wrong table here. Not sure what the implications are.
    This was the relevant table.

    Code: Select all
          AAAAAA BBBB    A :: 4-row representative count
               4  432    B :: Number of configurations
               4  864
              12 1296
             185 2592
             387 3888
          789579 7776
          ======
          790171

          sum of products: 6141771216
I get this:

Code: Select all

      AAAAAA BBBB    A :: 4-row representative count
        2016   72    B :: Number of configurations
        2000  144
        2016  216
       10008  432
        8064  648
       32064 1296
        4032 1944
        3996 2592
      210528 3888
      559424 7776
      ======
      834148

      sum of products: 5238782208

    Edit: I guess the bottom line would be that my table above, would be replaced by one where the sum of the "representative counts" is 790171, but the list of "number of configurations" values could be different -- where when multiple entries from the 834148 count are combined, thier configuration counts would be summed. The end result would be that the sum of products is still 5238782208.
The the two sums differ by the factor of ~17% that you mention.

I also get a different count for "equivalence classes" 17292 -vs- your 7739.
How are you defining equivalence classes ?

I've done things 4 different ways:
  • Using 416 canonical bands, or 44 band signatures.
  • Requiring the the r4c1 to be the first available number in column 1, or not.
I get different equivalence class count, depending on the method used:
  • 44 signatures, smallest r4c1: 88052 inputs -> 14126 classes
  • 416 bands, smallest r4c1: 834148 inputs -> 17292 classes
  • 44 signatures, unrestricted r4c1: 530528 inputs -> 17294 classes
  • 416 bands, unrestricted r4c1: 5014576 inputs -> 17294 classes
I think it's to be expected that the last two counts should match, and the first two counts could be smaller.
It doesn't seem that 7739 could be correct -- but I can't say for sure, and maybe I've missed something in defining equivalence classes.

What I'm doing to define them, is equivalent to this (in theory):
  • apply an arbitrary chute permutation
  • remap numbers in any way that satisfies: b4 contents = { 1,2,3 } in some order, b5 = { 4,5,6 }, b6 = { 7,8,9 }
    Edit (2nd): corrected very unfortunate mislabeling of boxes -- b4,b5,b6, not b3,b4,b5
  • sort the numbers in each column -- smallest at the top, disregarding the box boundary
  • permute the columns within the chutes, to give the smallest 3*4 = 12 digit number, when the columns are read off like (r1c1,r2c1,r3c1,r4c1,r1c2,r2c2,..., r4c3), and similar for the other two chutes.
  • choose as the "canonical" repersentative, the option that makes the smallest 3*12=36 digit number, when numbers from the chutes are concatenated.
Whether that's the best choice or not, when I sum over the equivalence classes, the number of configurations mapping to the class, times the number of r56789 completions, I do get the 255322533620736 number that you were expecting (or 6 times that (as expected) when there are no restrictions on r4c1).

Good luck ...

Regards,
Blue.
Last edited by blue on Mon Apr 22, 2013 1:36 pm, edited 2 times in total.
blue
 
Posts: 1059
Joined: 11 March 2013

Re: Four-Row Cover

Postby holdout » Sat Apr 06, 2013 3:31 pm

Blue, thank you for your well-written and detailed reply. It is reassuring to me that the numbers 834148 and 790171 have been seen in the context of a four-row covering set of representatives. You certainly are doing something right, as evidenced by the appearence of the magic number (255322533620736).

This is how I define an equivalence class: If multiple 4-row representatives have the same number of completions, then they are in the same equivalence class. Also, I believe, if you have 'n' representatives (mutually exclusive regarding spin-and-relabel operations) which form a "cover", then: (Sum { configurations[i] * completions[i] }, for i = 1, ... , n) = 255322533620736.

You say you've "done things 4 different ways" and you cite 88052, 834148, 530528 and 5014576 different "inputs". Certainly, these "inputs" are not-mutually exclusive. If inputs which are essentially the same are discarded, how many remain? Might you get 790171 (= 'n') in each case? Is it from these representatives alone that you can derive the magic number?

In the two cases you cite which begin with "44 signatures", I do not really understand how you progess to 4-row representatives. That's OK. I'm really most interested in the case: "416 bands, smallest r4c1: 834148 inputs -> 17292 classes".

Given a 4-row represetative, I compute 'number_of_configurations' by performing all 7776 legal row and column interchanges (row-4 doesn't get interchanged) and then relabel to make the first row 0 thru 8 (0-based clues). If the resulting 36-digit number is a minimum, I save it. I also count the number of times ('mincnt') the minimum value occurred. Finally, number_of_configurations = 7776 / mincnt.

I'm not happy that my table of configuration occurences differs so wildly with your table. Your table must be correct, and I infer that there are 5238782208 different 4-row configurations.

I have selected five 4-row representatives from my list of 790171:

Code: Select all
               First Four Rows             Configurations    Completions
   012345678 345678012 786120453 120453786        432            77752
   012345678 345678012 678120534 120786345        864            76070
   012345678 345678012 678012453 120786345       1296            52668
   012345678 346178502 785620143 120453867       3888            66696
   012345678 345678012 687120534 120487365       7776            52344

If you could provide a similar table, I'm pretty sure I could sort-out the discrepancies.

Regards,
holdout
Last edited by holdout on Sun Apr 07, 2013 5:28 pm, edited 1 time in total.
holdout
 
Posts: 35
Joined: 30 August 2005
Location: Bowie, Maryland USA

Re: Four-Row Cover

Postby holdout » Sat Apr 06, 2013 9:07 pm

Blue,

I re-read your procedure to choose a canonical representative. I suppose "b3 contents" refers to r1c3, r2c3 and r3c3. Later, you "disregard the box boundary". So row-4 gets thrown into the mix. In my procedure, row-4 stays at row-4 (but column get interchanged).

Am I correct in thinking that your canonical representatives do not have the properties of a legal 4-row Soduko fill?

I wish that my representatives be legal 4-row fills (as are the set of 416 3-row fills).

Regards,
holdout
holdout
 
Posts: 35
Joined: 30 August 2005
Location: Bowie, Maryland USA

Re: Four-Row Cover

Postby blue » Sat Apr 06, 2013 11:34 pm

Hi holdout,

A few things to cover here ...

First, I should mention that I made a mistake in my first post.
I made some edits in the post, that you can refer to.

Next, this is probably of little importance, now that I see that we defined the "configurations" number differently, but here is the sample data that you asked for:

Code: Select all
               First Four Rows             Configurations    Completions
   012345678 345678012 678012345 103254786         72            52776
   012345678 345678012 786120453 103254786        144            48012
   012345678 346782501 785016234 103254786        216            52776
   012345678 345678012 678120534 103254786        432            45556
   012345678 345678012 678021354 103254786        648            48460
   012345678 345678012 678012453 103254786       1296            50562
   012345678 345678012 678012354 103254786       1944            50038
   012345678 346078215 587126034 103254786       2592            49874
   012345678 345678012 678021435 103254786       3888            46124
   012345678 345678012 678021453 103254786       7776            48818


What I'm using for a "number of configurations", is number the ways to fill band 1 (out of 2612736), that can be transformed ("spin & relabeling") to match the first three rows in the 4-row representative.

This is how I define an equivalence class: If multiple 4-row representatives have the same number of completions, then they are in the same equivalence class.

More on this in a bit, I do get that there 7739 distinct completion counts, for the (17294) equivalence classes that I used.

holdout wrote:I re-read your procedure to choose a canonical representative. I suppose "b3 contents" refers to r1c3, r2c3 and r3c3. Later, you "disregard the box boundary". So row-4 gets thrown into the mix. In my procedure, row-4 stays at row-4 (but column get interchanged).

Am I correct in thinking that your canonical representatives do not have the properties of a legal 4-row Soduko fill?

I wish that my representatives be legal 4-row fills (as are the set of 416 3-row fills).

There's a little typo in that: "b3 contents" refer to r3c1, r3c2 and r3c3.

What I was after in defining my classes, was that any two 4-row fills, that could be transformed in a way there the would leave the same pencil marks in rows 5-9, would be in the same class. That would guarantee "before the fact", that members of the same class, have the same completion counts.

Towards that end: The relabling to ensure the specific contents in b3,b4,b5, was to force b3 to be missing the numbers 456789, b4 to be missing 123789, and b5 to be missing 123456. The bit about sorting the contents in the columns, and disregarding the box boundary, was just to get a unique representation for the (complement of the) set of numbers that are missing in the individual columns -- which is the remaining thing that's relevant (given what is done with the boxes) for determining which pencil marks would go in the empty cells.

For the 7739 different counts that actually did arise (across 17294 classes), I don't know if it's just coincidence that some classes have the same counts (?), or if it could have been known in advance -- maybe I missed something.

holdout wrote:Your table must be correct, and I infer that there are 5238782208 different 4-row configurations.

I'm not quite sure what kind of description fits that number.
I think the correct number, comes from the run with no restrictions on r4c1 contents:

Code: Select all
  AAAAAAA BBBB    A :: 4-row representative count
    12096   72    B :: Number of configurations
    12000  144
    12096  216
    60288  432
    48384  648
   192864 1296
    24192 1944
    24088 2592
  1265984 3888
  3362584 7776
  =======
  5014576

  sum of products: 31491624960

It's around 6 times the other number, as might be expected -- but it isn't exact.

holdout wrote:You say you've "done things 4 different ways" and you cite 88052, 834148, 530528 and 5014576 different "inputs". Certainly, these "inputs" are not-mutually exclusive. If inputs which are essentially the same are discarded, how many remain? Might you get 790171 (= 'n') in each case? Is it from these representatives alone that you can derive the magic number?

For the numbers 834148 and 5014576, there are 790171 and 4051084 remaining after discarding equivalents. For the other two, I don't want to calculate what the actual number are, but the smaller one, 88052, is already less than 790171.

holdout wrote:In the two cases you cite which begin with "44 signatures", I do not really understand how you progess to 4-row representatives.

I think I can cover that, along with my "number of configurations" values, in one "story".
You probably know most of this, and I apologize for boring you with details, but ...

To calculate the number of ways to fill the entire grid, having 123456789 in the top row, one slow, but bulletproof way of doing it, is to loop over the 2612736 ways to fill the top band; for each one count the number of ways to complete the rest of the grid; and then total up the counts. To speed that up, we can group the band 1 fills into classes, where we know that the completion counts will be the same, and calculate the completion counts, just once for each class -- multiplying them by the class sizes, and totaling up the products. I should mention that this argument is virtually unchanged, even if the objective is to only count the number 4-row fills.

Then ... one way to form classes, is to note that two band 1 fills, can be transformed via "spin and relabeling", to the same canonical form ... one of 416 canonical forms ... must have the same number of completions. Using that idea, for each of the 416 forms, we would want a "multipler count" ("number of configurations"), that is the number of the 2612736 ways of filling band 1, that can be transformed into the given form.

Another way to form classes -- fewer of them, with larger multiplers -- is to say that two band 1 fills are in the same class, if they can be transformed in a way where they leave the same pencil marks in the cells in r4-r9. That's the origin of the 44 "band signature" classes. Each class has a multipler, and the multiplers still sum to 2612736, just as in the case with the 416 canonical bands.

Ahh crap -- I've got to run -- more later.
Maybe this will be enough ?

Regards,
Blue.
Last edited by blue on Mon Apr 08, 2013 1:13 pm, edited 1 time in total.
blue
 
Posts: 1059
Joined: 11 March 2013

Re: Four-Row Cover

Postby holdout » Sun Apr 07, 2013 3:48 am

Blue,

Code: Select all
Ahh crap -- I've got to run -- more later.
Maybe this will be enough ?


I had to laugh when I read this. There a lot for me to review. Maybe it will be enough. I probably won't post for a least a week. You have been a big help -- and quick.

Regards,
holdout
holdout
 
Posts: 35
Joined: 30 August 2005
Location: Bowie, Maryland USA

Re: Four-Row Cover

Postby blue » Sun Apr 07, 2013 10:35 pm

Hello again, holdout,

Code: Select all
Ahh crap -- I've got to run -- more later.
Maybe this will be enough ?

I had to laugh when I read this. There a lot for me to review. Maybe it will be enough. I probably won't post for a least a week.

In hindsight, I'm laughing too ... :)

I think I've found the root of the problem.
Somehow, it's in the restriction that you placed on the r4c1 values.

I had some time to think about the relationship between the two kinds of "number of configurations" values that we were using. The're basically the same thing -- just that mine were related to the top 3 rows of the fill, and yours to all 4 rows.

I've found that I can duplicate your two tables [ "834148" and "790171" ] using your definition.
This was the relevant one (as you surmized):

Code: Select all
      AAAAAA BBBB    A :: 4-row representative count
           4  432    B :: Number of configurations
           4  864
          12 1296
         185 2592
         387 3888
      789579 7776
      ======
      790171

When I make that table for the case where there are no restrictions on r4c1, I get this:

Code: Select all
    AAAAAAA BBBB    A :: 4-row representative count
          6  432    B :: Number of configurations
          8  864
         38 1296
        397 2592
       1852 3888
    4048783 7666
    =======
    4051084

    sum of products: 31491624960

That gives the same "sum of products" value as this table from my previous post:

Code: Select all
    AAAAAAA BBBB    A :: 4-row representative count
      12096   72    B :: Number of configurations
      12000  144
      12096  216
      60288  432
      48384  648
     192864 1296
      24192 1944
      24088 2592
    1265984 3888
    3362584 7776
    =======
    5014576

    sum of products: 31491624960

Connection between the two, is that if you took the 5014576 representatives that I used, and grouped them into classes that were equivalent with respect to "spin & relabeling transformations", you would get 4051084 classes, where the the sums of the "number of configurations" values each within class, would be the value that you would calculate for any (one) representative from the class -- the one with the smallest 36-digit number, in particular.

The 4051084 count, for 4-row fills, is like the number 416 for 3-row fills -- the number of "essentially distinct" ways to fill 4 rows. The "configuration count" numbers that you used, are like the ones that I used, except that where mine were the number of 3-row fills (out of 2612736) that can be maped to a representative from one of the 416 band classes, yours are the number of 4-row fills (out of 31491624960) that can be mapped to a representative from one of the 4051084 4-row classes.

For the 4-row classes, defined that way, to use them to calculate the total number of (9-row) grids, you would loop the 4-row classes; for each one, calculate the number of completions with r5c1 < r6c1 and r7c1 < r8c1 < r9c1, using any reperesentative from the class; multiply that by the "configuration count" for the class; sum the products; and multiply the final total by 12 * (9!). The '12' in the last multiplier, is the number of ways to permute the bottom 5 rows.

Your condition on r4c1, sort of amounts to the hope that somehow you can keep roughly 1/6 of the classes -- and calculate a sum that would be multiplied by 72 * (9!) instead if 12 * (9!) ... where the 72, there, is the number of ways to permute the bottom two bands and the rows witin the bands.

There must be some merit to that idea, since I was able do the calculation in a way that did include the r4c1 condition, and still got the "magic number". Originally, it used the 834148 representatives along, with a/my partitular set of multiplers. I mentioned in the "edit" notes for my first post, that the table that went with it, could be reduced to be "about" 790171 of those representatives -- by grouping the 834148 "inputs" into classes, and totalling up the multipliers. When I actually do that, what I get for the table of multiplers, has this (rather long) summary:

Code: Select all
   AAAAAAA    BBBB    A :: 4-row representative count
         2      72    B :: Number of configurations
         6     144
         3     216
        16     288
       324     432
        27     576
       942     648
         2     720
       494     864
         2    1008
        49    1152
     15154    1296
        49    1440
       424    1728
      3763    1944
        19    2016
       210    2160
         6    2304
     10836    2592
        20    3024
         4    3456
    182469    3888
       836    5184
        36    5832
        20    6480
    574458    7776
    ======
    790171

    sum of products: 5238782208 -- same as before

Obviously those multiplers are different from the ones that you used (... "number of configurations" values) -- but the "790171 representatives" would be identical.

I'll let you stew over that ...

Cheers,
Blue.
blue
 
Posts: 1059
Joined: 11 March 2013

Re: Four-Row Cover

Postby holdout » Tue Apr 09, 2013 12:04 am

Code: Select all
    AAAAAAA BBBB    A :: 4-row representative count
          6  432    B :: Number of configurations
          8  864
         38 1296
        397 2592
       1852 3888
    4048783 7666
    =======
    4051084

    sum of products: 31491624960


The 4051084 count, for 4-row fills, is like the number 416 for 3-row fills -- the number of "essentially distinct" ways to fill 4 rows. The "configuration count" numbers that you used, are like the ones that I used, except that where mine were the number of 3-row fills (out of 2612736) that can be maped to a representative from one of the 416 band classes, yours are the number of 4-row fills (out of 31491624960) that can be mapped to a representative from one of the 4051084 4-row classes.


The 4051084 representatives appears to be 4-row cover. What value do you get when you sum the 'number of configurations' * 'number of completions' over all representatives?

Regards,
holdout
holdout
 
Posts: 35
Joined: 30 August 2005
Location: Bowie, Maryland USA

Re: Four-Row Cover

Postby blue » Tue Apr 09, 2013 4:43 am

Hi holdout,

holdout wrote:The 4051084 representatives appears to be 4-row cover. What value do you get when you sum the 'number of configurations' * 'number of completions' over all representatives?

I get 1531935201724416, which is 6 * 255322533620736.

I do use that reduction to 17264 signature cases, that I mentioned, to speed things up -- where only 17264 actual completion counts needed to be calculated.
[ That was similar to the way that 416 band representatives, could be reduced to 44 band signatures cases. ]

Added: I have another speedup that I use too. For the 44 band signature cases, I have a table listing the number of ways that a band can be filled to match the signature. After rows 5 and 6 are filled, I produce a "signature" for the empty cells in band 3. Then I have a quick way to convert that signature into one of the 44 "canonical" band signatures, and with that, I can just look up the number of ways to complete the fill to 9 rows.

Regards,
Blue.
blue
 
Posts: 1059
Joined: 11 March 2013

Re: Four-Row Cover

Postby Red Ed » Sun Apr 14, 2013 1:33 pm

Posting here because my PM didn't get through.

holdout wrote:The 416 cover representatives fall into 44 equivalence classes, where the class is defined by the number of ways the grid can be completed to form a legal finished Soduko matrix. Reference: "Enumerating possible Sudoku grids", by Felgenhauer and Jarvis; June 20, 2005.
The 44 equivalence classes were not defined by the number of completions, but rather by which of the 324 constraints (the "no repeated X in Y" axioms) they satisfied, up to isomorphism (spin & relabel). It was a happy coincidence that essentially-different constraints => different count. That coincidence does not carry over to the 4-row case.

By that methodology from 2005, the 4-row equivalence class is defined by ( {r1c1,r2c1,r3c1},r4c1, {r1c2,r2c2,r3c2},r4c2, ..., {r1c9,r2c9,r3c9},r4c9 ). Order matters in the list (...), but not in the sets {...}. For brevity, you could relabel row 4 to 123456789 and just list the nine 3-sets.

Barring mistakes, it looks like there are 121664 such classes.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Re: Four-Row Cover

Postby holdout » Wed Apr 17, 2013 7:45 pm

Red Ed -- thank you. Yes, I was fooled in the definition of "equivalence class". Having re-read the reference, I see that you were able to reduce the number of 3-row classes from 71 to 44 when identical completion values appeared in your counting program. These were not "happy coincidences", but were derived from overlooked causal reductions.

I do see that your extension of the Felgenhauer/Jarvis way of defining 4-row equivalence classes may lead (and does lead) to multiple classes having the same number of completions.

Barring mistakes, it looks like there are 121664 such classes.

Just so I get it right, exactly what number does this correspond to in the 3-row case?

From Blue's postings, I see he got the definition right. :)
holdout
 
Posts: 35
Joined: 30 August 2005
Location: Bowie, Maryland USA

Re: Four-Row Cover

Postby Red Ed » Wed Apr 17, 2013 9:37 pm

holdout wrote:I see that you were able to reduce the number of 3-row classes from 71 to 44 when identical completion values appeared in your counting program. These were not "happy coincidences", but were derived from overlooked causal reductions.
The 44 classes were derived from first principles, not from bug-fixes, and completion counts never contributed to their definition. Let me repeat: uniqueness of the completion counts was a happy coincidence.

What does 121664 correspond to in the three-row case? 44.
Red Ed
 
Posts: 633
Joined: 06 June 2005


Return to General