Hi
holdout,
A few things to cover here ...
First, I should mention that I made a mistake in my first post.
I made some edits in the post, that you can refer to.
Next, this is probably of little importance, now that I see that we defined the "configurations" number differently, but here is the sample data that you asked for:
- Code: Select all
First Four Rows Configurations Completions
012345678 345678012 678012345 103254786 72 52776
012345678 345678012 786120453 103254786 144 48012
012345678 346782501 785016234 103254786 216 52776
012345678 345678012 678120534 103254786 432 45556
012345678 345678012 678021354 103254786 648 48460
012345678 345678012 678012453 103254786 1296 50562
012345678 345678012 678012354 103254786 1944 50038
012345678 346078215 587126034 103254786 2592 49874
012345678 345678012 678021435 103254786 3888 46124
012345678 345678012 678021453 103254786 7776 48818
What I'm using for a "number of configurations", is number the ways to fill band 1 (out of 2612736), that can be transformed ("spin & relabeling") to match the first three rows in the 4-row representative.
This is how I define an equivalence class: If multiple 4-row representatives have the same number of completions, then they are in the same equivalence class.
More on this in a bit, I
do get that there 7739 distinct completion counts, for the (17294) equivalence classes that I used.
holdout wrote:I re-read your procedure to choose a canonical representative. I suppose "b3 contents" refers to r1c3, r2c3 and r3c3. Later, you "disregard the box boundary". So row-4 gets thrown into the mix. In my procedure, row-4 stays at row-4 (but column get interchanged).
Am I correct in thinking that your canonical representatives do not have the properties of a legal 4-row Soduko fill?
I wish that my representatives be legal 4-row fills (as are the set of 416 3-row fills).
There's a little typo in that: "b3 contents" refer to r3c1, r3c2 and r3c3.
What I was after in defining my classes, was that any two 4-row fills, that could be transformed in a way there the would leave the same pencil marks in rows 5-9, would be in the same class. That would guarantee "before the fact", that members of the same class, have the same completion counts.
Towards that end: The relabling to ensure the specific contents in b3,b4,b5, was to force b3 to be missing the numbers 456789, b4 to be missing 123789, and b5 to be missing 123456. The bit about sorting the contents in the columns, and disregarding the box boundary, was just to get a unique representation for the (complement of the) set of numbers that are missing in the individual columns -- which is the remaining thing that's relevant (given what is done with the boxes) for determining which pencil marks would go in the empty cells.
For the 7739 different counts that actually did arise (across 17294 classes), I don't know if it's just coincidence that some classes have the same counts (?), or if it could have been known in advance -- maybe I missed something.
holdout wrote:Your table must be correct, and I infer that there are 5238782208 different 4-row configurations.
I'm not quite sure what kind of description fits that number.
I think the correct number, comes from the run with no restrictions on r4c1 contents:
- Code: Select all
AAAAAAA BBBB A :: 4-row representative count
12096 72 B :: Number of configurations
12000 144
12096 216
60288 432
48384 648
192864 1296
24192 1944
24088 2592
1265984 3888
3362584 7776
=======
5014576
sum of products: 31491624960
It's around 6 times the other number, as might be expected -- but it isn't exact.
holdout wrote:You say you've "done things 4 different ways" and you cite 88052, 834148, 530528 and 5014576 different "inputs". Certainly, these "inputs" are not-mutually exclusive. If inputs which are essentially the same are discarded, how many remain? Might you get 790171 (= 'n') in each case? Is it from these representatives alone that you can derive the magic number?
For the numbers 834148 and 5014576, there are 790171 and 4051084 remaining after discarding equivalents. For the other two, I don't want to calculate what the actual number are, but the smaller one, 88052, is already less than 790171.
holdout wrote:In the two cases you cite which begin with "44 signatures", I do not really understand how you progess to 4-row representatives.
I think I can cover that, along with my "number of configurations" values, in one "story".
You probably know most of this, and I apologize for boring you with details, but ...
To calculate the number of ways to fill the entire grid, having 123456789 in the top row, one slow, but bulletproof way of doing it, is to loop over the 2612736 ways to fill the top band; for each one count the number of ways to complete the rest of the grid; and then total up the counts. To speed that up, we can group the band 1 fills into classes, where we know that the completion counts will be the same, and calculate the completion counts, just once for each class -- multiplying them by the class sizes, and totaling up the products. I should mention that this argument is virtually unchanged, even if the objective is to only count the number 4-row fills.
Then ... one way to form classes, is to note that two band 1 fills, can be transformed via "spin and relabeling", to the same canonical form ... one of 416 canonical forms ... must have the same number of completions. Using that idea, for each of the 416 forms, we would want a "multipler count" ("number of configurations"), that is the number of the 2612736 ways of filling band 1, that can be transformed into the given form.
Another way to form classes -- fewer of them, with larger multiplers -- is to say that two band 1 fills are in the same class, if they can be transformed in a way where they leave the same pencil marks in the cells in r4-r9. That's the origin of the 44 "band signature" classes. Each class has a multipler, and the multiplers still sum to 2612736, just as in the case with the 416 canonical bands.
Ahh crap -- I've got to run -- more later.
Maybe this will be enough ?
Regards,
Blue.