Canonical Form

Everything about Sudoku that doesn't fit in one of the other sections

Postby RW » Thu Jan 25, 2007 3:21 pm

JPF wrote:To be noted P(r7c1=3)~60%

That's obvious also without generating 1M random grids. Has to be a 3 in column 1, with r1234c1 occupied, there's 5 cells to choose from. 2 in box 4, which place 3 in r5c1, and 3 in box 7, which place the 3 in r7c1.

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Postby ravel » Thu Jan 25, 2007 3:47 pm

JPF wrote:To be noted P(r7c1=3)~60%

Papy will like to hear that:) (but i fear he will not manage to canonicalize it without solving)

With a 3 in r7c1 Ocean's present for RW comes down from 24 to 4 points.

But bad luck with dml 1/07: r7c1=5
ravel
 
Posts: 998
Joined: 21 February 2006

Postby coloin » Thu Jan 25, 2007 8:18 pm

So the 3 at r7c1 - we can pick this as the best "back door" clue with this normalization ?

JPF wrote:1 or 2 were missing in r2c4,r2c8,r3c7,r3c8.
If r2c4 is not a 7 then it is always a 1.
That means the 2 is always in box2r3
That means the 2 is always in box3r2

Which leaves why 1 is never in r2c8 ?
If r2c4 is a 7 then the 123 are in b3r2
so if this occurs
Code: Select all
+---+---+---+
|123|456|789|
|456|789|21.|
|...|...|...|
+---+---+---+
|2..|...|...|
+---+---+---+
|...|...|...|
+---+---+---+

c7and c8 will be swapped.
But that doesnt explain why you would ever get a 1 in r2c9.


Anyhow, Interesting stats.

Our canonicalized grids which have 17s are different from random grids

The 6 at r2c3 occurred 20074 times out of 36637 = 54.8 %

removing duplicate grids this is unchanged

17304 out of 32247 = 53.9 % have a 6 at r2c3
14943 out of 32247 = 46.1 % have a 7 at r2c3

This is at odds considerably with random grids
Code: Select all
MBs         JPFs
6-24.00     6-28.9
7-76.00     7-71.1


Unless this is a quirk, this means a repeating minirow in at least one of the 6 bands in a grid is twice as common in grids with 17s, which possibly makes sense.

Presumably if one of the bands is a repeating minirow band then the min. lex. normalization puts this band first [?no]

Previous work done here seemed to indicate that the distribution of the gangsters was similar . I dont know why we accepted this...the SF grid has 2 out of the 6 bands with repeating minirows !
C
coloin
 
Posts: 2365
Joined: 05 May 2005
Location: Devon

Postby JPF » Fri Jan 26, 2007 3:14 pm

Here are some stats on the first band of the canonical* grids.
*smallest row order lexicographic value

There are based on a sample of 10^6 random grids.

We have seen before that there are 3 main possibilities :
Code: Select all
| 123 | 456 | 789 |
| 456 | 789 |(123)|
|(789)|(123)|(456)|


| 123 | 456 | 789 |
| 457 | 189 |(236)|
|(689)|(237)|(145)|


| 123 | 456 | 789 |
| 457 | 289 |(613)|
|(689)|(137)|(245)|
(abc) means a permutation of abc

If all these permutations were possible the number of bands would be = 3x6^4=3888.

Which is not the case...
Any idea on the exact number ?

Filtering the 1M random grids, I get only 351 different first bands.
Here is the most popular :
123 456 789 456 789 123 789 132 564
Code: Select all
| 123 | 456 | 789 |
| 456 | 789 | 123 |
| 789 | 132 | 564 |


Here are the 20 first with their percentage of occurrence** :
Code: Select all
        First Band                  P(%)                                                           
                                                                                                   
  123456789456789123789132564       2.6                                                           
  123456789456789132789132564       2.4                                                           
  123456789456789132789213645       2.3                                                           
  123456789456789132789231546       2.3                                                           
  123456789456789123789132546       1.4                                                           
  123456789456789123798132564       1.3                                                           
  123456789457189236689273415       1.3                                                           
  123456789456789123798132546       1.2                                                           
  123456789457189236689237415       1.2                                                           
  123456789456789123798213564       1.2                                                           
  123456789456789123798231564       1.2                                                           
  123456789456789123798231645       1.2                                                           
  123456789456789132789132546       1.2                                                           
  123456789457189236689237451       1.2                                                           
  123456789456789132789123546       1.2                                                           
  123456789457189236689237154       1.2                                                           
  123456789456789132789213456       1.2                                                           
  123456789457189236689273145       1.2                                                           
  123456789457189236689237541       1.2                                                           
  123456789456789132789213654       1.1                                                           


Here are the "rare" first bands with only 1 out of 10^6 occurrence** :
(27 first digits of these grids)
Code: Select all
123456789457189623986327451-235794168698512347714638592362875914579241836841963275
123456789457189632869372145-235768914718924356946513827384695271591237468672841593
123456789457189632896327451-231564978745918263968273514374691825589732146612845397
123456789457189632896372145-284635971571928463639714528345261897768593214912847356
123456789457189632896372451-281563974574928316639741528348617295762895143915234867
123456789457189632968327145-241678953536294817789531426394712568672845391815963274
123456789457289163698137425-234691857586374291719528346365812974871945632942763518
123456789457289163698137524-261593847534718692879642315385921476712364958946875231
123456789457289163698317524-236875491741963852985142376374628915512794638869531247
123456789457289613698317245-235174968761893524849625137372941856514768392986532471
123456789457289613896137254-289643571534871926671592438362915847745368192918724365
** in this sample
JPF
Last edited by JPF on Sat Jan 27, 2007 8:42 am, edited 1 time in total.
JPF
2017 Supporter
 
Posts: 6123
Joined: 06 December 2005
Location: Paris, France

Postby daj95376 » Fri Jan 26, 2007 4:16 pm

JPF wrote:We have seen before that there are 3 main possibilities :
Code: Select all
| 123 | 456 | 789 |
| 456 | 789 |(123)|
|(789)|(123)|(456)|


| 123 | 456 | 789 |
| 457 | 189 |(623)|
|(689)|(237)|(145)|


| 123 | 456 | 789 |
| 457 | 289 |(613)|
|(689)|(137)|(245)|
(abc) means a permutation of abc

Your third possibility does not seem to match the results in your previous table.
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby JPF » Fri Jan 26, 2007 5:23 pm

daj95376 wrote:Your third possibility does not seem to match the results in your previous table.

Could you elaborate ?
Thanks

JPF
JPF
2017 Supporter
 
Posts: 6123
Joined: 06 December 2005
Location: Paris, France

Postby daj95376 » Fri Jan 26, 2007 5:31 pm

Code: Select all
+-----------------------------+-----------------------------+-----------------------------+
| 1-100.0 | 2-100.0 | 3-100.0 | 4-100.0 | 5-100.0 | 6-100.0 | 7-100.0 | 8-100.0 | 9-100.0 |
+-----------------------------+-----------------------------+-----------------------------+
| 4-100.0 | 5-100.0 | 6-28.9  | 1-71.1  | 8-100.0 | 9-100.0 | 1-27.0  | 1-      | 1- 1.9  |
|         |         | 7-71.1  | 2-      |         |         | 2-72.1  | 2-14.0  | 2-13.9  |
|         |         |         | 7-28.9  |         |         | 3- 0.5  | 3-77.5  | 3-22.0  |
|         |         |         |         |         |         | 6- 0.4  | 6- 8.5  | 6-62.2  |
+-----------------------------+-----------------------------+-----------------------------+
| 6-55.0  | 6- 9.9  | 6- 6.3  | 1-14.1  | 1- 6.8  | 1- 8.0  | 1-31.2  | 1-26.3  | 1-13.7  |
|         |         |         |         |         |         | 2-      | 2-      |         |
| 7-27.3  | 7- 1.1  | 7- 0.5  | 2-42.1  | 2-25.1  | 2-32.7  | 4-20.0  | 4-36.3  | 4-43.9  |
| 8-14.5  | 8-57.7  | 8-27.8  | 3-25.1  | 3-43.2  | 3-31.7  | 5-42.6  | 5-24.7  | 5-32.7  |
| 9- 3.2  | 9-31.3  | 9-65.5  | 7-18.7  | 7-24.9  | 7-27.5  | 6- 6.2  | 6-13.0  | 6- 9.7  |
+-----------------------------+-----------------------------+-----------------------------+

Shows <2> as not occuring in [r2c4], yet your third possibility

Code: Select all
| 123 | 456 | 789 |
| 457 | 289 |(613)|
|(689)|(137)|(245)|

shows it there. Maybe someone will generate 10^6 puzzles with this pattern and see if the canonical form ever leaves the <2> in [r2c4].
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby JPF » Fri Jan 26, 2007 5:49 pm

daj95376 wrote:Shows <2> as not being a possible candidate in [r2c4], yet your third possibility
No, <2> is a possible candidate, but with a very low probability.
See my comments to Mike Barber above.

I have edited the table for clarification.

JPF
JPF
2017 Supporter
 
Posts: 6123
Joined: 06 December 2005
Location: Paris, France

Postby coloin » Fri Jan 26, 2007 7:48 pm

Ah I understand now [I think]

What we have is a sort of pecking order for the 6 bands -

Grids which are rare will be grids which have all 6 bands down the list.

Here is the output by the band analyer by dukuso index416 of your rarest grids
The first number is the first horizontal band:
Code: Select all
C:\suxx>index416 test4.txt
202 396 398  , 221 312 215
227 354 314  , 319 330 373
223 391 396  , 394 407 412
345 356 335  , 375 318 364
226 318 403  , 330 351 372
235 326 364  , 312 312 344
332 361 315  , 373 376 384
312 415 375  , 359 317 353
321 330 403  , 385 321 317
365 383 411  , 335 387 369
384 359 350  , 335 387 354


Im not sure what the pecking order is within the 416 bands - but it looks to roughly follow the allocated numbers - or whether there is a difference between bands with the same band number.

Yes the "pecking " order of the canonicolized grids with 17-puzzles
starts with bands
Code: Select all
  1
  2
  3
  4
  6
  5
  7
  8
 15
  9
 10
 11
 12
 13
 14
 16
 17
 18
 19
 24
..........
..........
217
227
220
231
279
370
345
235
319
333    there were 329 out of the 415 represented.


But it may not be the complete picture, having said that the rarest has to be the one at the bottom ?no

Running on your 10^6 canonicolized random grids will give the full pecking order

C
coloin
 
Posts: 2365
Joined: 05 May 2005
Location: Devon

Postby daj95376 » Fri Jan 26, 2007 8:26 pm

Okay JPF, out of 10^6 random filled grids generated from:

Code: Select all
*-----------------------*
| 1 2 3 | 4 5 6 | 7 8 9 |
| 4 5 7 | 2 8 9 | . . . |
| . . . | . . . | . . . |
|-------+-------+-------|
| 2 . . | . . . | . . . |
| . . . | . . . | . . . |
| . . . | . . . | . . . |
|-------+-------+-------|
| . . . | . . . | . . . |
| . . . | . . . | . . . |
| . . . | . . . | . . . |
*-----------------------*

There are 15 with a canonical pattern starting with 123456789_457289:

Code: Select all
123456789_457289163_689173452_245697318761328594938514627372861945596742831814935276
123456789_457289163_689713254_231564897578921346964378512345197628716842935892635471
123456789_457289163_698137524_236795418845321697971864352369572841584913276712648935
123456789_457289163_698137524_265794831379861452814523697531642978782915346946378215
123456789_457289163_698137524_284593671765841932931672458376918245512364897849725316
123456789_457289163_698317254_241538976735692418986741532362975841519864327874123695
123456789_457289163_698713254_284135697365897421719642835576321948831964572942578316
123456789_457289163_869713245_231874956746395812985162374398641527514927638672538491
123456789_457289163_869713245_238645971691837524745192638382961457516374892974528316
123456789_457289163_869731245_231564897745928316986173452374895621512647938698312574
123456789_457289163_896317245_231894657745621938968735412314572896579168324682943571
123456789_457289163_896317245_234968517579143826681572394368724951745891632912635478
123456789_457289163_896317245_239578416685124397741963852312645978574892631968731524
123456789_457289163_896317245_271693458389145672645872931564928317712534896938761524
123456789_457289613_698317245_236894157845761392971532864384625971512978436769143528

There is one example of [r2c8]=1, eleven examples of [r3c7]=2, three examples of [r3c8]=2, and one example of [r3c9]=2 -- which isn't indicated in your table.

(Mucho Thanks to gsf for posting a new solver that generates row canonical output!)
Last edited by daj95376 on Fri Jan 26, 2007 8:22 pm, edited 2 times in total.
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby JPF » Fri Jan 26, 2007 9:06 pm

coloin wrote:Running on your 10^6 canonicolized random grids will give the full pecking order
I will do it as soon I understand what index416 does.

daj95376 wrote:Okay JPF, out of 10^6 random filled grids generated from...
There were 16 with a canonical pattern starting with 123456789_457289:
There was even one example of <1> in row [r2c8].

Glad you finally understood the table related to the first band:)
Your run confirm what I previously mentionned : the possible digits [2] in r2c4,r3c7,r3c8 and [1] in r2c8, but wit low probability.

JPF
JPF
2017 Supporter
 
Posts: 6123
Joined: 06 December 2005
Location: Paris, France

Postby coloin » Fri Jan 26, 2007 10:31 pm

JPF wrote:I will do it as soon I understand what index416 does

Well the bands were normalized into 44 different gangsters by Bertram and also Red Ed. There are 6 gangsters [3 horizontal and 3 vertical 27 clue bands]in a full grid.

The 416 classification is one step up as I understand it.
44^6 = 7256313856
416^6 = 5182746699759616
We have more than enough combinations for all essentially different grids.
However its 100% accurate for excluding isomorphic grids and it describes the grid in terms of the bands. If the index416 coding is different it is a different grid.

Its use here is to tell us which is the first band which the grid has been normalized [min lex] to.

Some quotes from the past.....

dukuso wrote:well, you can test whether 2 grids are equivalent using "Nauty".
Make a graph with vertices 81cells+9rows+9columns+9blocks+9symbols
etc. assumedly nobody except Gordon understands...
I usually just only test the gangster-class of the 6 bands as invariants.

Moschopulus wrote:Since 416^6=5182746699759616, which is greater than the number of grids up to equivalence, it is possible that each grid has its own 416-gangster 6-tuple. Do you know if this is true? Are there non-equivalent grids with the same 6-tuple?

dukuso wrote:I didn't even check the 416-class, only the 44-class.
7524 from 7611 were different.
Only 80-90% of 3-tuples are possible, I can't say about 6-tuples.
Non-equivalent grids with same (416-)6-tuple ? I haven't thought
about it. Maybe someone can find an example. They must be rare.


So some of the 44^6 were the same
What of the 416^6 ?

Ah... Ive done it ..... 3 pairs of different grids out of 32247 had the same index416.
Code: Select all
 2  18   3  ,  24  24 259   - 123456789456789123789123465231594876594867312867231954315648297648972531972315648
 2  18   3  ,  24  24 259   - 123456789456789123789123465231867954594231876867594312315972648648315297972648531
 3  18  18  ,  24  24  71   - 123456789456789123789123564234567918567891432891234657345672891672918345918345276
 3  18  18  ,  24  24  71   - 123456789456789123789123564234891657567234918891567432345672891672918345918345276
 21 275 279  ,  25 390 331  - 123456789456789132789213456234895617597641328861327594348162975615978243972534861
 21 275 279  ,  25 390 331  - 123456789456789132789213456237568914591324867864971325375192648642837591918645273


So not 100% for confirming uniqueness.
I think there is a bug which fails to differentiate those repeating minirows.....almost certainly fixable !

C
coloin
 
Posts: 2365
Joined: 05 May 2005
Location: Devon

Postby JPF » Sat Jan 27, 2007 12:14 pm

daj, I read your updated previous message

daj95376 wrote:out of 10^6 random filled grids... There is one example of [r2c8]=1, eleven examples of [r3c7]=2, three examples of [r3c8]=2, and one example of [r3c9]=2 -- which isn't indicated in your table.

Code: Select all
123456789_457289163_689173452_245697318761328594938514627372861945596742831814935276

Right.
Congratulations for this [r3c9]=2, not easy to find.
I will edit my previous posts accordingly.
Thanks.

[Edit] here's an other specimen :
Code: Select all
123456789_457289163_689173452_231648975765931248894725631346892517512367894978514326

JPF
Last edited by JPF on Sat Jan 27, 2007 3:13 pm, edited 1 time in total.
JPF
2017 Supporter
 
Posts: 6123
Joined: 06 December 2005
Location: Paris, France

Postby coloin » Sat Jan 27, 2007 2:27 pm

If there can be a 2 at r2c4....why cant there be a 3 at r2c4 ?

What is the canonical representative at the bottom of the classification ?

I will pm kjellfp [enumerates the classes of the six 3*9-chutes in a sudokugrid: index416 (courtesy of Kjell Fredrik Pettersen)] to get his comments on the apparent minor discrepancy.

As it is from Kjellfp the speed is impressive...1 million grids in approx 5 seconds.

C
coloin
 
Posts: 2365
Joined: 05 May 2005
Location: Devon

Postby gsf » Sat Jan 27, 2007 6:58 pm

coloin wrote:I will pm kjellfp [enumerates the classes of the six 3*9-chutes in a sudokugrid: index416 (courtesy of Kjell Fredrik Pettersen)] to get his comments on the apparent minor discrepancy.

As it is from Kjellfp the speed is impressive...1 million grids in approx 5 seconds.

could you point me to a url or post or pm the list of 416?
thanks
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

PreviousNext

Return to General