## Latin Squares and Latin Square puzzles

For fans of Killer Sudoku, Samurai Sudoku and other variants

### Re: Latin Squares and Latin Square puzzles

Hi, coloin!
coloin wrote:
serg wrote:Remark. The last LS in minlex form ("Solitary" Mathimagics' LS grid) has minimal Index value of 6. LS having minimal index of 7 (no two-row/two-column UA sets) are impossible.

hmmm
this is our grid with the 20C ... is it not index 7 ?
Code: Select all
`123456789234567891345678912456789123567891234678912345789123456891234567912345678`

Code: Select all
`123456789234567891     Index = 7`

Unfortunately, this "back circulant latin square" has Minimal Index = 4, because r1/r4 pair contains 3 two-row U6 (each UA got sparate color):

123456789
234567891
345678912
456789123
567891234
678912345
789123456
891234567
912345678

Serg
Serg
2018 Supporter

Posts: 723
Joined: 01 June 2010
Location: Russia

### Re: Latin Squares and Latin Square puzzles

Code: Select all
`123456789231564897     Index = 4`

I see .... I'm glad you [both] are right...
coloin

Posts: 1946
Joined: 05 May 2005

### Band Classification

While Serg is working on an implementation of his Canonical Form function, I have made some progress with "band classification" (distribution by "band" = rows 1-3) using the "Minlex Index" (MI) method.

I stopped the random grid caonicalisation tests after 2 million grids (mainly because getting CF grids is so expensive), and the final sample results were:

Code: Select all
`Index 0: R2 = 214365897, grids = 1667128  78.916%, bands >=  66Index 1: R2 = 214367895, grids =  437459  20.708%, bands >= 127Index 2: R2 = 214537896, grids =    7948   0.376%, bands >=  62Index 3: R2 = 214567893, grids =       8           bands >=   2                                 -------                      ed grids = 2112543`

The use of ">=" for the band counts is just a reminder that we can't be sure that more distinct band settings won't be found.

For MI = 4, 5, 6 the results are known and complete:
Code: Select all
`Index 4: R2 = 214365897, grids =    1649,          bands  =  37Index 5: R2 = 214367895, grids =      57,          bands  =  17Index 6: R2 = 214537896, grids =       1,          bands  =   1                                 -------                      ed grids =    1707`

I wrote a program to enumerate all grids with a given MI value, and tested it on MI = 4, 5 and 6. The results confirmed that these 1707 grids are the only ED grids with MI = 4, 5, 6.

The program starts with rows 1 and 2 fixed for the given MI, then adds all possible row 3 settings that maintain the target MI. This "AddRow" function is applied recursively, and when we get to row 9 we have a complete grid with the target MI. For each complete grid we check that the transpose and conjugates don't have a smaller MI, then write those grids that pass this check to the results file.

The resulting grid set should contain only grids with the target MI - these are then converted to CF (the current choke point!) and duplicates removed, producing a final set of ED grids. The # of different R3 settings, the number of R9 grid completions, and the number of different ED grids found are shown below:

Code: Select all
`MI   R3 values    R9 completions     ED --------------------------------------- 6     167,825          400           1 5   1,163,148      357,324          57 4   1,445,805    3,514,743        1649 3   3,699,570   23,279,340          ??  (tba)`

As you can see, I have generated the full grid set for MI = 3 (my 2 million sample contained only 8 cases). CF reduction is now grinding away at this set, and this will take days, perhaps weeks, to complete, but it looks like the final ED count will be ~60K

MI = 2 enumeration seems feasible, but only for the grid generation stage - the CF reduction using my current function would take far too long.

I am thinking about a randomised version of this process that could perhaps help identify new band settings for MI = 0, 1, 2.

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Re: Latin Squares and Latin Square puzzles

Hi, people!
I've just finished debugging of LS grids Minlex canonicalisation program with my "UA sets accounting" algorithm.
blue wrote:The worst case is 2 rows with 3 UA6's -- 6*3*3*3 = 162 column permutations, to check.
The factor of 6, comes from 6 ways to order the UA6's.
...
The worst case happens (many times) in the MC grid.
Each pair of rows has 3 UA6's, and as you say, the rows can be taken in either order.
The total number of checks is 6*72*162 = 69984, where the factor of 6 comes from "6 conjugates".
[ It turns out that the MC grid has 69984/3 = 23328 automorphisms. ]

Yes, MC grid is the worst case for my algorithm. It is canonocalized with speed of 88 LS/sec on my notebook (Intel Core Duo CPU).
This is MC grid Minlex form in case we are treating it as LS.
Code: Select all
`123456789231564897312645978456789123564897231645978312789123456897231564978312645`

MC grid really has a property that any pair of its rows has three U6. This is true too for all MC grid's conjugates. It turns out that it's the only ED LS having this property. So, "the worst case" LS is the only LS among 19 billions of ED LS.

But other LS are processed faster. 1707 LS, not containing U4, are processed at the speed of 2710 LS/sec. If Mathimagics could send me a bunch of random LS, I would measure canonicalisation speed for those random Latin Squares.

Serg
Serg
2018 Supporter

Posts: 723
Joined: 01 June 2010
Location: Russia

### Re: Latin Squares and Latin Square puzzles

Serg wrote:But other LS are processed faster. 1707 LS, not containing U4, are processed at the speed of 2710 LS/sec. If Mathimagics could send me a bunch of random LS, I would measure canonicalisation speed for those random Latin Squares.

Excellent work, Serg!

I will send you a big set of random LS ...

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Re: Latin Squares and Latin Square puzzles

I have completed the ED reduction work for MI = 3 (Minlex index 3).

The table of known/complete results is now:

Code: Select all
`Index 3: R2 = 214567893, grids =   44638,          bands  = 138Index 4: R2 = 214365897, grids =    1649,          bands  =  37Index 5: R2 = 214367895, grids =      57,          bands  =  17Index 6: R2 = 214537896, grids =       1,          bands  =   1`

I was able to complete the ED reduction of the 23,279,340 candidate grids in relatively quick time by using an alternative CF function. This function (GCF = Graph Canonical Form) uses the "canonical graph" method. We convert a grid to a graph, and use nauty/traces software to obtain a canonical graph, then retrieve the canonical LS from that.

While not as fast as Serg's reported results, it achieved a rate of ~330 LS/sec, which was about 120 times faster than my original MCF (Minlex Canonical Form) function. I used MCF on the reduced grid set to produce the minlex ED grid set, since the GCF operation does not produce a minlex grid.

For obvious reasons, when running this generation + reduction job for the case MI = 2, I would prefer to use Serg's MCF function, when it becomes available.

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Re: Latin Squares and Latin Square puzzles

Hi, all!
Mathimagics sent me a bunch of 1,115,151 randomly generated Latin Squares, and I processed all those LS by my canonicalisation program. Canonicalisation was done at the speed of about 7000 LS/sec. Cross-checks of my results and Mathimagics's results were done - all canonic forms coincided.
I checked statistics during this bunch of LS processing - how many pairs of rows must be tested when we are searching for LS Minlex Form using "UA sets accounting" method.
Code: Select all
`Minimal MI histogram 0   880162   78.9 % 1   230801   20.7 % 2     4184    0.4 % 3        4    0.0 %Minimal MI multiplicity histogram  1   301465  2   255669  3   185861  4   126836  5    84262  6    55785  7    37364  8    24887  9    16273 10    10477 11     6437 12     4138 13     2321 14     1485 15      860 16      469 17      270 18      145 19       65 20       34 21       23 22       13 23        5 24        5 25        1 26        1Processed 1115151 Latin Squares`

Minimal MI multiplicity says us - how many pairs of rows must be checked during canonicalisation. For example, 301465 LS have the only row pair providing minimal MI for those LS. You can see from the histogram that 50% of LS grids have 1 or 2 row pairs only giving minimal MI for those LS grids. This observation ensure rather high performance of considered canonicalisation algorithm.

Serg
Serg
2018 Supporter

Posts: 723
Joined: 01 June 2010
Location: Russia

### Band Distribution

Thanks to Serg's excellent work I have been able to use random LS grid generation in combination with his fast Minlex function to identify more band settings. After 7 billion grid generations, the band distribution looks like this:

Code: Select all
`Index 0: R2 = 214365897,                           bands  =  67Index 1: R2 = 214367895,                           bands  = 129Index 2: R2 = 214537896,                           bands  = 201Index 3: R2 = 214567893, grids =   44638,          bands  = 138Index 4: R2 = 214365897, grids =    1649,          bands  =  37Index 5: R2 = 214367895, grids =      57,          bands  =  17Index 6: R2 = 214537896, grids =       1,          bands  =   1                                                   ------------                                                   bands  = 590`

The random grid process has found mainly new band (R1-R3) settings for index 2, which we can now see has the most bands. Grid counts are shown for index values 3 - 6, as before, because we know all the ED grids for those bands. Index 2 shows up in the random grids 0.39% of the time, so the grid count for index 2 is probably close to 77 million.

New band finds are now very rare, so the list is very close to being complete, but I will keep the process running for now.

[EDIT #1] 08 Jan 2020: found 4 new index 2 bands, table updated

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Band Distribution (final)

Band analysis has been completed, with the final steps being:

• explicit enumeration of all minlex grids for Index 2 (this found 5 additional bands)
• for Index values 0 and 1, testing all candidate bands that were not already known to have grids (from previous random grid generation). This found that there were no additional bands.

The upshot is that we now know that there are exactly 595 different minlex bands:

Code: Select all
`Index 0: R2 = 214365897,                       bands  =  67Index 1: R2 = 214367895,                       bands  = 129Index 2: R2 = 214537896, grids = 66,769,988,   bands  = 206Index 3: R2 = 214567893, grids =     44,638,   bands  = 138Index 4: R2 = 214365897, grids =      1,649,   bands  =  37Index 5: R2 = 214367895, grids =         57,   bands  =  17Index 6: R2 = 214537896, grids =          1,   bands  =   1                                               ------------                                               bands  = 595`

Using an 80/20 split ratio, the grid count for index 0 is estimated at 15.36 billion, and for index 1 the estimate is 3.84 billion.

Thanks once again to Serg for his excellent coding of the LSMIN and LRMIN functions!

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Re: Band Distribution (final)

Hi, Mathimagics!
Mathimagics wrote:Band analysis has been completed, with the final steps being:

• explicit enumeration of all minlex grids for Index 2 (this found 5 additional bands)
• for Index values 0 and 1, testing all candidate bands that were not already known to have grids (from previous random grid generation). This found that there were no additional bands.

The upshot is that we now know that there are exactly 595 different minlex bands:

Code: Select all
`Index 0: R2 = 214365897,                       bands  =  67Index 1: R2 = 214367895,                       bands  = 129Index 2: R2 = 214537896, grids = 66,769,988,   bands  = 206Index 3: R2 = 214567893, grids =     44,638,   bands  = 138Index 4: R2 = 214365897, grids =      1,649,   bands  =  37Index 5: R2 = 214367895, grids =         57,   bands  =  17Index 6: R2 = 214537896, grids =          1,   bands  =   1                                               ------------                                               bands  = 595`

Using an 80/20 split ratio, the grid count for index 0 is estimated at 15.36 billion, and for index 1 the estimate is 3.84 billion.

Surprising results! Well done!
Mathimagics wrote:for Index values 0 and 1, testing all candidate bands that were not already known to have grids (from previous random grid generation). This found that there were no additional bands.

I don't understand in what way you got exact number of bands from random search.

I am posting just another surprising data (to confuse everyone completely).
I calculated probabilities of all 8 MI (0,1,...,7) distribution for random pair of rows. Both rows in pair are random, but compatible to form r1/r2 rows of 9 x 9 LS. To generate such random pair of rows I used your file with 1,115,151 randomly generated 9 x 9 LS. I used r1 and r2 pair of rows only from each LS. Result is rather surprising.
Code: Select all
`Minimality Index for r1/r2 rows pair histogramMI   r1/r2 pairs 0    20,553 1    75,201 2   126,447 3   216,206 4    19,098 5   168,702 6   151,584 7   337,360Totally 1,115,151 random pairs`

So, you can see that rows pair with MI = 7 (no two-row UA) is the most probable event!
Why do we observe LS with minimal MI = 0 with probability about 80%?
The probability of having MI = 0 is 1.84 % only. But we are checking 108 pairs of rows (36 pairs x 3 paratopic transformations). Probability of MI > 0 is 98.16 % for one pair of rows. But probability of (rare) event, that all 108 pairs have MI > 0 is 13.5 %. So, probability, that at least one pair from 108 pairs has MI = 0 is about 86.5 %. These calculations are not exact, they are based on Monte-Carlo generation data.

Serg

[Edited. I corrected a typo.]
Serg
2018 Supporter

Posts: 723
Joined: 01 June 2010
Location: Russia

### Re: Latin Squares and Latin Square puzzles

Hi Serg,

How interesting is that MI distribution!

I thought that one way to cross-check your results would be to simply check all 9-permutations. With R1 = 123456789, and R2 = each permutation that doesn't have a fixed point (Cn = n), calculate the MI value. There are 133,496 possible R2 settings, and the distribution by MI (%) looks very much like yours:

Code: Select all
`MI = 0:   2520     1.89%MI = 1:   9072     6.80%MI = 2:  15120    11.33%MI = 3:  25920    19.42%MI = 4:   2240     1.68%MI = 5:  20160    15.10%MI = 6:  18144    13.59%MI = 7:  40320    30.20%        ------                        133496`

I will give details of my assertion (that the band counts for MI = 0, 1 are known and complete) in a separate post.

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Re: Latin Squares and Latin Square puzzles

Serg wrote:I don't understand in what way you got exact number of bands from random search.

A quick review of the methodology:

• LSMIN function returns minlex canonical form for a complete grid.
• LRMIN function returns "relative minlex form" for a partial (Latin Rectangle) grid, given the first NR rows. Paratopic transformations are not considered.

For each MI value (0 to 6), we can use LRMIN to generate a list of "candidate bands". We fix R1 and R2 for the MI index value, and for each possible R3 setting we add the resulting band B = R1+R2+R3 to the list of candidates if, and only if, LRMIN(B) = B.

We can eliminate additional candidates by checking X = (R, D, C) conjugate of B. If LRMIN(X) < B we can eliminate B.

The table below shows the number of candidate bands for each MI.

Code: Select all
`     Candidate   Known   MI    Bands     Bands       ED grids ------------------------------------  0      104       67  1      281      129  2      364      206      66,769,988  3      438      138          44,638  4       54       37           1,649  5       99       17              57  6       58        1               1`

The known bands and ED grid counts for MI = 2 to 7 were obtained from the ENUMBAND program. The program starts with a given band (rows 1-3), then adds all possible row 4 settings that maintain the relationship LRMIN(X) = X, where X = first 4 rows.

This "AddRow" process is applied recursively, and when we get to row 9 we have a complete grid G, and if LSMIN(G) = G we add this grid to the results list.

For MI = 0, and MI = 1, the "known bands" are those obtained from previous random grid sampling. For each candidate band that is not in the known band list, we used ENUMBAND (modified to return only the first grid found). No grids were found for any of these candidates.

Mathimagics
2017 Supporter

Posts: 1602
Joined: 27 May 2015
Location: Canberra

### Re: Band Distribution (final)

Serg wrote:I am posting just another surprising data (to confuse everyone completely).

This result is the least confusing.

Serg wrote:So, you can see that rows pair with MI = 7 (no two-row UA) is the most probable event!

Is this phenomenon similar to this observation for sudoku grids in Minlex Form: Min and Max Lists / Chaining?
Michael A. Deverin wrote:In just over 70% of the cases, the lowest ranking pair was of rank = 4. This is especially fortuitous, since there is only one way to convert the row pair to minlex form.
dobrichev
2016 Supporter

Posts: 1791
Joined: 24 May 2010

### Re: Band Distribution (final)

dobrichev wrote:
Serg wrote:I am posting just another surprising data (to confuse everyone completely).

This result is the least confusing.

I am totally confused.
I was sure that my idea of LS canonicalisation speeding up is original, but it turns out, it is just another application of Michael Deverin's (aka holdout) ideas. I cannot say even that I was not aware of his ideas, because I participated the thread you cited. I forgot Michael's ideas completely! At that time (in 2011) I treated his findnings as curious observations, but didn't understand their practical sense.
The good news is that I understand now some basical principles of Michael Deverin's Sudoku canonicalisation algorithm.
dobrichev wrote:
Serg wrote:So, you can see that rows pair with MI = 7 (no two-row UA) is the most probable event!

Is this phenomenon similar to this observation for sudoku grids in Minlex Form: Min and Max Lists / Chaining?

I need some time to understand it.

Serg
Serg
2018 Supporter

Posts: 723
Joined: 01 June 2010
Location: Russia

### Re: Latin Squares and Latin Square puzzles

Hi, Mathimagics!
Mathimagics wrote:There are 133,496 possible R2 settings, and the distribution by MI (%) looks very much like yours:

Code: Select all
`MI = 0:   2520     1.89%MI = 1:   9072     6.80%MI = 2:  15120    11.33%MI = 3:  25920    19.42%MI = 4:   2240     1.68%MI = 5:  20160    15.10%MI = 6:  18144    13.59%MI = 7:  40320    30.20%        ------                        133496`

Good point! Your calculation is exact and is very similar to my Monte-Carlo simulation.

Serg
Serg
2018 Supporter

Posts: 723
Joined: 01 June 2010
Location: Russia

PreviousNext