The New Sudoku Players' Forum

by **Serg** » Mon Dec 16, 2019 9:22 pm

Hi, coloin!

coloin wrote:
serg wrote:Remark. The last LS in minlex form ("Solitary" Mathimagics' LS grid) has minimal Index value of 6. LS having minimal index of 7 (no two-row/two-column UA sets) are impossible.

hmmm
this is our grid with the 20C ... is it not index 7 ?
Code: Select all
123456789 234567891 345678912 456789123 567891234 678912345 789123456 891234567 912345678

Code: Select all
123456789 234567891 Index = 7

Unfortunately, this "back circulant latin square" has Minimal Index = 4, because r1/r4 pair contains 3 two-row U6 (each UA got sparate color):

123456789
234567891
345678912
456789123
567891234
678912345
789123456
891234567
912345678

Serg

by **coloin** » Mon Dec 16, 2019 9:57 pm

Code: Select all: 123456789 231564897 Index = 4

I see .... I'm glad you [both] are right...

by **Mathimagics** » Sun Dec 22, 2019 8:33 am

While Serg is working on an implementation of his Canonical Form function, I have made some progress with "band classification" (distribution by "band" = rows 1-3) using the "Minlex Index" (MI) method.

I stopped the random grid caonicalisation tests after 2 million grids (mainly because getting CF grids is so expensive), and the final sample results were:

Code: Select all: Index 0: R2 = 214365897, grids = 1667128 78.916%, bands >= 66 Index 1: R2 = 214367895, grids = 437459 20.708%, bands >= 127 Index 2: R2 = 214537896, grids = 7948 0.376%, bands >= 62 Index 3: R2 = 214567893, grids = 8 bands >= 2 ------- ed grids = 2112543

The use of ">=" for the band counts is just a reminder that we can't be sure that more distinct band settings won't be found.

For MI = 4, 5, 6 the results are known and complete:

Code: Select all: Index 4: R2 = 214365897, grids = 1649, bands = 37 Index 5: R2 = 214367895, grids = 57, bands = 17 Index 6: R2 = 214537896, grids = 1, bands = 1 ------- ed grids = 1707

I wrote a program to enumerate all grids with a given MI value, and tested it on MI = 4, 5 and 6. The results confirmed that these 1707 grids are the only ED grids with MI = 4, 5, 6.

The program starts with rows 1 and 2 fixed for the given MI, then adds all possible row 3 settings that maintain the target MI. This "AddRow" function is applied recursively, and when we get to row 9 we have a complete grid with the target MI. For each complete grid we check that the transpose and conjugates don't have a smaller MI, then write those grids that pass this check to the results file.

The resulting grid set should contain only grids with the target MI - these are then converted to CF (the current choke point!) and duplicates removed, producing a final set of ED grids. The # of different R3 settings, the number of R9 grid completions, and the number of different ED grids found are shown below:

Code: Select all: MI R3 values R9 completions ED --------------------------------------- 6 167,825 400 1 5 1,163,148 357,324 57 4 1,445,805 3,514,743 1649 3 3,699,570 23,279,340 ?? (tba)

As you can see, I have generated the full grid set for MI = 3 (my 2 million sample contained only 8 cases). CF reduction is now grinding away at this set, and this will take days, perhaps weeks, to complete, but it looks like the final ED count will be ~60K

MI = 2 enumeration seems feasible, but only for the grid generation stage - the CF reduction using my current function would take far too long.

I am thinking about a randomised version of this process that could perhaps help identify new band settings for MI = 0, 1, 2.

by **Serg** » Sat Dec 28, 2019 12:05 am

Hi, people!
I've just finished debugging of LS grids Minlex canonicalisation program with my "UA sets accounting" algorithm.
As blue pointed above in this thread (saying about my algorithm):

blue wrote:The worst case is 2 rows with 3 UA6's -- 6*3*3*3 = 162 column permutations, to check.
The factor of 6, comes from 6 ways to order the UA6's.
...
The worst case happens (many times) in the MC grid.
Each pair of rows has 3 UA6's, and as you say, the rows can be taken in either order.
The total number of checks is 6*72*162 = 69984, where the factor of 6 comes from "6 conjugates".
[ It turns out that the MC grid has 69984/3 = 23328 automorphisms. ]

Yes, MC grid is the worst case for my algorithm. It is canonocalized with speed of 88 LS/sec on my notebook (Intel Core Duo CPU).
This is MC grid Minlex form in case we are treating it as LS.

Code: Select all: 123456789 231564897 312645978 456789123 564897231 645978312 789123456 897231564 978312645

MC grid really has a property that any pair of its rows has three U6. This is true too for all MC grid's conjugates. It turns out that it's the only ED LS having this property. So, "the worst case" LS is the only LS among 19 billions of ED LS.

But other LS are processed faster. 1707 LS, not containing U4, are processed at the speed of 2710 LS/sec. If Mathimagics could send me a bunch of random LS, I would measure canonicalisation speed for those random Latin Squares.

Serg

by **Mathimagics** » Sat Dec 28, 2019 5:32 am

Serg wrote:But other LS are processed faster. 1707 LS, not containing U4, are processed at the speed of 2710 LS/sec. If Mathimagics could send me a bunch of random LS, I would measure canonicalisation speed for those random Latin Squares.

Excellent work, Serg! 8-)

I will send you a big set of random LS ...

by **Mathimagics** » Sat Dec 28, 2019 8:37 am

I have completed the ED reduction work for MI = 3 (Minlex index 3).

The table of known/complete results is now:

Code: Select all: Index 3: R2 = 214567893, grids = 44638, bands = 138 Index 4: R2 = 214365897, grids = 1649, bands = 37 Index 5: R2 = 214367895, grids = 57, bands = 17 Index 6: R2 = 214537896, grids = 1, bands = 1

I was able to complete the ED reduction of the 23,279,340 candidate grids in relatively quick time by using an alternative CF function. This function (GCF = Graph Canonical Form) uses the "canonical graph" method. We convert a grid to a graph, and use nauty/traces software to obtain a canonical graph, then retrieve the canonical LS from that.

While not as fast as Serg's reported results, it achieved a rate of ~330 LS/sec, which was about 120 times faster than my original MCF (Minlex Canonical Form) function. I used MCF on the reduced grid set to produce the minlex ED grid set, since the GCF operation does not produce a minlex grid.

For obvious reasons, when running this generation + reduction job for the case MI = 2, I would prefer to use Serg's MCF function, when it becomes available.

by **Serg** » Sat Dec 28, 2019 9:24 pm

Hi, all!
Mathimagics sent me a bunch of 1,115,151 randomly generated Latin Squares, and I processed all those LS by my canonicalisation program. Canonicalisation was done at the speed of about 7000 LS/sec. Cross-checks of my results and Mathimagics's results were done - all canonic forms coincided.
I checked statistics during this bunch of LS processing - how many pairs of rows must be tested when we are searching for LS Minlex Form using "UA sets accounting" method.

Code: Select all: Minimal MI histogram 0 880162 78.9 % 1 230801 20.7 % 2 4184 0.4 % 3 4 0.0 % Minimal MI multiplicity histogram 1 301465 2 255669 3 185861 4 126836 5 84262 6 55785 7 37364 8 24887 9 16273 10 10477 11 6437 12 4138 13 2321 14 1485 15 860 16 469 17 270 18 145 19 65 20 34 21 23 22 13 23 5 24 5 25 1 26 1 Processed 1115151 Latin Squares

Minimal MI multiplicity says us - how many pairs of rows must be checked during canonicalisation. For example, 301465 LS have the only row pair providing minimal MI for those LS. You can see from the histogram that 50% of LS grids have 1 or 2 row pairs only giving minimal MI for those LS grids. This observation ensure rather high performance of considered canonicalisation algorithm.

Serg

by **Mathimagics** » Sat Jan 04, 2020 8:08 am

Thanks to Serg's excellent work I have been able to use random LS grid generation in combination with his fast Minlex function to identify more band settings. After 7 billion grid generations, the band distribution looks like this:

Code: Select all: Index 0: R2 = 214365897, bands = 67 Index 1: R2 = 214367895, bands = 129 Index 2: R2 = 214537896, bands = 201 Index 3: R2 = 214567893, grids = 44638, bands = 138 Index 4: R2 = 214365897, grids = 1649, bands = 37 Index 5: R2 = 214367895, grids = 57, bands = 17 Index 6: R2 = 214537896, grids = 1, bands = 1 ------------ bands = 590

The random grid process has found mainly new band (R1-R3) settings for index 2, which we can now see has the most bands. Grid counts are shown for index values 3 - 6, as before, because we know all the ED grids for those bands. Index 2 shows up in the random grids 0.39% of the time, so the grid count for index 2 is probably close to 77 million.

New band finds are now very rare, so the list is very close to being complete, but I will keep the process running for now.

[EDIT #1] 08 Jan 2020: found 4 new index 2 bands, table updated

by **Mathimagics** » Mon Jan 13, 2020 9:25 pm

Band analysis has been completed, with the final steps being:

explicit enumeration of all minlex grids for Index 2 (this found 5 additional bands)

for Index values 0 and 1, testing all candidate bands that were not already known to have grids (from previous random grid generation). This found that there were no additional bands.

The upshot is that we now know that there are exactly 595 different minlex bands:

Code: Select all: Index 0: R2 = 214365897, bands = 67 Index 1: R2 = 214367895, bands = 129 Index 2: R2 = 214537896, grids = 66,769,988, bands = 206 Index 3: R2 = 214567893, grids = 44,638, bands = 138 Index 4: R2 = 214365897, grids = 1,649, bands = 37 Index 5: R2 = 214367895, grids = 57, bands = 17 Index 6: R2 = 214537896, grids = 1, bands = 1 ------------ bands = 595

Using an 80/20 split ratio, the grid count for index 0 is estimated at 15.36 billion, and for index 1 the estimate is 3.84 billion.

Thanks once again to Serg for his excellent coding of the LSMIN and LRMIN functions! 8-)

by **Serg** » Tue Jan 14, 2020 10:17 pm

Hi, Mathimagics!

Mathimagics wrote:Band analysis has been completed, with the final steps being:

explicit enumeration of all minlex grids for Index 2 (this found 5 additional bands)
for Index values 0 and 1, testing all candidate bands that were not already known to have grids (from previous random grid generation). This found that there were no additional bands.

The upshot is that we now know that there are exactly 595 different minlex bands:

Code: Select all
Index 0: R2 = 214365897, bands = 67 Index 1: R2 = 214367895, bands = 129 Index 2: R2 = 214537896, grids = 66,769,988, bands = 206 Index 3: R2 = 214567893, grids = 44,638, bands = 138 Index 4: R2 = 214365897, grids = 1,649, bands = 37 Index 5: R2 = 214367895, grids = 57, bands = 17 Index 6: R2 = 214537896, grids = 1, bands = 1 ------------ bands = 595

Using an 80/20 split ratio, the grid count for index 0 is estimated at 15.36 billion, and for index 1 the estimate is 3.84 billion.

Surprising results! Well done!
Please, describe more detailed your phrase:

Mathimagics wrote:for Index values 0 and 1, testing all candidate bands that were not already known to have grids (from previous random grid generation). This found that there were no additional bands.

I don't understand in what way you got exact number of bands from random search.

I am posting just another surprising data (to confuse everyone completely).
I calculated probabilities of all 8 MI (0,1,...,7) distribution for random pair of rows. Both rows in pair are random, but compatible to form r1/r2 rows of 9 x 9 LS. To generate such random pair of rows I used your file with 1,115,151 randomly generated 9 x 9 LS. I used r1 and r2 pair of rows only from each LS. Result is rather surprising.

Code: Select all: Minimality Index for r1/r2 rows pair histogram MI r1/r2 pairs 0 20,553 1 75,201 2 126,447 3 216,206 4 19,098 5 168,702 6 151,584 7 337,360 Totally 1,115,151 random pairs

So, you can see that rows pair with MI = 7 (no two-row UA) is the most probable event!
Why do we observe LS with minimal MI = 0 with probability about 80%?
The probability of having MI = 0 is 1.84 % only. But we are checking 108 pairs of rows (36 pairs x 3 paratopic transformations). Probability of MI > 0 is 98.16 % for one pair of rows. But probability of (rare) event, that all 108 pairs have MI > 0 is 13.5 %. So, probability, that at least one pair from 108 pairs has MI = 0 is about 86.5 %. These calculations are not exact, they are based on Monte-Carlo generation data.

Serg

[Edited. I corrected a typo.]

by **Mathimagics** » Wed Jan 15, 2020 5:12 am

Hi Serg,

How interesting is that MI distribution!

I thought that one way to cross-check your results would be to simply check all 9-permutations. With R1 = 123456789, and R2 = each permutation that doesn't have a fixed point (Cn = n), calculate the MI value. There are 133,496 possible R2 settings, and the distribution by MI (%) looks very much like yours:

Code: Select all: MI = 0: 2520 1.89% MI = 1: 9072 6.80% MI = 2: 15120 11.33% MI = 3: 25920 19.42% MI = 4: 2240 1.68% MI = 5: 20160 15.10% MI = 6: 18144 13.59% MI = 7: 40320 30.20% ------ 133496

I will give details of my assertion (that the band counts for MI = 0, 1 are known and complete) in a separate post.

by **Mathimagics** » Wed Jan 15, 2020 6:21 am

Serg wrote:I don't understand in what way you got exact number of bands from random search.

A quick review of the methodology:

LSMIN function returns minlex canonical form for a complete grid.

LRMIN function returns "relative minlex form" for a partial (Latin Rectangle) grid, given the first NR rows. Paratopic transformations are not considered.

For each MI value (0 to 6), we can use LRMIN to generate a list of "candidate bands". We fix R1 and R2 for the MI index value, and for each possible R3 setting we add the resulting band B = R1+R2+R3 to the list of candidates if, and only if, LRMIN(B) = B.

We can eliminate additional candidates by checking X = (R, D, C) conjugate of B. If LRMIN(X) < B we can eliminate B.

The table below shows the number of candidate bands for each MI.

Code: Select all: Candidate Known MI Bands Bands ED grids ------------------------------------ 0 104 67 1 281 129 2 364 206 66,769,988 3 438 138 44,638 4 54 37 1,649 5 99 17 57 6 58 1 1

The known bands and ED grid counts for MI = 2 to 7 were obtained from the ENUMBAND program. The program starts with a given band (rows 1-3), then adds all possible row 4 settings that maintain the relationship LRMIN(X) = X, where X = first 4 rows.

This "AddRow" process is applied recursively, and when we get to row 9 we have a complete grid G, and if LSMIN(G) = G we add this grid to the results list.

For MI = 0, and MI = 1, the "known bands" are those obtained from previous random grid sampling. For each candidate band that is not in the known band list, we used ENUMBAND (modified to return only the first grid found). No grids were found for any of these candidates.

by **dobrichev** » Wed Jan 15, 2020 6:36 am

Serg wrote:I am posting just another surprising data (to confuse everyone completely).

This result is the least confusing.

Serg wrote:So, you can see that rows pair with MI = 7 (no two-row UA) is the most probable event!

Is this phenomenon similar to this observation for sudoku grids in Minlex Form: Min and Max Lists / Chaining?

Michael A. Deverin wrote:In just over 70% of the cases, the lowest ranking pair was of rank = 4. This is especially fortuitous, since there is only one way to convert the row pair to minlex form.

by **Serg** » Wed Jan 15, 2020 11:08 am

Hi, Mladen!

dobrichev wrote:
Serg wrote:I am posting just another surprising data (to confuse everyone completely).

This result is the least confusing.

I am totally confused. :oops:

I was sure that my idea of LS canonicalisation speeding up is original, but it turns out, it is just another application of Michael Deverin's (aka holdout) ideas. I cannot say even that I was not aware of his ideas, because I participated the thread you cited. I forgot Michael's ideas completely! At that time (in 2011) I treated his findnings as curious observations, but didn't understand their practical sense.
The good news is that I understand now some basical principles of Michael Deverin's Sudoku canonicalisation algorithm.

dobrichev wrote:
Serg wrote:So, you can see that rows pair with MI = 7 (no two-row UA) is the most probable event!

Is this phenomenon similar to this observation for sudoku grids in Minlex Form: Min and Max Lists / Chaining?

I need some time to understand it.

Serg

by **Serg** » Wed Jan 15, 2020 11:45 pm

Hi, Mathimagics!

Mathimagics wrote:There are 133,496 possible R2 settings, and the distribution by MI (%) looks very much like yours:

Code: Select all
MI = 0: 2520 1.89% MI = 1: 9072 6.80% MI = 2: 15120 11.33% MI = 3: 25920 19.42% MI = 4: 2240 1.68% MI = 5: 20160 15.10% MI = 6: 18144 13.59% MI = 7: 40320 30.20% ------ 133496

Good point! Your calculation is exact and is very similar to my Monte-Carlo simulation.

Serg

The New Sudoku Players' Forum

Latin Squares and Latin Square puzzles

Re: Latin Squares and Latin Square puzzles

Re: Latin Squares and Latin Square puzzles

Band Classification

Re: Latin Squares and Latin Square puzzles

Re: Latin Squares and Latin Square puzzles

Re: Latin Squares and Latin Square puzzles

Re: Latin Squares and Latin Square puzzles

Band Distribution

Band Distribution (final)

Re: Band Distribution (final)

Re: Latin Squares and Latin Square puzzles

Re: Latin Squares and Latin Square puzzles

Re: Band Distribution (final)

Re: Band Distribution (final)

Re: Latin Squares and Latin Square puzzles