The New Sudoku Players' Forum

by **Red Ed** » Sat Dec 16, 2006 10:11 am

gsf wrote:if you just want a stream of grids try
Code: Select all
sudoku -g -T0x80 -f%v

Much better: 10.68.

Interesting that gsf's program has a very slight preference for creating size-4 unavoidables in near the top left of the grid, over creating them near the bottom right. But with speed like that, who's complaining? ( Actually, for the purposes of this thread, I am

)

by **gsf** » Sat Dec 16, 2006 11:33 am

Red Ed wrote:
gsf wrote:if you just want a stream of grids try
Code: Select all
sudoku -g -T0x80 -f%v
Much better: 10.68.

Interesting that gsf's program has a very slight preference for creating size-4 unavoidables in near the top left of the grid, over creating them near the bottom right. But with speed like that, who's complaining? ( Actually, for the purposes of this thread, I am )

the generator works in two stages
first stage generates a valid solution grid (81 clues)
it starts with a full pencilmark grid and selects a random cell with >1 candidates
and then selects a random candidate in that cell and makes it a clue
pencilmarks are propagated and invalid grids are tossed
at the 27th clue assignment the grid is solved with unique solution check enabled
unique solutions are sent to the second stage, otherwise the first stage is restarted

second stage fills an empty grid using clues from the solution grid
no validation is needed because the clues are taken from a valid solution grid
after 16 clues are filled (might as well do a < 17 clue stab) the singles backtrack solver
is used to check for unique solution

so at least some of the bias is in stage 1, in at least 2 places
the magic "27" was chosen to minimize the number of (heavyweight) solver calls
vs the number of (lightweight and amortized) pencilmark validations

its also goofy to throw out multi-solution puzzles in a solution grid search
since two valid solution grids are tossed -- it should take the first solution
as a gift and pass that on to the second stage
I'll rectify that but I won;t be able to update www until monday

as for upper-left vs lower right bias, isn't that an automorphism wash?
i.e., the grids could be randomly permuted to eliminate that, no?

the "random" selections are based on a 64 bit PRNG (cycle 2^64)

I do have a question on the difference between -T0x80 and not
Red Ed, are you measuring bias on solution grids or puzzle grids?
I had assumed you were measuring solution grid bias
if so then here's and experiment for you to crunch:

with -T0x80 valid solution grids (no empty cells) are generated
without -T0x80 valid puzzles (with empty cells) are generated
if you make two runs with the same PRNG seed N (-rN)
the non-T0x80 run will have the same solution grids as the -T0x80 run
(you can use -nN to limit the run to N puzzles)

by **Red Ed** » Sat Dec 16, 2006 10:39 pm

gsf wrote:with -T0x80 valid solution grids (no empty cells) are generated
without -T0x80 valid puzzles (with empty cells) are generated

Ah, I can't believe I didn't notice all the empty cells (d'oh!) ... so my originally-reported "60" for your generator is completely bogus; sorry about that!

Thanks for the explanation of how your grid generator works. It's just stage one that I'm interested in, as I'm looking at bias in solution grids rather than puzzles. From the point of view of bias, I expect that your existing strategy is better than your proposal to accept the first of multiple solutions when the 27 clues are not uniquely solvable. I guess I'll test it if/when you make the change.

Back to size-4 unavoidables: the average number is exactly 320784513273 / 27704267971, or roughly 11.58 per grid.

I can get precise probabilities / expected values for any pattern that can be written into >1 box of a single band. Here are just a few examples that give gsf's program a little trouble. Sorry, gsf, I am not picking on you! -- it's just that your generator is much faster and easier to test than my own.

Code: Select all: { 1,0,0, 2,0,0, 0,0,0 }, /* 8.535 per grid */ { 0,2,0, 3,0,0, 0,0,0 }, { 0,0,3, 1,0,0, 0,0,0 } { 1,2,3, 0,0,0, 0,0,0 }, /* 0.647 per grid */ { 0,0,0, 1,2,3, 0,0,0 }, { 0,0,0, 0,0,0, 1,2,3 } { 1,2,0, 0,0,0, 0,0,0 }, /* 3.889 per grid */ { 3,4,0, 1,2,0, 0,0,0 }, { 0,0,0, 3,4,0, 0,0,0 } { 1,2,0, 3,4,0, 0,0,0 }, /* 7.710 per grid */ { 3,4,0, 1,0,0, 0,0,0 }, { 0,0,0, 2,0,0, 0,0,0 }

But now I've had a better idea. Perhaps the right way to do this is just to count how many of each of the 416 band-isomorphism classes are represented in each grid and report a chi^2 statistic. Right, give me a few days to find time to code that up ...

by **gsf** » Sat Dec 16, 2006 11:20 pm

Red Ed wrote:
gsf wrote:with -T0x80 valid solution grids (no empty cells) are generated
without -T0x80 valid puzzles (with empty cells) are generated
Ah, I can't believe I didn't notice all the empty cells (d'oh!) ... so my originally-reported "60" for your generator is completely bogus; sorry about that!

with so many bits bands pnrgs flying around its easy to miss anything
such as ...

Red Ed wrote:From the point of view of bias, I expect that your existing strategy is better than your proposal to accept the first of multiple solutions when the 27 clues are not uniquely solvable.

on closer code inspection it turns out the first solution is chosen
also note that <27 of those clues may lead to exactly 1 solution, I just don't check until 27
27 based on epirical data from a few random runs

so are your bias calculations positional or pure counting?
if positional then would a random permutation of a grid stream change the stats?

Red Ed wrote:Sorry, gsf, I am not picking on you!

pleased to be picked on
some of the best sw/algs come from a meeting of minds

by **JPF** » Sun Dec 17, 2006 10:26 pm

Red Ed wrote:Back to size-4 unavoidables: the average number is exactly 320784513273 / 27704267971, or roughly 11.58 per grid.

Excellent , but where are these numbers coming from ?
Any clues to understand how you calculated this exact average number ?

What is the real maximum of X ?
after 3.8 x 10^6 simulations, I got Max (X) = 29 ; p= 0.5E-06
(My initial Max (X) = 36 was incorrect due to the bug).

The relative error with the average given by my biased grids (10.99) is -5%.
It's still not clear to me what is due to the numbers generator itself and what is due to the way I'm building random grids...

JPF

by **Red Ed** » Mon Dec 18, 2006 2:51 pm

JPF wrote:
Red Ed wrote:Back to size-4 unavoidables: the average number is exactly 320784513273 / 27704267971, or roughly 11.58 per grid.

Excellent , but where are these numbers coming from ?

Let p(X,g) be the probability of finding X in a particular (fixed) position within a band represented by "gangster" g. These aren't too hard to work out. Then p(X), the probability of finding X in a particular position in any grid, is just the average of these, weighting p(X,g) proportional to the number of extensions of gangster g to a full grid (i.e. not quite uniformly). You'll recognise the denominator, 27704267971, as the big prime in the 6.67e21 number. Finally, E(X) = 486 * p(X).

I don't think I was right to suggest (for the purposes of identifying bias in solution grid generators) doing a chi^2 stat on the instance counts of band-isomorphism classes, for two reasons: (1) some of those instance counts will be very low, which is bad for chi^2, (2) to my taste, bias in smaller clue patterns is somehow more meaningful than bias in big clue patterns, i.e. whole bands.

So, I'm going to go back to testing small clue patterns that fit into a single band. I can do this extremely efficiently now that I've written a function to map a band to its isomorphism class index (0 to 415). But one thing I need to do first is make a catalogue of all essentially-different single-band locally minimal patterns on <13 clues (say). By locally minimal, I mean that no clue is implied by the others. A necessary condition for local minimality is that no cell that contains the only instance of some value should be able to "see" cells that contain all other values in the pattern. So, for example, this isn't locally minimal ...

Code: Select all: . . 1|. 2 .|. . . . . 3|. . .|. . . . . .|. . .|. 2 .

... because the '1' cell can see a '2' cell and a '3' cell. If I get too many locally minimal patterns then I will need to find a way of choosing the "best" ones, e.g. those with high variance across the bands. This is all unknown territory to me and, whilst I know the road isn't leading anywhere very interesting, the journey is fun!

by **gsf** » Tue Dec 19, 2006 5:12 am

Red Ed wrote:Thanks for the explanation of how your grid generator works. It's just stage one that I'm interested in, as I'm looking at bias in solution grids rather than puzzles. From the point of view of bias, I expect that your existing strategy is better than your proposal to accept the first of multiple solutions when the 27 clues are not uniquely solvable. I guess I'll test it if/when you make the change.

I've posted new executables
I changed the -g option to take an optional value so that all generation parameters will be in one spot
see if this provides a better set of N solution grids

Code: Select all: sudoku -gs -f%v -nN

(-gs to generate solution grids)

by **coloin** » Tue Dec 19, 2006 11:19 pm

Red Ed wrote:Back to size-4 unavoidables: the average number is exactly 320784513273 / 27704267971, or roughly 11.58 per grid.

Nice to get an exact number for once !

JPF wrote:What is the real maximum of X ?
after 3.8 x 10^6 simulations, I got Max (X) = 29 ; p= 0.5E-06

Ah hah - I think you missed the thread with RW Structures of the solution grid [page 4 onwards]

You did well to get a 29 though.....possibly worth posting for RW and myself to analyse.

You already know the grids ! - selected previously because of their properties

Max X = 30, the Pt [Platinum trellis], [78 2-digit unavoidables, atomic number of Platinum =78] [There can theroretically be no more than 30 - EDIT -not true]
Min X = 0 - grids RV et al [inc SFB]..... [also MC by default]

Red Ed cleverly teased these grids out of hiding !

C

by **JPF** » Tue Dec 19, 2006 11:41 pm

coloin wrote:
JPF wrote:What is the real maximum of X ?
after 3.8 x 10^6 simulations, I got Max (X) = 29 ; p= 0.5E-06

Ah hah - I think you missed the thread with RW Structures of the solution grid

You did well to get a 29 though.....possibly worth posting for RW and myself to analyse.

Right

Here are 3 grids with 29 size-4 unavoidables :

Code: Select all: 243658791587129346169473528921364857678592134354781962412937685896215473735846219 586324179149678523273159864418732956967815432352946718695287341834561297721493685 246193857951487236387652491763918542592346718814725369625839174478561923139274685

I got 536 grids with no size-4 unavoidable.

JPF

by **coloin** » Wed Dec 20, 2006 12:08 am

coloin wrote:[There can theroretically be no more than 30]

A quick double correction here [I know we are off topic]

Our chameleon favorite the dukuso15

Code: Select all: 123568479 864791352 957243681 218657934 536489127 749312865 391825746 472136598 685974213 {11,12,41,42,} {26,29,34,39,65,66,73,75,83,84,} {57,58,97,98,} {11,13,71,73,} {26,27,36,39,42,48,52,57,98,99,} {64,65,84,85,} {11,16,21,26,} {38,39,42,43,73,74,84,89,92,98,} {55,57,65,67,} {11,19,31,39,} {25,26,42,47,56,57,63,65,72,73,} {84,88,94,98,} {12,13,41,48,52,58,71,75,83,85,} {27,29,97,99,} {34,36,64,66,} {12,14,32,34,} {28,29,41,45,51,58,66,69,75,76,} {83,87,93,97,} {12,18,24,29,33,34,58,59,82,83,} {41,46,61,66,} {75,77,95,97,} {13,17,23,27,} {35,36,48,49,71,78,81,85,96,99,} {52,54,62,64,} {13,15,22,27,36,37,52,53,85,86,} {44,48,64,68,} {71,79,91,99,} {15,17,35,37,} {22,23,44,49,53,54,62,68,78,79,} {81,86,91,96,} {17,18,77,78,} {23,24,33,35,46,49,54,59,95,96,} {61,62,81,82,} {17,19,47,49,} {23,25,31,35,62,63,72,78,81,88,} {54,56,94,96,} {14,15,44,45,} {22,28,32,37,68,69,76,79,86,87,} {51,53,91,93,} {14,18,24,28,} {32,33,45,46,76,77,82,87,93,95,} {51,59,61,69,} {14,16,74,76,} {21,28,32,38,43,45,51,55,92,93,} {67,69,87,89,} {15,16,43,44,53,55,74,79,86,89,} {21,22,91,92,} {37,38,67,68,} {18,19,46,47,56,59,72,77,82,88,} {24,25,94,95,} {31,33,61,63,} {16,19,21,25,31,38,55,56,88,89,} {43,47,63,67,} {72,74,92,94,}

It has 36 ! That explains a lot. I cant believe I overlooked this feature !

Wheras the PT is entirely different [only 18 4-set unavoidables] but it has 30 4-perms and 6 8-perms gives 78 2-digit unavoidables. [according to RW's terminology]

Code: Select all: 123456789 457189326 689327154 231645897 745891632 896732541 318264975 574918263 962573418 {11,12,24,28,35,37,41,43,72,74,85,87,93,98,} {56,59,66,69,} {11,13,42,43,71,72,} {24,27,34,37,} {56,58,65,69,85,89,96,98,} {11,14,21,24,} {37,39,68,69,97,98,} {43,45,52,56,72,76,83,85,} {11,15,22,24,37,38,67,69,72,79,81,85,94,98,} {43,46,53,56,} {11,16,24,29,31,37,43,44,56,57,63,69,} {72,75,85,88,92,98,} {11,17,23,24,36,37,43,49,51,56,64,69,} {72,78,82,85,95,98,} {11,18,32,37,43,47,61,69,72,73,98,99,} {24,25,54,56,85,86,} {11,19,33,37,43,48,62,69,72,77,91,98,} {24,26,55,56,84,85,} {12,13,34,35,41,42,65,66,71,74,93,96,} {27,28,58,59,87,89,} {12,14,21,28,35,39,41,45,52,59,66,68,74,76,} {83,87,93,97,} {12,15,22,28,35,38,} {41,46,53,59,66,67,74,79,81,87,93,94,} {12,16,31,35,41,44,63,66,74,75,92,93,} {28,29,57,59,87,88,} {12,17,23,28,35,36,64,66,74,78,82,87,93,95,} {41,49,51,59,} {12,18,25,28,32,35,} {41,47,54,59,61,66,73,74,86,87,93,99,} {12,19,26,28,33,35,41,48,55,59,62,66,91,93,} {74,77,84,87,} {13,14,21,27,34,39,71,76,83,89,96,97,} {42,45,52,58,65,68,} {13,15,22,27,34,38,42,46,53,58,65,67,94,96,} {71,79,81,89,} {13,16,31,34,42,44,63,65,71,75,92,96,} {27,29,57,58,88,89,} {13,17,23,27,} {34,36,64,65,95,96,} {42,49,51,58,71,78,82,89,} {13,18,25,27,32,34,42,47,54,58,61,65,71,73,} {86,89,96,99,} {13,19,26,27,33,34,71,77,84,89,91,96,} {42,48,55,58,62,65,} {14,15,38,39,45,46,67,68,76,79,94,97,} {21,22,52,53,81,83,} {14,16,44,45,75,76,} {21,29,31,39,} {52,57,63,68,83,88,92,97,} {14,17,36,39,45,49,64,68,76,78,95,97,} {21,23,51,52,82,83,} {14,18,21,25,32,39,45,47,52,54,61,68,97,99,} {73,76,83,86,} {14,19,21,26,33,39,76,77,83,84,91,97,} {45,48,52,55,62,68,} {15,16,22,29,31,38,44,46,75,79,81,88,92,94,} {53,57,63,67,} {15,17,36,38,46,49,64,67,78,79,94,95,} {22,23,51,53,81,82,} {15,18,22,25,32,38,} {46,47,53,54,61,67,73,79,81,86,94,99,} {15,19,22,26,33,38,46,48,53,55,62,67,77,79,} {81,84,91,94,} {16,17,23,29,31,36,44,49,51,57,63,64,} {75,78,82,88,92,95,} {16,18,25,29,31,32,61,63,73,75,86,88,92,99,} {44,47,54,57,} {16,19,26,29,} {31,33,62,63,91,92,} {44,48,55,57,75,77,84,88,} {17,18,23,25,32,36,47,49,73,78,82,86,95,99,} {51,54,61,64,} {17,19,48,49,77,78,} {23,26,33,36,} {51,55,62,64,82,84,91,95,} {18,19,32,33,47,48,61,62,73,77,91,99,} {25,26,54,55,84,86,}

Still a remarkable grid - each clue has 8 2-digit unavoidables.

C

by **JPF** » Wed Dec 20, 2006 12:38 am

So, Max (X) = 36 ?
Why ?

JPF

by **coloin** » Wed Dec 20, 2006 12:59 am

I think I was assuming [grrr.] that the PT had the most - and I mixed up the 4-perm value with the 4-set unavoidable value.

I will try and dig out how dukuso found the grid

Ah yes......
He searched the t-invarients which had a high number [6] of 4-sets in a band/gangster.
He then found a grid [dukuso15] which was wholly composed of this all of this band [3-horizontal and 3 -vertical]. It still had only an MCN of 15, but it still had more 4-set unavoidables than the other grids found to be MCN 16. An MCN grid of 17 was not found.

JPF wrote:So, Max (X) = 36 ?

Whether there is a grid with more - I dont know...but I know someone who might ! It is possible to have a band with 7 4-set unavoidables.

Code: Select all: +---+---+---+ |123|456|789| |457|189|326| |869|732|415| +---+---+---+ |...|...|...| |...|...|...| |...|...|...| +---+---+---+ |...|...|...| |...|...|...| |...|...|...| +---+---+---+ 7 4-set unavoidables

Anyhow it certainly explains the chameleons found and the high number [22 at last count] of 35 clue minimals I [eventually ] found after the masses of 34s !

C

by **Red Ed** » Thu Dec 21, 2006 4:38 pm

Still enjoyably off-topic, here are all solutions, up to isomorphism, with 36 size-4 unavoidables (henceforth, "U4s"):

Code: Select all: 123456789457189236698372514215768943389541672764293851572634198831927465946815327 123456789457189236698372514269541873314798652785263941541627398876934125932815467 123456789457189236869327415296743851378512964541698327682971543734265198915834672 123456789457189236968372514215694378386527491794813625572941863649738152831265947 123456789457189236968372514215734968386921457794865123572698341649213875831547692 123456789457189236968372514219547368586213497734698125372865941645921873891734652 123456789457189236968372514219634857586927143734815962372541698645798321891263475 123456789457189236968372514291738465374265198685941327546813972732694851819527643 123456789457189326869372514214965873635847192978213465381624957546798231792531648 ***

The starred one is, essentially, dukuso15. There are no solutions with 37+ U4s.

These grids came from an exhaustive search for solutions with >=36 U4s, which worked as follows. Define a k-rookery to be a grid with all instances of k different digits filled in (I think that's the usual definition). Note that since 9_choose_2 = 36, any grid with >=36 U4s must contain a k-rookery that has >= k_choose_2 U4s. So, pick k=4 and loop over all essentially-different 4-rookeries that have >=6 U4s; for each, extend to a solution grid in all possible ways and count the number of U4s in each case.

btw, dukuso's intuition was good in searching for automorphic grids (actually just transpose-invariant), since all of the solutions listed above have a non-trivial automorphism (not always transpose) as well.

by **coloin** » Fri Dec 22, 2006 6:48 pm

Excellent.

Your penultimate grid

Code: Select all: 123456789457189236968372514291738465374265198685941327546813972732694851819527643

appears to be unique [for now] in that it does not have any U6s.

I have opened up furthur discussion in the structures thread.......
I think It looks promising for chameleons and maximum clues......

Back on thread though - I am pleased to report that the 10 random grids I suggested a study with here have an average number of U4s of 10.9 !

I hope to send JPF a rather larger sample of grids generated by dukuso's program - to see how near/far off the mark his program is.

Edit
Here are the results and the distribution of the number of U4s for dukusos suexg program, calculated from 500000 grids by JPF.

Code: Select all: E(X) 10.67 s(X) 3.36 Max(X) 28 Min(X) 0 0 57 1 287 2 1230 3 3472 4 8037 5 15262 6 24999 7 36356 8 46262 9 54267 10 57875 11 56726 12 51181 13 43013 14 33950 15 24711 16 17277 17 10752 18 6738 19 3884 20 1967 21 917 22 476 23 194 24 71 25 31 26 7 27 0 28 1 29 0 30 0 31 0 32 0 33 0 34 0 35 0 36 0

So a bias against high numbers....
C

by **Red Ed** » Sun Dec 31, 2006 7:32 pm

gsf wrote:I've posted new executables
I changed the -g option to take an optional value so that all generation parameters will be in one spot
see if this provides a better set of N solution grids

That change doesn't really seem to have made much difference, bias-wise. A test suggests that, if it's changed at all, the bias has got very slightly worse.

OK, what do I really mean by that? My testing currently works as follows. I've got a library of ~3000 single-band clue patterns chosen sort-of randomly. For each pattern, p, I know the mean, mean(p), and an upper bound, sd(p), on the standard deviation of the number of those patterns that exist in a perfectly randomly chosen solution grid. Given a test file of n (default: 1000000) solution grids generated by some algorithm, I count the number of patterns of each type, count(p). The %bias for pattern p is then 100 * ( count(p)/(n*mean(p)) - 1 ) and the Z-score, i.e. the corresponding point of the Normal distribution, is ( count(p)/sqrt(n) - sqrt(n)*mean(p) ) / sd(p).

The Z-score tells you how confidently you can reject the hypothesis that the algorithm being tested is unbiased, and it is this that the program reports -- or rather, just the top 20 most significantly-biased patterns. I'll include an option to sort by the bias itself rather than Z-score as well in future.

Here, for example, is what an early version of the code says about your new executable:

Code: Select all: .....2..8..............8..2 : score -104.158683 .87...1.22.........6124875. : score -102.226264 .34.9.87..5......289..35.6. : score -101.933796 6.2..4....79..8.65.58..1..7 : score -100.648167 3.1..975..4..6...3.97.2.1.. : score -100.332145 318..79...4......77.59.3.2. : score -99.924485 218.37...39..24..7.......8. : score -99.202956 ..2.9.1.....4....2..9.21... : score -98.538524 6...8..7...87.2.1.7..3...6. : score -97.511292 .7.93.2...82.6.....9.7.1..3 : score -97.164515 .9..3.586.34.172..7.2...4.. : score -96.701108 .14..7...5....268.29.6.5.7. : score -96.171057 .1.5...6..6.....1.....6.2.5 : score -96.112837 578.613...3....764.2..9..1. : score -95.832563 .8..745.15...1.2.797.6...34 : score -95.559644 7132..65......8.41..65.9.2. : score -95.264881 8..1...4.19.8..6.23.2.9...7 : score -94.503369 1.3....869..4..21.2..7..94. : score -93.969358 ..73...454...7..6..351..9.7 : score -93.828540 ..5.2.4.11...453...7.91..82 : score -93.523361

Not exactly user-friendly at this stage of development, but what it's saying is that the size-4 unavoidable ...

Code: Select all: ...|..2|..8 ...|...|... ...|..8|..2

... occurs far less frequently than expected. If I were to give a single-figure summary of that, I would say that your new executable "scored" -104. An older version of your sudoku.exe scored -104 (to the nearest int) as well. The sign indicates that (in this case) fewer patterns were observed than expected; a score of +104 would be an equally bad indication that more were seen.

I have assembled training sets of 1000000 solution grids from several different algorithms. Here are the headline Z-scores for each. Remember, it's the absolute value that matters; the smaller, the better.

+1.2 : my unbiased generator
-58 : anttiahola's B159 algorithm
-61 : Soultaker's "simulated annealing" algorithm
-104 : an old version of gsf's sudoku.exe
-104 : gsf's current sudoku.exe
-105 : solutions derived from dukuso's suexg
+133 : an old generator of my own (oh dear!)

In all cases bar the first, the size-4 unavoidable was at or near the top of the most list of most significantly biased patterns.

The score for the unbiased generator is perhaps surprisingly low given the number of patterns tested (the max of ~3000 random Z-values is typically ~3.5). The reasons for this are twofold: (i), the standard deviation of each pattern score is known only by an upper bound; and (ii), the pattern scores themselves are not independent. Since the effect of the approximations is conservative, I don't plan on trying to fix this.

The New Sudoku Players' Forum

Unbiased grid generation