The New Sudoku Players' Forum

by **StrmCkr** » Mon Aug 13, 2018 5:29 pm

Code: Select all: 123456789 123456789 123456789 ------------- 123456789 123456789 123456789 ------------- 123456789 123456789 123456789

Is probably the worse case scenario for your generator as described. As every col has 8 wrongs and box's have 6 wrongs.
randomly recollecting 2 cells on a row until all cols/box's have no wrongs would take a very long time.

i would add some intelligence to the first 5 rows generated so that it can only use valid placements within the rules of the puzzle, then use the random on the last 4, and swapping function on the last 4.
reducing the generation process time down considerable.

by **David P Bird** » Mon Aug 13, 2018 6:00 pm

Hi HKociemba1,
I saw your first post but did not follow up on it as we are using different computer languages and platforms. Today I visited your website and was very impressed by your pure mathematics skills.

It is indeed surprising to me that your generation method firstly works, and secondly never gets trapped in an infinite loop. However, if we were working with the same systems, I have the feeling it would be slower than the one I've described.

Now come the interesting contrasts: Your generator works completely unintelligently, but your solver reduces the trial and error by applying basic solving methods first. In my case, it is very nearly the opposite way around! The only intelligence my solver uses is to prioritise the cells with the fewest candidates when selecting which ones to use for the trial assignments, and testing for naked single tests when candidates are eliminated. Originally, it tested for hidden singles, naked and hidden pairs, and box/line interactions, but disappointingly, these just slowed it down.

It also appears that your solver stops at the first solution it finds, and that uniqueness testing is a separate operation, whereas mine looks for further solutions (See my exchange with Dobrichev above).

This means that when testing clue sets it is possible to tell which cells appear to be solvable, and should not be needed as clues. Unless every possible solution is tested this will not be known for sure, and although some entries in list will possibly change, it should get longer as clues are added.

Another aim is to start by breaking the puzzle down into sub-puzzles consisting of the cells that contain different sets of digits. The clues needed to solve the sub-puzzles are then found randomly. When the sub-puzzles are combined, any extra clues are added as needed. Effectively this would control within limits the distribution of clues across the digits.

I don't know, but perhaps we can increase the percentage of generated puzzles that don't prove to be trivial.

David PB

by **blue** » Mon Aug 13, 2018 6:43 pm

Hello hkociemba1,

hkociemba1 wrote:3. Count the number n1 of "wrong" numbers in the two columns c1 and c2 and in the two blocks b1 and b2 the (r,c1) and (r,c2) are in.

How do you define the number of wrong numbers ?
For example: For c1, is it the number that match the value in (r,c1) ?

If (r,c1) and (r,c2) are in the same "block"/box ... then what ? ... count twice ?

---------------

hkociemba1 wrote:An easy way to randomly generate a solution grid with same probability for each grid is used in my solver/generator I recently posted in the Software section.

Generating random grids with a uniform probablity distribution, has been a topic of interest, at times, for certain members of the forum.
Most simple approaches, fall short.
David writes:

Your generator works completely unintelligently, ...

From what I've seen, generators that actually do produce a uniform distribution, often use strange and unexpected methods.
Are you basing the claim that I underlined, on a mathematical proof ?

Cheers,
Blue.

by **StrmCkr** » Mon Aug 13, 2018 7:38 pm

Code: Select all: 123456789 123456789 123456789 ------------- 123456789 123456789 123456789 ------------- 123456789 123456789 123456789

its selecting Row 2 col 1,9 for example

and checks the error counts for all digits 8 wrongs on the col 1,9 and 6 wrongs for the box 1,3
then swaps the positions of R2c19 and counts again which goes down to 7 wrongs on the col x2, and 5 wrongs for the box x2
and keeps the swap.

randomly selects a new row and positions to test.
repeat... ?? instead of checking them all?

Code: Select all: If (r,c1) and (r,c2) are in the same "block"/box ... then what ? ... count twice ?

count the box twice for sure..col # would be different.

by **coloin** » Mon Aug 13, 2018 7:54 pm

One way to ascertain if a random grid generation is truly random ..... is to analyze the solution grids to see if they have the expected number of U4s that are found in random grids from the 5e9 catalogue
I cant find the classic heated discussions between our honorable members though !

by **m_b_metcalf** » Mon Aug 13, 2018 8:06 pm

coloin wrote:One way to ascertain if a random grid generation is truly random ..... is to analyze the solution grids to see if they have the expected number of U4s that are found in random grids from the 5e9 catalogue
I cant find the classic heated discussions between our honorable members though !

Did you look here?

Elsewhere, JasonLion wrote:I have a preliminary static mirror of the Programmers Forum up and running at http://programmers.enjoysudoku.com/.

There are links that don't lead to anything useful all over the site. Most of them will come up not found, but some will lead to the spam site currently at www.setbb.com, as well as elsewhere.

by **hkociemba1** » Mon Aug 13, 2018 8:24 pm

David P Bird wrote:It is indeed surprising to me that your generation method firstly works, and secondly never gets trapped in an infinite loop.

Indeed I cannot prove that it always works but I never entcountered such a situation.

David P Bird wrote: Your generator works completely unintelligently, but your solver reduces the trial and error by applying basic solving methods first.

I am not clear what you mean by "solver" here. There is nothing intelligent in this part either. Starting from the solution grid I just remove a clue randomly and test if the puzzle still can be solved (uniquely). If this is the case another clue is removed etc. If the puzzle is minimal that is no further clue can be removed usually the result is a relatively simple puzzle which often is solvable with naked/hidden singles only even if the SAT-method is allowed during the clue elimination process. Indeed it might be worth to experiment with a more intelligent way where the clues which are removed depend on the number of candidates left for this cell.

blue wrote:How do you define the number of wrong numbers ?
For example: For c1, is it the number that match the value in (r,c1) ?

For the two columns c1 and c2 and boxes b1 and b2 I count for each number from 1..9 the number of occurences in this column or box. If a number does not occur exactly once the error count is incremented. An error of 1 obviously can never occur with this definition.

blue wrote:If (r,c1) and (r,c2) are in the same "block"/box ... then what ? ... count twice ?

Good question! I never thought about this but in my current implementation indeed I count twice! I will see if there is any change in the behaviour if I only count once in this case.

blue wrote:From what I've seen, generators that actually do produce a uniform distribution, often use strange and unexpected methods.
Are you basing the claim that I underlined, on a mathematical proof ?

You are right, I cannot claim this because I cannot prove it. When we reverse the process and take a valid grid and do lets say 10 random row cell swaps - I just cannot imagine that the error count distribution depends on the grid we start with.

by **David P Bird** » Mon Aug 13, 2018 9:45 pm

hkociemba1 wrote:
David P Bird wrote: Your generator works completely unintelligently, but your solver reduces the trial and error by applying basic solving methods first.
I am not clear what you mean by "solver" here. There is nothing intelligent in this part either. Starting from the solution grid I just remove a clue randomly and test if the puzzle still can be solved (uniquely). If this is the case another clue is removed etc. If the puzzle is minimal that is no further clue can be removed usually the result is a relatively simple puzzle which often is solvable with naked/hidden singles only even if the SAT-method is allowed during the clue elimination process. Indeed it might be worth to experiment with a more intelligent way where the clues which are removed depend on the number of candidates left for this cell.

Your response caused me to go back to you earlier post where

you wrote:To generate a puzzle from this grid I randomly remove clues and test if the reamaining puzzle still is solvable. Solvability is sufficient if the solving procedure uses only basic methods like hidden/naked singels, block-line interaction etc. With the SAT-solver as method included I also have to test for uniqueness of the solution.

Now I think I appreciate the cause of our misunderstanding. You are removing cells from the clue set until the puzzle can no longer be solved using basic methods. However, the probability is high that some of the clues at the end would be unnecessary if more advanced methods were used. The true test of whether a set of clues is minimal is when a brute force solver reports that there are multiple solutions when any of the clues is removed.

I could not believe that that was what you were doing, so took it that the simple eliminations were made before you ran the SAT-solver.

The New Sudoku Players' Forum

Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid

Re: Composing a Solution Grid