## The hardest sudokus

Everything about Sudoku that doesn't fit in one of the other sections
merallas wrote:Thanks, it's more clear now. Looks quite complicate to me to find all those patterns by a program. However, it is still difficult for me to make a difference in guessing and non-guessing techniques. Also in the so called logical methods assumption are made in my opinion.

in this case there is a difference between something existing vs determining that something exists

a swordfish may exist at a position
determining that a swordfish exists may use techniques that look like guessing,
but that's separate from the swordfish existing
the same goes for cycles and tuples and locked sets

when a guess is made on a candidate, what structure/feature/pattern is being discerned?
the results of a guess are { contradiction solution indeterminate }
not easy to associate those with pattern/mathematic entities like fish and cycles and sets
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Nice explanation, thanks

meralls
merallas

Posts: 39
Joined: 26 September 2006

merallas wrote:Ok, I understand. On the other hand I also see a challence in optimizing the so called brute force techniques.

Another point which have my interest is the clones of sudoku's. With a rather simple program clones can be generated. Of a particular sudoku, I have generated more than 14000! clones. One of them is shown below.

007004003200900000080010060009003004500100300060070080000005007000700100000020050

Is it realy possible to find puzzle, from which this clone is made.

Do you have a program to check that 2 sudokus are similar problem if you remove the symetries?
Ideally I would like to give a list of sudoku and remove the duplicates?
In your case the program should take the long list and return 1 line

I think I can code that but I dont want to reinvent the square wheel
thanks
dml

Posts: 34
Joined: 12 November 2006

gsf wrote:
tarek wrote:The long awaited update.....great.....

I've noticed that some older ratings have changed (slashed down ) when I re-ran them through this updated version, was that intentional ???

on nov 6 in this thread RW found a bug in my solver's proposition constraint logic
fixing that bug had the effect of lowering the ratings for puzzles previously affected by the bug
also, to accomodate the latest hardest some of the 993?? and 994?? ratings may have changed

stop finding harder puzzles and the ratings will stabilize

Now the distribution of values is less uniform than previously
we would expect finding much more 994xx than 995xx
I find 10 times more 995xx than 994xx
Probably others observed that
Also many values are like attractors : in particular around 9994X
I have no 9993X,9992X,9991X but plenty 9994X !!

Maybe this is normal, just a chaotic behavior
Any idea on that?

From a ranking program I would expect some logarithmic behavior
But I understand the challenge

If I find hard sudoku , what value should I reach to publish in this list
above 99800 or 99900 or 99950 or 99960
for Explainer above 10 or 10.5 or 10.6 or 11.0?
or combination of 2 values

can somebody explain in english terms what is expected from a sudoku to be considered very hard
For exemple: if I have 20 contradictory chains of length 20 , is it harder of 1 contradictory chain of length 60?
Probably statistics could be a good evaluator of the difficulties to find chain of length 30 vs length 50
I would be interested by the equation used to rank the sudoku
Should not all developers collaborate on this equation , I assume no single individual can find the best ranking program, and it will be even hard for a team of experts. If the implementation can be hidden I consider that
the ranking equation should be open and voted by others experts.

thanks
dml

Posts: 34
Joined: 12 November 2006

dml wrote:can somebody explain in english terms what is expected from a sudoku to be considered very hard
For exemple: if I have 20 contradictory chains of length 20 , is it harder of 1 contradictory chain of length 60?
Probably statistics could be a good evaluator of the difficulties to find chain of length 30 vs length 50
I would be interested by the equation used to rank the sudoku
Should not all developers collaborate on this equation , I assume no single individual can find the best ranking program, and it will be even hard for a team of experts. If the implementation can be hidden I consider that
the ranking equation should be open and voted by others experts.

the state of rating collaboration at this point is experimental
an ongoing search for characteristics/statistics that might be used in a rating equation
we can't work on the equation yet because we're still looking for characteristics
success of the collaboration will be witnessed by convergence of ratings for the hardest puzzle catalog

it even took a while to get convergence for the inferior and superior rating threads
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

dml,

i already wrote about different kinds of rating here.You can see, that we cannot find a satisfying formula, its too much matter of taste, despite of all the difficulties to evaluate a rating for a given one.
Of course all suggestions for other formulas are welcome.
ravel

Posts: 998
Joined: 21 February 2006

tarek wrote:I'm not sure if anyone would be interested to know that a new update of the Sudoku explainer has been released .....

Try not to jam the site

tarek

Great thanks this code permit to explore more sudokus

Could you add an option to have the description of progress like in interactive mode
The idea is to run a huge number of sudokus to make statistics studies
and the find most beautifull steps

Thanks
dml

Posts: 34
Joined: 12 November 2006

Karlson wrote:These are the Top20 Sudokus with the new hardest option:
99960 001002000030040050600700800006000007010000030900000600007001008040030020000500900

Regards,
Karlson

if I made no mistake this sudoku called "dml1" is rated 99995

4....9....3..1..2...67.......1.....4.5.2...7.8.....6.......4..8.7..3..1....5..9..

same another format

400009000
030010020
006700000
001000004
050200070
800000600
000004008
070030010
000500900

Not tested with Explainer but should be high also
By the way I have also some sudokus rated 10.7 by Explainer but rated only 99955 by gsf
I hope to find one rated 1st by the 2 programs

If this one is valid and if you are interested by others sudokus I have many >=99960 and another 99995
Is there a chance I find one >=10000 ?
dml

Posts: 34
Joined: 12 November 2006

tarek wrote:I'm not sure if anyone would be interested to know that a new update of the Sudoku explainer has been released .....
tarek

this one fails to be solved logically by latest Explainer

4....9....3..1..2...67.......1.....4.5.2...7.8.....6.......4..8.7..3..1....5..9..

same another format

400009000
030010020
006700000
001000004
050200070
800000600
000004008
070030010
000500900

Should interest you
Do you have many that are not solvable?

At least we can see there are many beautiful steps
In the rating you seems to take into account only the hardest step

I would consider the product of steps to find the hardest element of the sudoku maybe a better evaluation
What do you think
Probably we need to sum the logs of the solving steps to avoid huge numbers
Also I am not sure how you consider the length of the chains in the cost of the step, this is for sure an important element in complexity if there is no shorter similar step
We can also consider the variety of methods involved in the resolution
Probably huge scale statistics would help to refine the equation, probably would require a C++ version of your program, without graphics should be easy to port

check also this one : rated also 99995 by gsf
700003000
040710020
005900000
001000007
060200090
800000500
000007008
090040010
000600300
Last edited by dml on Fri Dec 01, 2006 6:16 pm, edited 3 times in total.
dml

Posts: 34
Joined: 12 November 2006

dml wrote:if I made no mistake this sudoku called "dml1" is rated 99995

this one fails to be solved logically by latest Explainer

4....9....3..1..2...67.......1.....4.5.2...7.8.....6.......4..8.7..3..1....5..9..

You're right about the gsf rating, very impressive! However, with SE version 1.2 I get ER 10.6:
56 x Hidden Single 3 x Naked Single 9 x Pointing 1 x Claiming 3 x Naked Pair 1 x X-Wing 1 x Swordfish 3 x XY-Wing 1 x Forcing X-Chain 1 x Bidirectional Y-Cycle 3 x Bidirectional Cycle 4 x Forcing Chain 5 x Cell Forcing Chains 5 x Region Forcing Chains 26 x Dynamic Contradiction Forcing Chains 4 x Dynamic Region Forcing Chains 10 x Dynamic Contradiction Forcing Chains (+) 3 x Dynamic Contradiction Forcing Chains (+ Forcing Chains)

In fact, if there isn't a bug in either program, then all puzzles that gsf's program can rate at the moment can also be rated by SE. The nested forcing chains of SE can easily handle all propositions done by gsf's program.

Anyway, very nice puzzle, could be a new leader!

RW

PS. tarek is not the creator of Sudoku Explainer. It is Nicolas Juillerat, who is not a regular on this forum.
RW
2010 Supporter

Posts: 1000
Joined: 16 March 2006

dml wrote:check also this one : rated also 99995 by gsf
700003000
040710020
005900000
001000007
060200090
800000500
000007008
090040010

That's the same puzzle as "dml1", but with one redundant clue added in r2c4...

RW
RW
2010 Supporter

Posts: 1000
Joined: 16 March 2006

dml wrote:Do you have a program to check that 2 sudokus are similar problem if you remove the symetries?
Ideally I would like to give a list of sudoku and remove the duplicates?
In your case the program should take the long list and return 1 line

I think I can code that but I dont want to reinvent the square wheel
thanks

dml,
In order to test my solving program I have written a small program in qbasic that can scramble sudokus. I was wondering if my solving program solves srambled versions in a similar way.
In a section of the scrambling program a new generated sudoku version is compared with previous generated versions. In this case identical sudokus and sodokus with the same empty field pattern are not added to the list. This is a very simple subroutine. However, problem is that with increasing amount of sudokus it becomes more and more time consuming to check identicals. So if my program is suitable for your application depends on the size of your list. I am thinking in the terms of 15000 items. If you want to analyse much larger lists another language than qbasic is required.

merallas
merallas

Posts: 39
Joined: 26 September 2006

RW wrote:
dml wrote:check also this one : rated also 99995 by gsf
700003000
040710020
005900000
001000007
060200090
800000500
000007008
090040010

That's the same puzzle as "dml1", but with one redundant clue added in r2c4...

RW

But are they the same sudoku?
That was my question before , how can we recognize 2 similar sudokus are same problem?
Here for me they are 2 very similar but different sudokus
In fact their resolution should be logically different , even a minor difference make them different sudoku, like 1 single bit difference in a large number with 1 google digits make them 2 different integers

If we have 2 identical sudokus, I think they should be resolved by the same path by the ranking programs, logically there should be only 1 path to single sudoku after we remove the clones obtained by exchanging the rows,columns,rotations,symmetries,...

I think those 2 are very close but not identical sudokus, due to the method used to build them and similar shape, you have similar path to resolution, but not identical path, identical path would have same steps and remove same values in same order.
Here we have same steps but remove different values in different order.
If not this prove the ranking are not taking the shortest optimal path to resolution they should take. If 2 path are very close they should be explored in parallel until you observe 1 step simpler even after many steps
Ranking is probably an unsolvable NP complete problem

To recognize that this 2 sudokus dml1 and dml2 are not same problem this is easy only 1 square contains 4 elements and they are different then that's 2 different problems. You need to check they are not a digit rotation that would also make a false new sudoku

This open many questions , on relationship of sudokus and optimal ranking
dml

Posts: 34
Joined: 12 November 2006

dml,
In order to test my solving program I have written a small program in qbasic that can scramble sudokus. I was wondering if my solving program solves srambled versions in a similar way.
In a section of the scrambling program a new generated sudoku version is compared with previous generated versions. In this case identical sudokus and sodokus with the same empty field pattern are not added to the list. This is a very simple subroutine. However, problem is that with increasing amount of sudokus it becomes more and more time consuming to check identicals. So if my program is suitable for your application depends on the size of your list. I am thinking in the terms of 15000 items. If you want to analyse much larger lists another language than qbasic is required.

merallas[/quote]

check that dml1 and dml2 are different sudokus
The objective is to give short list 100 or 1000 sudokus with very few identical sudokus and to remove genetic clones
But to test your code you need a program that can generate from 1 sudoku all its brothers and sisters
I think that the easy way is to sort the sudoku at the end you need to finish with only 1 form
example: top left square need to contains the most digits
if 2 square have 4 digits like '7321' and '7341' sorted then the top square should contain the highest value
etc ....
with many others rules you should finish that all clones after sorting will give a single value
then take the scrambling program and check all forms finish in same position, logically we should publish this form even if not a beautiful shape.
Or we should publish the 2 forms, the sorted form as signature of unicity
dml

Posts: 34
Joined: 12 November 2006

dml wrote:check that dml1 and dml2 are different sudokus
The objective is to give short list 100 or 1000 sudokus with very few identical sudokus and to remove genetic clones
But to test your code you need a program that can generate from 1 sudoku all its brothers and sisters
I think that the easy way is to sort the sudoku at the end you need to finish with only 1 form
example: top left square need to contains the most digits
if 2 square have 4 digits like '7321' and '7341' sorted then the top square should contain the highest value
etc ....
with many others rules you should finish that all clones after sorting will give a single value
then take the scrambling program and check all forms finish in same position, logically we should publish this form even if not a beautiful shape.
Or we should publish the 2 forms, the sorted form as signature of unicity

search for "isomorphism" and "canonical" on this forum and the programmers's forum
gsf
2014 Supporter

Posts: 7306
Joined: 21 September 2005
Location: NJ USA

PreviousNext