17-clue and 18-clue Sudoku update

Everything about Sudoku that doesn't fit in one of the other sections

Postby ravel » Sun Aug 12, 2007 12:21 pm

There has been some discussion about defining "distances" in the Sudoku space thread and also on earlier pages here (see e.g. Oceans's interesting alternative definition). For me it turned out, that there is no satisfying definition for all purposes, so as you said, it is a matter of point of view, which one fits better.
Red Ed wrote::!:Now a challenge:
  • The biggest distance I've seen between a pair of 17s is 16.
  • The smallest similarity I've seen between a pair of 17s is 8.
Can anyone do better? (I only checked a smallish random sample.)

I made some 10000 comparison's, but did not find better ones. A distance of 16 seems to be rather common for "exotic" puzzles outside the big 17-cluster, so maybe comparisons between those "island puzzles" have good chances to improve it.
There is a much better way: it's called the Hungarian (or Munkres') Assignment Algorithm.
Thanks for the link. At the first glance i did something similar, but i suppose it should be faster than my method.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby Red Ed » Sun Aug 12, 2007 6:09 pm

ravel wrote:There has been some discussion about defining "distances" ... (see e.g. Oceans's interesting alternative definition).
fwiw, I came up with a very similar definition nearly a year earlier, but had (and still have) no clue about how to do any useful calculations with it.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby Lars Petter Endresen » Mon Aug 13, 2007 10:00 pm

Hello all,

It's really fascinating to follow the excellent discussion about 17 and 18 clue Sudokus! This is clearly the most advanced and spectacular discussion in the Science of Sudoku! What is the total number of 17-clue Sudokus? Is there a 16? How are these 17-clue Sudokus related? Recently I have been involved in the Sudoku Architect project that Håvard manages, and I have been privileged to be able to find new 17-clue Sudokus every day: thanks to Intel Compiler Team!

In order to estimate how many 17-clue Sudokus are left, I did a set of 10 isolated experiments. Results are:
Code: Select all
          20   20+1-1     19     18   18+1-1   17  n17 n17+2-2
   #1   1914   34593   37256   5480   23363   207   2   2
   #2   3701   65420   68885   9070   36202   193   8  13
   #3   3191   55411   58159   7281   28596   190   5   5
   #4   2822   49995   52729   6250   24252   182   6   7
   #5   2687   46949   48740   5658   21236   107   8  12
   #6   6531  114346  120800  14640   56054   416  12  26
   #7   3006   51873   54749   7211   28964   209   4   6
   #8   2665   47653   50587   6573   24758   150   2   3
   #9   3543   63262   69287   8861   33489   169   6   8
   #10  3360   60654   69596   9154   35519   312   5   6
   Sum 33420  590156  630788  80178  312433  2135  58  88

Each experiment was based on isolated random sets of 20-clue Sudokus, and I followed the technique 20+1-1+1-2+1-2+1-1+1-2+2-2=17. In the table, n17 denotes the new Sudokus that was added to Gordon's list today. Even though all experiments are based on random generation of 20-clue Sudokus, I am not certain that this may lead to some intelligible statistical conclusions. Please feel free to comment and conclude…

Best Regards,

Lars Petter Endresen
Lars Petter Endresen
 
Posts: 7
Joined: 03 June 2007
Location: NORWAY

Postby Lars Petter Endresen » Mon Aug 13, 2007 10:35 pm

BTW: in my Sudoku experiment I suddenly came across this family...

Code: Select all
..57....9.12.......4..3.........42...........3....2..68..57...................84.
..57....9.12.......4..3.........42...........3....2..78..57...................84.
..67....9.12.......4..3.........42...........3....2..58..57...................84.
..67....9.12.......4..3.........42...........3....2..78..57...................84.
..57....9.12.......4..3.........42...........3....2..68..56...................84.
..57....9.12.......4..3.........42...........3....2..68..59...................84.
..56....7.12.......4..3.........42...........3....2..68..57...................84.
..5.6...7.12.......4..3.........42...........3....2..68..57...................84.
..5.6...9.12.......4..3.........42...........3....2..78..57...................84.
..56....9.12.......4..3.........42...........3....2..68..97...................84.
..5.6...9.12.......4..3.........42...........3....2..68..97...................84.
..57....9.12.......4.3..........42...........3....2..68..67...................84.
..6.7...9.12.......4..3.........42...........3....2..68..59...................84.
..76....9.12.......4..3.........42...........3....2..68..59...................84.
..7.6...9.12.......4..3.........42...........3....2..68..59...................84.
..67....5.12.......4..3.........42...........3....2..68..59...................84.


I guess they all have the same grandpa...

:)
Lars Petter Endresen
 
Posts: 7
Joined: 03 June 2007
Location: NORWAY

Postby JPF » Mon Aug 13, 2007 11:03 pm

About guessing the number of 17s.

What we are doing is like fishing in a lake...(catch, tag and release the fish)

Can we say that you generated 2135 17s and that only 88 were "new".
The success rate is 88/2135=4%

I don't remember, what is the MLE of the number of fishes in that case .

Can we say that 4% of say 42000 are still to be discovered ~ 1680-1700.
But of course, the fishing is not perfect and some strange animals are hidden and more difficult to cath than others.

In the good days, I get a success rate of 3% - 4% starting with a 19.

JPF
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

Postby gsf » Tue Aug 14, 2007 5:43 am

Red Ed wrote:PS: regarding this ...
gsf wrote:I'm hoping there's a better way than running p! combinations on p matching positions,
but I don't see it right now
There is a much better way: it's called the Hungarian (or Munkres') Assignment Algorithm.

thanks for the reference
p! (a bad knee jerk reaction) down to 2**p (second attempt) down to p**3 (Hungarian modulo <= 9 clues mapping to possibly > 9 positions)
distance/similarity in my solver (not yet posted) are now in agreement with Red Ed and ravel
gsf
2014 Supporter
 
Posts: 7306
Joined: 21 September 2005
Location: NJ USA

Postby Mauricio » Tue Aug 14, 2007 5:54 am

gsf wrote:thanks for the reference
p! (a bad knee jerk reaction) down to 2**p (second attempt) down to p**3 (Hungarian modulo <= 9 clues mapping to possibly > 9 positions)
distance/similarity in my solver (not yet posted) are now in agreement with Red Ed and ravel

It would be a good idea that one could assign different point schemes for the Hamming distance. For example, the original Hamming distance gives 2 points for each clue in different position and 1 point for a clue in the same position and different value (am I right?), but if we assign 10 points for each clue in different position and 1 for same position and different value, we encourage more similar patterns, and so on. Could this be implemented?
Mauricio
 
Posts: 1175
Joined: 22 March 2006

Postby gfroyle » Tue Aug 14, 2007 1:29 pm

Mauricio wrote:For example, the original Hamming distance gives 2 points for each clue in different position and 1 point for a clue in the same position and different value (am I right?)


Hamming distance between two strings is defined to be the number of places in which the two strings differ.

For example, the distance between

Code: Select all
100200
200020


is three, because they are different in 3 places.

Code: Select all
100200
200020
^  ^^


As you noticed, this has the effect that a clue that is "moved" will add 2 to the distance, while one that is changed will add 1 to the distance, but this is a consequence of the definition and not really a free choice.

Gordon
gfroyle
 
Posts: 214
Joined: 21 June 2005

Postby Mauricio » Tue Aug 14, 2007 4:30 pm

gfroyle wrote:Hamming distance between two strings is defined to be the number of places in which the two strings differ.

For example, the distance between

Code: Select all
100200
200020

is three.


I see, but it depends how it is implemented. For example, if we take the above puzzle as the target puzzle, and if for each nonzero cell of the below puzzle we count:
  • 0 points if the clue above it is the same,
  • 1 point if the clue above it is not the same and nonzero,
  • 2 points if te clue above it is zero,
the the count is three to, and it always gives the hamming distance (wrong, see edit).
If the hamming distance is implemented this way, then we could change the values.

Edit: I see, that only works if the puzzles have the same number of clues. Nevermind.
Mauricio
 
Posts: 1175
Joined: 22 March 2006

re: Mauricio distance

Postby Pat » Wed Aug 15, 2007 10:29 am

Mauricio wrote:It would be a good idea that one could assign different point schemes for the Hamming distance.

For example, the original Hamming distance gives 2 points for each clue in different position and 1 point for a clue in the same position and different value (am I right?),
but if we assign 10 points for each clue in different position and 1 for same position and different value, we encourage more similar patterns, and so on.

Could this be implemented?



as gfroyle pointed out, this would no longer be the Hamming distance.
    but gsf could choose to implement this anyway,
    and call it the Mauricio distance.
User avatar
Pat
 
Posts: 4056
Joined: 18 July 2005

re: Lars Petter Endresen's new 17s

Postby Pat » Wed Aug 15, 2007 10:44 am

JPF wrote:
Lars Petter Endresen wrote:
Code: Select all
          20   20+1-1     19     18   18+1-1   17  n17 n17+2-2
   #1   1914   34593   37256   5480   23363   207   2   2
   #2   3701   65420   68885   9070   36202   193   8  13
   #3   3191   55411   58159   7281   28596   190   5   5
   #4   2822   49995   52729   6250   24252   182   6   7
   #5   2687   46949   48740   5658   21236   107   8  12
   #6   6531  114346  120800  14640   56054   416  12  26
   #7   3006   51873   54749   7211   28964   209   4   6
   #8   2665   47653   50587   6573   24758   150   2   3
   #9   3543   63262   69287   8861   33489   169   6   8
   #10  3360   60654   69596   9154   35519   312   5   6

   Sum 33420  590156  630788  80178  312433  2135  58  88

I followed the technique 20 +1-1 +1-2 +1-2 +1-1 +1-2 +2-2 =17

In the table, n17 denotes the new Sudokus that was added to Gordon's list today



Can we say that you generated 2135 17s and that only 88 were "new".
The success rate is 88/2135=4%


my interpretation of Lars Petter Endresen's report, is that the experiment produced 58 + 88 = 146 new 17s;

but it is unclear to me if these 146 come from a total of 2135 17s
or are we perhaps missing a column for 17+2-2 puzzles produced

~ Pat
User avatar
Pat
 
Posts: 4056
Joined: 18 July 2005

Postby daj95376 » Wed Aug 15, 2007 12:52 pm

I interpreted the LPE table as ...

Code: Select all
From ten runs:

 33420          (20s) generated =>
590156          (20s) generated =>
630788          (19s) generated =>
 80178          (18s) generated =>
312443          (18s) generated =>
  2135 existing (17s) +
    58 new      (17s) generated =>
    88 existing (17s)
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby ronk » Wed Aug 15, 2007 1:32 pm

OK, I'll try this guessing game too.:) Starting with 33,420 20s Lars generated 2,135 17s of which 58 were new. Then a 2-off/2-on run on these 58 yielded an additional 30 new 17s ... for a total of 88.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Postby JPF » Wed Aug 15, 2007 2:32 pm

ronk wrote:OK, I'll try this guessing game too.:) Starting with 33,420 20s Lars generated 2,135 17s of which 58 were new. Then a 2-off/2-on run on these 58 yielded an additional 30 new 17s ... for a total of 88.

It was my interpretation:)

JPF
JPF
2017 Supporter
 
Posts: 6139
Joined: 06 December 2005
Location: Paris, France

Postby ronk » Wed Aug 15, 2007 3:22 pm

JPF wrote:It was my interpretation:)

Sorry, I forgot you already reached that conclusion for your "success rate" post.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

PreviousNext

Return to General