Structures of the solution grid

Everything about Sudoku that doesn't fit in one of the other sections

Postby daj95376 » Sat Aug 12, 2006 4:18 pm

You are about as subtle as a bull in a china shop. However, I made the mistake of thinking that your asking a question meant that you were looking for an answer. Obviously, you have all of them already. I won't make that mistake again!!!
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Postby ravel » Sat Aug 12, 2006 7:25 pm

RW wrote:Good work, here's two more 35s to try your luck on:
Code: Select all
123456789456789123789132564275364918638591472914278356392617845561843297847925631
123456789456789132789231645291864573347915268568327914632578491814692357975143826


RW,

the probability, that an average grid contains a 17-clue is less than 1:100000. When a good property raises the chance for a 17-clue a 100 times, you still will have to check 1000 grids on average to find one.
While a couple of very qualified people only found a few 17-clues, when trying it from the grids properties side, gfroyle found thousonds (and probably more than 99%) of 17-clues with a (graph theory based) heuristic, that searched the neighbourhood (i.e. exchanging some clues) of known low clue sudokus.
When you look at the known 17-clues grids, you will find, that in average they have good properties (like MCN or unique pairs), but there are grids with bad properties also (the easier it is to find then the low clue with a heuristic, because there are not much promising subsets) and there are a lot of grids with a better property (example sfb) with no 17-clue.
Have you checked the 17-clues grids for unentwined pairs? I would be very surprised, if there were not many with a relatively low number.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby RW » Sat Aug 12, 2006 11:03 pm

ravel wrote:Have you checked the 17-clues grids for unentwined pairs? I would be very surprised, if there were not many with a relatively low number.


Yes, I posted the results a while ago here. Yes, there were many with relatively low numbers, the average amount of fully entwined pairs is only 2.3 higher than for randomly generated grids. The average amount of 2-digit unavoidables is about 4 sets lower than the average on random grids. The maximum amount of 2-set unavoidables for a known grid with 17s is 63, 74 is the highest I've encountered in my random search.

The SFB grid, with better properties, has three 17s in it. I have found no other grid with 36 2-perms yet. And an interesting fact; I tested the grid with four 16-perms and a 17, it had MCN 14, which to me sounds suprisingly high for a 17!


As I've said many times, the purpose with this thread is not to claim that the entwined pairs have an effect, but to explore If they have any effect. Right now I believe that I've found more interesting things related to grids that require more clues than grids with 17s. I believe the main reason that there is only 3 known grids so far without 19s, is that the exhaustive scan of a grid takes several days. There must be several more in the 750 MCN 16 grids I have at the moment, and also in the few thousand grids with <12 fully entwined pairs (remember my 9-entwiner with MCN 13 and still no 19...). Right now I'm scanning the grid with 74 2-digit unavoidables and no unique 19s have been found yet (about halfway through the search). But as coloin pointed out, it's still very unlikely to find a grid with no 20. Maybe it's like the big step from 16 to 17, even if we find 30000 grids without 19s, there might not be any grids without a 20.

btw. I found a bug in checker (has reported too low MCN on several grids), who should I report it too? I pm'd Moschopulus, but he doesn't seem to read his pm's...

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Postby ravel » Sun Aug 13, 2006 9:10 am

RW wrote:The SFB grid, with better properties, has three 17s in it.

Yes, i mixed it with the SSF grid.
btw. I found a bug in checker (has reported too low MCN on several grids), who should I report it too?

There is a contact email address at the end of this site.
ravel
 
Posts: 998
Joined: 21 February 2006

Postby RW » Sun Aug 13, 2006 1:09 pm

Found some interesting statistics here.

Moschopulus wrote:I ran through over 13,000 random grids
... and listed the MCNs. There were 7 with MCN = 15.


So we have 7/13,000 random grids with MCN 15 = 0,0538% of the grids. I've found 750/50,000,000 random grids with MCN 16 = 0,0015% of the grids. In my search I only included grids with 4 or more 16-permutables, and only picked out the grids with four non-overlapping 16-permutable pairs, there should also be more MCN 16s without this property. But still the 15s are only about 36 times more common than the 16s... Could there be a 17, or are the 16s so common only because it's easy to achieve with four 16-perms... A 17 would of course have to be made in a different way, including all 9 digits.

ravel wrote:There is a contact email address at the end of this site.


Thanks, how could I miss that!

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Postby coloin » Mon Aug 14, 2006 12:20 am

Great work from RW...but ravel's comments are spot on. I never found a de novo 17, though dukuso made me a program which found the "best" 14 clues in the SF grid. see

I have looked at the first 35 grid and no "region" stood out, so probably there are a few 18s with many differing clues. Similar to the SSF grid. If we did find a new 17 in any of RW's grids however we would have to question the completeness of Gfroyle's series.

A MCN17 grid would be very cramped - it would certainly be a good canditate not to have a 20.

I too was interested in the MCN14grid with a 17.....it did seem a good example of the opposite end of the spectrum to the SF/SFB grids.

Code: Select all
439286157176593284258147396381754629795632841624918735542371968867429513913865472
000006050070000200008100000300054000000000801000000700540000060000020000000800000


I too felt it shouldnt really contain a 17, It didnt have many 18s either !
see
coloin wrote:However I think that all large unavoidables over 20 clues wont get in a maximal collection - especially not a collection of MCN size 14. [size restriction never mind clue restriction]
Having said that however......the MCN 14 grid would be the grid to analyse these maximal collections. [large MCN, only one 17 clue puzzle]
I think this grid, with such a high MCN, is unusual in that it actually has a 17. It must have a low number of maximal collections.........bear in mind that at least 11 of the clues in each maximal collection must [edit] have sole occupancy of their respective unavoidable set.


C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby RW » Mon Aug 14, 2006 7:02 am

This grid
Code: Select all
384217965512936478967458321841672593725349186693185247259763814136894752478521639

with 74 2-digit unavoidables has no 19 clue puzzles (proved by checker in only 2d 18h). But on my slower computer I found 55 20s in the first 2 hours, probably thousands more to be expected in the estimated 19 days... I think the gap between 19 and 20 clues might be even bigger than the gap between 16 and 17, feeling quite pessimistic about the chances to find a grid without 20s right now.

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Postby RW » Mon Aug 14, 2006 1:17 pm

coloin wrote:I too was interested in the MCN14grid with a 17.....it did seem a good example of the opposite end of the spectrum to the SF/SFB grids.
Code: Select all
439286157176593284258147396381754629795632841624918735542371968867429513913865472


Apparently there's at least two such grids, I was refering to another grid:

Code: Select all
123457689456189237789263514278591463341628795965734128592346871637815942814972356


Don't worry if checker says MCN 13, this is one of the grids where it fails to count correctly (remove the mentioned clues in the maximum set and you'll see...).

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Postby Moschopulus » Tue Aug 15, 2006 11:25 am

RW wrote:Don't worry if checker says MCN 13, this is one of the grids where it fails to count correctly (remove the mentioned clues in the maximum set and you'll see...).


Thanks very much to RW for pointing out this problem with checker and unavoid. It has now been fixed.

It was a very small problem, and any searches previously done with checker are still valid and correct.

There was a stupid mistake in finding the unavoidable sets. It only affected sets of size 6. As a result a small number of size 6 unavoidables of a certain type were not found by unavoid. This affected the MCN in a small number of cases, and also made the search by checker longer than it need be. No searches were incorrect, just a bit longer than necessary. So thanks to RW, checker is now faster!

I would like to reiterate that the MCN that unavoid gives is only a guess anyway ... the program uses a "decent collection" of unavoidables and finds a maximal number of disjoint ones from this collection, but it does not use ALL minimal unavoidables. There would be many thousands which would make the computation too long. The "decent collection" usually has a few hundred sets and takes about 2 minutes to compute. (For the impatient there is the -uquick option which gives an estimate of the MCN in about 2 seconds using fewer unavoidables.)

Thanks again to RW.
Moschopulus
 
Posts: 256
Joined: 16 July 2005

Postby coloin » Sat Aug 19, 2006 12:03 am

Thaks to Moschopolus for fixing that......I wonder how RW spotted it?

However it is increasing the number of unavoidables found in this grid - and there are a very high number of small ones........
Code: Select all
123456789
789123456
456789123
231548967
564937218
897612345
312875694
645391872
978264531   [canonvant]

Oldchecker128
Code: Select all
Found 163 unavoidable sets (85 of size 4 or 6).
The maximum # of disjoint unavoidable sets (max clique number -- MCN) is 14.


Newchecker128
Code: Select all
Found 166 unavoidable sets (88 of size 4 or 6).
The maximum # of disjoint unavoidable sets (max clique number -- MCN) is 14.


I doubt it has 19s - I am half way with checker - no 19s
Code: Select all
4095/4096 (24051 puzzles gen'd; 0 uniques found); ETTF 0d 00h 00m
Finished.  Total computation time was 2d 15h 11m.
A full search was done (all puzzles were enumerated).
No 19-puzzle exists for this grid (all puzzles were solved).
No pseudo-18-puzzle exists for this grid (all puzzles were solved).

There were no 20s produced on random generation of 3000000 puzzles ! Checker found a few 20s eventually however. It will take 10 days to count them all - awaited !

RW I think that this is because there are a larger number of small unavoidables with 3 clue types. These unavoidables must be relevant in increasing the chances of an MCN 17 perhaps.
C
Last edited by coloin on Sun Aug 20, 2006 8:25 pm, edited 1 time in total.
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby RW » Sat Aug 19, 2006 10:26 pm

coloin wrote:Thaks to Moschopolus for fixing that......I wonder how RW spotted it?

Yes, indeed thanks for fixing it. I spotted it when I wanted to double check that there was no bug in my program that finds MCN 16s. I ran a few random grids that it had found through ckeckers unavoid and got MCN 15 for one of them, removing the clues in that set of unavoidables I could see that there was one unavoidable that it had missed.

coloin wrote:I think that this is because there are a larger number of small unavoidables with 3 clue types. These unavoidables must be relevant in increasing the chances of an MCN 17 perhaps.


I noticed that the lack of 19s seems to be related to the total amount of 2-digit unavoidable sets. There's 4 grids so far, the amount of 2-digit unavoidables: 68, 70, 73, 74. The average on random grids created with gsf's program is 55. The largest amount in a known grid with a 17 is 63. It's possible that I started examining the wrong thing when I looked at the individual pairs instead of the 2-digit unavoidables as a total. The amount of 2-permutable pairs varies very much between the different grids without 19s, but as you can see they have in common a very high amount of total 2-digit unavoidables. Of course the amount of small 3-digit unavoidables also must be important.

RW
RW
2010 Supporter
 
Posts: 1010
Joined: 16 March 2006

Good-ranking scores for various grid statistics

Postby Red Ed » Sun Aug 20, 2006 9:47 am

RW wrote:I noticed that the lack of 19s seems to be related to the total amount of 2-digit unavoidable sets. There's 4 grids so far, the amount of 2-digit unavoidables: 68, 70, 73, 74. The average on random grids created with gsf's program is 55. The largest amount in a known grid with a 17 is 63. It's possible that I started examining the wrong thing when I looked at the individual pairs instead of the 2-digit unavoidables as a total. The amount of 2-permutable pairs varies very much between the different grids without 19s, but as you can see they have in common a very high amount of total 2-digit unavoidables. Of course the amount of small 3-digit unavoidables also must be important.
I've recently been playing with Diff(G,2), which is related to the number of 2-digit unavoidables as follows. Think of a grid as being made up of nine templates, with template D being those cells filled in with digit D. We define Diff(G,k) to be the number of grids that differ from G by exactly k templates. If there are u(d,e) minimal unavoidables on digits (d,e) then Diff(G,2) = 2^u(1,2) + 2^u(1,3) + ... + 2^u(8,9) - 36. By contrast, you appear to be using Unav(G,2) = u(1,2) + u(1,3) + ... + u(8,9). Both are very good discriminators of grids containing a 17 versus grids without, with Diff(G,2) apparently slightly better than Unav(G,2) as I will now explain.

I've been evaluating different grid statistics, e.g. Diff or Unav, with a "good-ranking score", which is equal to the probability that a random grid containing a 17 beats (has a better statistic than) any old random grid. Ties (equal stats) are resolved by flipping a coin. I calculate this score by comparing the list of 34108 grids-with-a-17 with a separate list of 34108 grids generated at random. Of course this only gives me an approximation to the 17-beats-random probability, and I could improve that approximation by using more random grids, but it feels good enough.

The good-ranking scores for Diff(G,2) and Unav(G,2) are 81.2% and 80.6% respectively, which may or may not be a statistically significant difference. The SF grid is ranked very highly by both statistics, as you'd hope, with SFB coming top overall with perfect Diff and Unav stats as mentioned in earlier posts. You do can slightly better than Diff and Unav, at least on my limited sample of random grids, by using UnavSquared(G,2) = u(1,2)^2 + u(1,3)^2 + ... + u(8,9)^2, which comes in at 81.4%. You might also expect an improvement by considering three digits instead of just two, but not so: Diff(G,3) scores only 78.2%.

Of the other stats that I tried with a substantially different theme, only one scored anywhere near these figures: that was Wolfgang's 3-rookeries statistic which, when you count the number of rookeries rather than the number of cell pairs (his 30 & 60 vice his 2628 & 5049), scores 79.0%. Everything other method that I coded up scored c. 70% or lower.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby coloin » Fri Aug 25, 2006 11:44 am

Excellent to have Red Ed On the case !

Ive got hold of a program which classifies the grids according to their 27 clue bands.
index416.exe
It gives a printout of the 44 gangsters [note I beleve they are allocated different numbers to Red Eds original classification] and /or the 416 bands.

Grids which have been proven not to have a 19
Code: Select all
Canonvant  123456789789123456456789123231548967564937218897612345312875694645391872978264531   
Dukuso16   145726983837495261926381574293874156581269347674153892318547629459632718762918435   
Rw         123469785456187932789253164371694528698512347542378691215946873864735219937821456   
74-2digit  384217965512936478967458321841672593725349186693185247259763814136894752478521639   


Code: Select all
               Gangsters                    Bands                      Min clues     Max clues   MCN   Av.puzz.size

Canonvant      1  27  27  ,   1  42  42     1  53  53  ,   1 138 138       20           34        14   25.33
Dukuso16      28  31  33  ,  33  28  31    97 139  52  ,  52  97 139       20           34        16   24.89   
Rw            42   7  33  ,  42   7  33   131 148 244  , 131 148 244       20           35        13   24.81
74-2digit     42  42  42  ,  26  42  31    51  70  70  ,  85 131  80       20           ?         15   24.90



RW recently showed there was no 19 in his grid with 74 -2 digit unavoidables. There were many 20s however.

Code: Select all
74-2digit output with checker
453/4096 (318499006 puzzles gen'd; 335 uniques found); ETTF 13d 17h 46m = 3029 estimated at finish

So perhaps there are around 3000 different 20s in this grid

Generating 5,000,000 minimal puzzles and the respective solution counts
Code: Select all
C:\suxx\suxx1>clusta 74-2digitcount5
lines:5000000 average clues:24.894509
   20  5
   21  1297
   22  44384
   23  413447
   24  1353856
   25  1788351
   26  1054895
   27  298023
   28  42389
   29  3215
   30  134
   31  4


Accepting the potential for errors in this calculation
I make it 3,000,000,000 puzzles in this grid

C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

Postby Red Ed » Fri Aug 25, 2006 6:00 pm

A little while ago, RW wrote:The SFB grid, with better properties, has three 17s in it. I have found no other grid with 36 2-perms yet.

Here's another fully-entwined grid which, indulgently, I call Russell's Viper:
Code: Select all
123|456|789
456|789|123
789|231|564
---+---+---
234|178|956
695|342|817
817|695|342
---+---+---
368|527|491
572|914|638
941|863|275

Some facts about RV:
  • It's got 36 two-digit unavoidables (like SFB).
  • Removing any pair of boxes in the same stack leaves the grid uniquely solvable; removing any pair in the same band leaves the grid with two solutions. (SFB isn't quite as good on this point.)
  • Define a generalised diagonal to be three boxes no two of which are in the same band or stack (3! = 6 ways of achieving this). The number of grid completions from any generalised diagonal is always <130000. This beats SFB comfortably.
  • The generalised diagonals can be written D1~D2, D3~D4, D5, D6, where Di~Dj indicates isomorphism. SFB also has this unusual property.
  • RV has no 16s. No surprise there.
  • I wouldn't mind betting that it has no 17s either.:(
The stuff about the generalised diagonals makes me wonder if perhaps we can do even better, perhaps with three isomorph-pairs (all with low completion count) instead of just two, though I don't much fancy starting another search.
Red Ed
 
Posts: 633
Joined: 06 June 2005

Postby coloin » Fri Aug 25, 2006 6:32 pm

Excellent grid

Firstly I like the name - it does prevent confusion.

Secondly- I had noticed the SFB had unavoidable sets in half of its adjaecent boxes. But RVis much better.

In th SFB I had checked the 6 - generalied diagonal boxes - [egB159,B168...etc] and had noted they were all below average. Recently we checked low grid solutions of these diagonal boxes and noted there are many lower counts -I had thought of trying to find grid with all low counts - but you have done it.

The SFB also had consistently low box interaction values - [eg filling of B4 for a given B1/B5]. see Presumablly RV has this property.

Note checker is up and running - to a speed - so we will have an answer to the 17-clue puzzle in RV soon

C
coloin
 
Posts: 2502
Joined: 05 May 2005
Location: Devon

PreviousNext

Return to General