Sue De Coq Revisited Again (ASI#1)

Advanced methods and approaches for solving Sudoku puzzles

Postby ttt » Tue Nov 04, 2008 7:39 am

Hi DonM,
I missed Steve... I don't know what's happen to him, over two months...

ttt
ttt
 
Posts: 185
Joined: 20 October 2006
Location: vietnam

Postby DonM » Tue Nov 04, 2008 9:08 am

ttt wrote:Hi DonM,
I missed Steve... I don't know what's happen to him, over two months...
ttt


Hi ttt. I miss him also. Emailed him but no reply.
DonM
2013 Supporter
 
Posts: 487
Joined: 13 January 2008

Postby tarek » Wed Nov 05, 2008 3:16 pm

Hopefully some easy target practice ... could be seen as easier examples.
Code: Select all
..4...6...1.4.2.9.9.......3.5.6.9.2...........7.8.3.6.7.......1.2.3.8.7...5...8..
. . 4|. . .|6 . .
. 1 .|4 . 2|. 9 .
9 . .|. . .|. . 3
-----+-----+-----
. 5 .|6 . 9|. 2 .
. . .|. . .|. . .
. 7 .|8 . 3|. 6 .
-----+-----+-----
7 . .|. . .|. . 1
. 2 .|3 . 8|. 7 .
. . 5|. . .|8 . .

..5...6...9.2.1.5.1.......8.2.9.5.6...........4.1.6.7.4.......7.7.8.3.2...9...3..
. . 5|. . .|6 . .
. 9 .|2 . 1|. 5 .
1 . .|. . .|. . 8
-----+-----+-----
. 2 .|9 . 5|. 6 .
. . .|. . .|. . .
. 4 .|1 . 6|. 7 .
-----+-----+-----
4 . .|. . .|. . 7
. 7 .|8 . 3|. 2 .
. . 9|. . .|3 . .

..3...1...2.9.1.7.5.......2.9.2.3.6...........3.5.8.4.6.......4.5.8.7.9...8...7..
. . 3|. . .|1 . .
. 2 .|9 . 1|. 7 .
5 . .|. . .|. . 2
-----+-----+-----
. 9 .|2 . 3|. 6 .
. . .|. . .|. . .
. 3 .|5 . 8|. 4 .
-----+-----+-----
6 . .|. . .|. . 4
. 5 .|8 . 7|. 9 .
. . 8|. . .|7 . .
User avatar
tarek
 
Posts: 3762
Joined: 05 January 2006

Postby DonM » Wed Nov 05, 2008 3:34 pm

Thanks tarek; much appreciated.
DonM
2013 Supporter
 
Posts: 487
Joined: 13 January 2008

Postby ronk » Sun Nov 16, 2008 4:43 pm

The pattern below differs from the traditional Sue De Coq (SDC) because one of the "row cells" (r9c6) is located in the line-box intersection r9b8. Is this pattern a genuine SDC:?:

Code: Select all
Ruud50k #3284
......695.3.5....8....4......63...4.15.2.6.39.2...47......9....2....8.7.391......

After SSTS
 478   1478  278   | 178   1238  1237  | 6     9     5
 67    3     279   | 5     126   1279  | 4     12    8
 5     168   289   | 1689  4     129   | 3     12    7
-------------------+-------------------+------------------
 789   78    6     | 3     158   159   | 125   4     12
 1     5     4     | 2     7     6     | 8     3     9
 89    2     3     | 189   158   4     | 7     56    16
-------------------+-------------------+------------------
 4678  4678  78    |B14    9     235-1 | 125   56    12346
 2     46    5     |B146   3-16  8     | 9     7     1346
 3     9     1     |A467  A256  C257   |C25    8     46-2

 Sets: A = {r9c45} = {24567}; B = {r78c4} = {146}; C = {r9c67} = {257}

 Elims: r7c6<>1, r8c5<>16, r9c9<>2

[edit: thanks to PIsaacson, added the missing r8c5<>6]
Last edited by ronk on Sun Nov 16, 2008 8:10 pm, edited 1 time in total.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Postby PIsaacson » Sun Nov 16, 2008 11:00 pm

Ronk,

My solver treated this as a subset counting pattern of 6 cells, candidates {124567} with multiplicities {111111} spanning 2 sectors with reductions:
Code: Select all
r7c6 <> 1 - reduces multiplicities to {011111} invalid for 6 cells
r8c5 <> 1 - reduces multiplicities to {011111} invalid for 6 cells
r8c5 <> 6 - reduces multiplicities to {111101} invalid for 6 cells
r9c9 <> 2 - reduces multiplicities to {101111} invalid for 6 cells

It categorized it as a Distributed Disjoint Subset instead of an SDC due to my interpretation of the rules regarding the composition of a "genuine" SDC. But I'm highly confused (still re-reading this post as well as the original) since my solver also categorizes many SDCs as ALS chains that intersect.
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby PIsaacson » Mon Nov 17, 2008 1:01 am

Dropped the following 3 reductions by accidental copy/paste
Code: Select all
r1c4 <> 1
r3c4 <> 1
r6c4 <> 1

Same reason - forces multiplicities to {011111} for candidates {124567}
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby hobiwan » Tue Nov 18, 2008 7:26 am

ronk wrote:Is this pattern a genuine SDC:?:

I see no reason why it shouldn't be. r9c6 is in r9 and does not belong to r9c45. But if you don't like it you can always use

SdC: r9c456 - {24567} (r9c7 - {25}, r78c4 - {146}) or
SdC: r9c56 - {2567} (r9c7 - {25}, r789c4 - {1467})
hobiwan
2012 Supporter
 
Posts: 321
Joined: 16 January 2008
Location: Klagenfurt

Postby PIsaacson » Tue Nov 18, 2008 12:21 pm

hobiwan wrote:
ronk wrote:Is this pattern a genuine SDC:?:

I see no reason why it shouldn't be. r9c6 is in r9 and does not belong to r9c45. But if you don't like it you can always use

SdC: r9c456 - {24567} (r9c7 - {25}, r78c4 - {146}) or
SdC: r9c56 - {2567} (r9c7 - {25}, r789c4 - {1467})

SdC: r9c456 - {24567} (r9c7 - {25}, r78c4 - {146}) allows r3c4 <> 6 (I think?) so this can't be correct as indicated below.
SdC: r9c56 - {2567} (r9c7 - {25}, r789c4 - {1467}) doesn't conform with DonM's definition so I'm not sure if it infers r3c4 <> 6.
Code: Select all
......695.3.5....8....4......63...4.15.2.6.39.2...47......9....2....8.7.391......

After SSTS
 478   1478  278   | 78-1  1238  1237  | 6     9     5
 67    3     279   | 5     126   1279  | 4     12    8
 5     168   289   | 689-1 4     129   | 3     12    7
-------------------+-------------------+------------------
 789   78    6     | 3     158   159   | 125   4     12
 1     5     4     | 2     7     6     | 8     3     9
 89    2     3     | 89-1  158   4     | 7     56    16
-------------------+-------------------+------------------
 4678  4678  78    |A14    9     235-1 | 125   56    12346
 2     46    5     |C146   3-16  8     | 9     7     1346
 3     9     1     |*467  *256  *257   |B25    8     46-2

From DonM's lead-in to this topic:

Set A should be a bi-value cell in the box containing the core aals or aaals
Set B should be a bi-value cell in the same row/col as the core but external to the containing box
Set C should be a tri-value cell in the same row/col as B with the same candidates as B + 1 additional from the core - but it's aligned with A!!!

set A = r7c4 {14} - this meets DonM's definition
set B = r9c7 {25} - this meets DonM's definition
set C = r7c4 {146} - this doesn't conform to DonM's definition since it's in line with A instead of B???
core = r9c456 {24567} - aals so this doesn't conform to DonM's C definition which requires an aaals???
or is the core r89c4+r9c56 {124567} - aals but not aligned on a single row/col, which also is contrary to DonM's definition

We can't "swap" the alignment for C from B to A since that would infer r3c4 <> 6, which is invalid because r3c4 = 6 in the solution.

I have to confess that this one is easier for me to spot as an ALS chain
Code: Select all
set A = r789c4 {1467}, set B = r9c567 {2567} - Dual linked common restricted {67} resulting in the following:
r1c4 <> 1, r3c4 <> 1, r6c4 <> 1, r7c6 <> 1, r8c5 <> 16, r9c9 <> 2

Which is exactly the same reductions my solver found with subset counting for a 2 sector DDS of 6 cells r789c4+r9c567 candidates {124567} multiplicities {111111} posted above.

DonM,

I just can't make this one fit your description unless I really misunderstood or misread the very first posting. Is it because you are focusing on finding SDCs and "relaxing" the formal definition in order to make them easier to locate? I have to admit that I've gone back and using your technique, found SDCs in puzzles that stumped me before, so again thanks for this insight into manual solving.

Hobiwan,

Please excuse me for questioning your reply, but hopefully you can see why I'm confused by the informal and formal definitions of Sue deCoq. Every time I think I understand the exact pattern, it turns out to be something "close to", but not exactly an SDC. At least they are (most of the time) a 2 sector DDS, but often just an ALS chain.

I'm still uncertain if an SDC allows for more eliminations in some cases than a standard 2 sector DDS or an ALS chain.
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby DonM » Tue Nov 18, 2008 1:22 pm

PIsaacson wrote:DonM,

I just can't make this one fit your description unless I really misunderstood or misread the very first posting. Is it because you are focusing on finding SDCs and "relaxing" the formal definition in order to make them easier to locate? I have to admit that I've gone back and using your technique, found SDCs in puzzles that stumped me before, so again thanks for this insight into manual solving.


If that is true then my work is done.:) I really appreciate the last comment because this is why I am doing this, but I'm never sure whether it's really of much help or use what with the strong interest in other aspects of sudoku on the forum.

The way I see it, the original formal definition was expressed mathematically to lock down the definition the way these more complicated patterns are although apparently it has its limitations when it comes to some of the variations. Also, others have expressed the notation of a SDC to likewise try to describe what's going on in the pattern. However, my guess is that while that may work for people who have a certain background or talent, it doesn't appear to work for most of us who rely on pattern solving whenever we can, otherwise we would have been seeing far more attention given to SDC.

What I have described is not really a definition so much as a description of the pattern -there is a difference. The description should include probably 95% (possibly higher, but still a very rough guess) of the SDCs one is likely to find, but obviously there are variations being posted that don't exactly fit my description. I'm keeping a record of these variations and hope to post further on them.


I'm still uncertain if an SDC allows for more eliminations in some cases than a standard 2 sector DDS or an ALS chain.


Neither am I, but that relates to why I put up the ALS Chain Tutorial because the two subjects are closely related. It would be quite interesting to see if anyone can come up with a SDC elimination that can't be replicated otherwise. My guess is that if it is possible, it will be with one of the SDC variations rather than the generic form. Still, as I've said elsewhere, I don't much care because I don't see much evidence that people can readily see the 2 sector DDS patterns as they occur in SDC, yet I'm able to pick out SDC patterns with no trouble at all now using the simplified description.
DonM
2013 Supporter
 
Posts: 487
Joined: 13 January 2008

Postby PIsaacson » Wed Nov 19, 2008 12:47 am

The subject of frequency of SDCs in puzzles has been brought into question, so I ran my C++ solver against several well known sets of sudoku puzzles and collected statistics on:
1) Frequency/percentage of SDCs in a puzzle based on DonM's coherent simplification for manual location of SDCs.
2) Frequency/percentage of DDSs which span 2 sectors and are limited to 6 cells/candidates maximum for speed of execution.
3) Frequency/percentage of successful solutions based on using SSTS + SDC only.

Point (3) requires some explanation since my C++ solver differs somewhat from Simple Sudoku. My C++ solver executes in order:
Code: Select all
naked/hidden singles
row/col/box interaction 
subsets size 2-4 naked and hidden
fishing based on Pattern Overlay Method (POM) which is similar to Ronk's base/cover set fishing algorithm
coloring based on X-colors/X-cycles single digit analysis
dds subset counting limited to 2 sectors with {1111...} multiplicities for 6 cells/candidates.

I extended the top1465 to 7 and 8 candidates with corresponding order-of-magnitude increases in average solution times.
Code: Select all
Statistics for royle17 (max 6 candidates)

36628 puzzles took 438.40 sec total  average 11.97 msec/puzzle

Sue deCoq    992 out of  36628 2.71%
2 sec DDS   5005 out of  36628 13.66%
Solved     35918 out of  36628 98.06%

Statistics for ruud top10000 (max 6 candidates)

10000 puzzles took 4057.74 sec total  average 405.77 msec/puzzle

Sue deCoq   1415 out of  10000 14.15%
2 sec DDS   7885 out of  10000 78.85%
Solved        48 out of  10000 0.48%

Statistics for ruud top50000 (max 6 candidates)

50000 puzzles took 20363.12 sec total  average 407.26 msec/puzzle

Sue deCoq  29012 out of  50000 58.02%
2 sec DDS 104507 out of  50000 209.01%
Solved     17364 out of  50000 34.73%

Statistics for gsf8152 (max 6 candidates)

8152 puzzles took 656.81 sec total  average 80.57 msec/puzzle

Sue deCoq    206 out of   8152 2.53%
2 sec DDS    698 out of   8152 8.56%
Solved       229 out of   8152 2.81%

Statistics for top1465 (max 6 candidates)

1465 puzzles took 318.02 sec total  average 217.08 msec/puzzle

Sue deCoq    317 out of   1465 21.64%
2 sec DDS   1132 out of   1465 77.27%
Solved       576 out of   1465 39.32%

Statistics for top1465 (max 7 candidates)

1465 puzzles took 6802.01 sec total  average 4643.01 msec/puzzle

Sue deCoq    377 out of   1465 25.73%
2 sec DDS   1252 out of   1465 85.46%
Solved       589 out of   1465 40.20%

Statistics for top1465 (max 8 candidates)

202 puzzles took 11468.35 sec total  average 56774.00 msec/puzzle

Sue deCoq     45 out of    202 22.28%
2 sec DDS    131 out of    202 64.85%
Solved       141 out of    202 69.80%

Some observations:
1) Sue deCoqs are not needed to solve "easy" sudoku puzzles, so the royle17 showed a fairly low frequency/percentage of SDCs.
2) Sue deCoqs appear in about 20% (conservatively) of the "harder" sudoku puzzles.
3) Sue deCoqs promoted about 40% of the top1465 and top50000 puzzles to a solution.
4) The top1465 with the DDS 8 candidate limit was stopped early (after only 202 puzzles) due to the fact that I computed it would require 24 hours to complete and I had other things to do on the computer. The first 202 puzzles may not be indicative of the final results since the difference between 6 candidate DDSs vs. 7 candidate DDSs was negligible except for the run time increase.
5) Run time statistics (for max 6 candidates) are a relative measure of the number of potential DDSs tested, as opposed to degree of difficulty. That's why the top1465 experienced an order-of-magnitude+ increase in average run time going from 6 to 7 to 8 candidates, but without a significant difference in the frequency/percentage statistics.

So what does this all mean? I'm not 100% sure, but it looks to me like SDCs occur frequently enough in the more difficult puzzles to warrant some serious attention. Especially since they promote solutions in a significant percentage of the puzzles in which they occur.

Once again, thanks Don! SDCs are a valuable weapon to have in an arsenal of techniques.
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby David P Bird » Wed Nov 19, 2008 2:52 am

PIsaacson, From the manual analysis I've done, I've yet to find a true Sue de Coq that can't be expressed as a series of ALS chains, one per digit eliminated. I've therefore believed that if I have an efficient way of identifying ALS chains, I'd capture all SdC eliminations and a whole lot more too. Admitted my solutions won't be so elegant, but I've always thought I'd get there.

By way of illustration using Ronk's example which you classify as a "distributed disjoint set":

Code: Select all
Ruud50k #3284
......695.3.5....8....4......63...4.15.2.6.39.2...47......9....2....8.7.391......

After SSTS
 478   1478  278   | 78-1  1238  1237  | 6     9     5
 67    3     279   | 5     126   1279  | 4     12    8
 5     168   289   | 689-1 4     129   | 3     12    7
-------------------+-------------------+------------------
 789   78    6     | 3     158   159   | 125   4     12
 1     5     4     | 2     7     6     | 8     3     9
 89    2     3     | 89-1  158   4     | 7     56    16
-------------------+-------------------+------------------
 4678  4678  78    |B14    9     235-1 | 125   56    12346
 2     46    5     |B146   3-16  8     | 9     7     1346
 3     9     1     |A467  A256  C257   |C25    8     46-2

 Sets: A = {r9c45} = {24567}; B = {r78c4} = {146}; C = {r9c67} = {257}
 Elims: r136c4,7c6<>1, r8c5<>16, r9c9<>2

BTW I'd notate this example as:
(124567)DLS:r78c4,r9c4567[(25)r9, (14)c4, (67)b7] => r7c6<>1, r8c5<>16, r9c9<>2

ie a disjoint locked set with 6 cells to fill with 6 digits each restricted to 1 instance, as shown by listing the containing houses for each.
For me, this method of unpicking the notation makes it much easier to check for errors.

As ALS chains though:
(1=4)r7c4 - (4)r9c4 = (4-6)r9c9 = (6)r9c45 - (6=14)ALS:r78c4 => r136c4,r7c6,r8c5 <> 1
(6)r9c9 = (6)r9c45 - (61=4)ALS:r78c5 - (4)r9c4 = (4)r9c9 => r9c9 <> 2
(146)ALS:r789c4 = (7)r9c4 - (257)ALS:r9c567 = (6)r9c5 => r8c5 <> 6

Your post suggests that there are DLS eliminations to be had which can't be caught using ALS chains. I suspect that these are probably are for bigger sets than I would've analysed.

I'd appreciate a couple of examples of these to see how I might need to extend my manual search methods.

Don please don’t take this as a criticism, it isn't. All I'm saying is that if I'm going to plough through various ALS chains anyway, my methods shouldn't miss SDCs. Nowadays I do look briefly for the basic form by checking for two bi-values pointing to a line-box intersection containing the 4 digits which is very quick and easy to do, but I don't go looking for the more exotic variations.

[Edit 1] Additional (1) eliminations shown following PIsaacson's response.
[Edit 2] Rephrased aside to Don.
Last edited by David P Bird on Wed Nov 19, 2008 7:01 pm, edited 2 times in total.
David P Bird
2010 Supporter
 
Posts: 1043
Joined: 16 September 2008
Location: Middle England

Postby PIsaacson » Wed Nov 19, 2008 4:08 am

David,

I am an avid fan of ALSs, and I still look at most SDCs as ALS chains. But I also look at them as a complete unit ala DDS and use multiplicity computations to find all the reductions.

For instance, in Ronk's example that you cite, you didn't include the r136c4 <> 1 reductions due to the alignment of the 1's in the DDS, or due to looking at it as a simple dual-linked ALS chain: r789c4 {1467} = r9c567 {2567} common restricted {67}

In both the DDS/ALS cases, the reductions are the same. It's the SDC that gives me grief because of the elimination rules.

Your ALS chains look like niceloops with embedded ALS groups? If so, then they are not quite the same as what I refer to as ALS chains.
Ref: http://www.sudoku.org.uk/SudokuThread.asp?fid=4&sid=9448&p1=1&p2=11
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Postby eleven » Wed Nov 19, 2008 4:13 am

Code: Select all
 478   1478  278   | 178   1238  1237  | 6     9     5
 67    3     279   | 5     126   1279  | 4     12    8
 5     168   289   | 1689  4     129   | 3     12    7
-------------------+-------------------+------------------
 789   78    6     | 3     158   159   | 125   4     12
 1     5     4     | 2     7     6     | 8     3     9
 89    2     3     | 189   158   4     | 7     56    16
-------------------+-------------------+------------------
 4678  4678  78    |*14    9     235-1 | 125   56    12346
 2     46    5     |*146   3-16  8     | 9     7     1346
 3     9     1     |*467  #256  #257   |@25    8     46-2


Sue de Coq's like ronks above (or those in tarek's puzzles) with a bivalue cell in the row or column are rather easy to find.

For each bivalue cell look at the boxes that cross the row and column. If you find 2 cells in the minirow(-column), which contain the digits, look, what is left. E.g. 89 in r6c1 leaves 15 in r6c45. Now if in this box you can find another cell with 15 or 2 only containing 15x or 3 with only 15xy, you have one.
Not in this case, but with the 25 in r9c7. It leaves 67 in r9c56 and there is a 3-cell ALS (1467).

If it leaves 3 numbers, like the 78 in r4c2 does in r13c2, you would need 2 146's (or a 3 cell ALS 146x) in the rest of the box.
eleven
 
Posts: 3150
Joined: 10 February 2008

Postby ronk » Wed Nov 19, 2008 4:16 am

hobiwan wrote:
ronk wrote:Is this pattern a genuine SDC:?:

I see no reason why it shouldn't be. r9c6 is in r9 and does not belong to r9c45. But if you don't like it you can always use

SdC: r9c456 - {24567} (r9c7 - {25}, r78c4 - {146}) or
SdC: r9c56 - {2567} (r9c7 - {25}, r789c4 - {1467})

Including my original (using your notation) ...

SdC: r9c45 - {24567}, (r9c67 - {257}, r78c4 - {146})

... I agree that all three are genuine SDCs.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

PreviousNext

Return to Advanced solving techniques