## Two-Sector Disjoint Subsets

Advanced methods and approaches for solving Sudoku puzzles

### Two-Sector Disjoint Subsets

Here's a new technique. It doesn't crop up that often and, since we now have forced chains to crack all but the toughest puzzles, it arguably isn't of much practical assistance. However, it's very elegant, so here goes. Consider the following puzzle:

Code: Select all
` 4 . . | 2 . . | 1 . .  . . 6 | . . 4 | . . .  7 . . | 8 5 . | . . . -------+-------+------  2 5 . | 4 . . | . . 1  . . . | . . . | . . .  8 . . | . . 3 | . 9 7 -------+-------+------  . . . | . 7 9 | . . 3  . . . | 1 . . | 6 . .  . . 8 | . . 2 | . . 4 `

Standard techniques take us to:

Code: Select all
`        4      3|8|9      3|5 |  2  3|9  6 |      1      7    5|8   1|3|5|9  1|2|3|8|9        6 |  7  3|9  4 |  2|8|9  2|3|8  2|5|8         7      2|3|9      2|3 |  8    5  1 |  2|4|9  2|3|4      6 ------------------------------+------------+---------------------         2          5        9 |  4    8  7 |      3      6      1       3|6      3|6|7      3|7 |  9    1  5 |  2|4|8  2|4|8    2|8         8        1|4      1|4 |  6    2  3 |      5      9      7 ------------------------------+------------+---------------------       1|6    1|2|4|6      1|4 |  5    7  9 |    2|8  1|2|8      3       3|5      2|3|7  2|3|5|7 |  1    4  8 |      6    2|5      9     1|5|9        1|9        8 |  3    6  2 |      7    1|5      4 `

The cells {r1c2,r2c2,r3c2} must contain three of the five values {1,2,3,8,9}. Now, the set can't contain 2 and 3, or we wouldn't be able to fill r3c3. Similarly, the set can't contain 1 and 9, or we wouldn't be able to fill r9c2. Therefore, the three cells must contain an 8, one of {2,3} and one of {1,9}. Critically, this means that the values {2,3} in Box 1 and the values {1,9} in Column 2 are accounted for, so we're in a position to make lots of eliminations - 3 from r1c3 (gives us a 5), 3 from r2c1, 1 from r6c2 (gives us a 4) and 1 from r7c2. Magic! Even better, there are similar constructions elsewhere. Note {r2c7,r2c8,r2c9} contains three of {2,3,5,8,9}, with the other values in r1c9 and r2c5 and that {r7c1,r7c2,r7c3} contains three of {1,3,5,6,9} with the other three values in r5c1 and r9c2, so we eliminate r2c1=9, r2c2=3, r2c2=9 and r7c3=1. Sudoku carnage!

I'll write at greater length tomorrow but in the meantime try the following puzzles, which, if you've understood the above, you should be able to solve without recourse to forced chains:

Code: Select all
` . 8 . | . 2 . | . . .  1 . . | 7 . . | 6 . .  . . 3 | . . 5 | . . . -------+-------+------  8 . . | 9 . . | 1 . 2  . 7 . | . . . | . 3 .  4 . 5 | . . 6 | . . 7 -------+-------+------  . . . | 1 . . | 5 . .  . . 9 | . . 4 | . . 8  . . . | . 3 . | . 7 .  . . . | 2 . . | . 5 .  . . . | 4 . 9 | . . 6  . . 7 | . . . | 4 . 1 -------+-------+------  8 . . | . . 7 | 2 . .  . 3 . | . . . | . 9 .  . . 9 | 6 . . | . . 7 -------+-------+------  4 . 2 | . . . | 3 . .  6 . . | 8 . 5 | . . .  . 9 . | . . 1 | . . . `
Sue De Coq

Posts: 93
Joined: 01 April 2005

### Re: Two-Sector Disjoint Subsets

Sue De Coq wrote:Here's a new technique.

Brilliant.
A 'Sue De Coq' .
angusj

Posts: 306
Joined: 12 June 2005

I found the 'Sue de Coq' which unlocked your first example:

In the second example I can see the 'Sue de Coq' but it doesn't seem to provide any eliminations:

I've just been through the first 25 puzzles of my 'unlimited' sudoku collection (puzzles which can't be solved without forcing chains or other 'exotic' techniques) and unfortunately couldn't find one puzzle where this technique could be used.
angusj

Posts: 306
Joined: 12 June 2005

i. In the illustrative example, I should have pointed out that in principle it's possible to remove 8s from Box 1 and Column 2 - however, there are no candidate 8s to remove.

ii. In the second test puzzle, the key cells are {r4c8,r4c9}. The values {1,3,4,6} are shared between these two cells, r4c2 and r6c8.

iii. The most general form of the pattern is as follows.

Consider the set of unfilled cells C that lies at the intersection of Box B and Row (or Column) R. Suppose |C|>=2. Let V be the set of candidate values to occur in C. Suppose |V|>= |C|+2. The pattern requires that we find |V|-|C| cells in B and R, with at least one cell in each, with candidates drawn entirely from V. Label the sets of cells CB and CR and their candidates VB and VR. Crucially, no candidate is allowed to appear in VB and VR. Then C must contain V\(VB U VR) [possibly empty], |VB|-|CB| elements of VB and |VR|-|CR| elements of VR. The construction allows us to eliminate the candidates V\VR from B\(C U CB) and the candidates V\VB from R\(C U CR).
Sue De Coq

Posts: 93
Joined: 01 April 2005

Sue De Coq wrote:iii. The most general form of the pattern is as follows.

OK, I hadn't appreciated the more generalised form. Well done again.
angusj

Posts: 306
Joined: 12 June 2005

I recently ran across this "Sue De Coq" which yielded 10 eliminations, the most I've seen so far.

Code: Select all
`sets A = {r7c5,r8c5,r9c5} = {23459}   (orange)     B = {r7c6} = {34}   (pink)     C = {r5c5} = {29}   (pink)Excluding sets A, B, and C ...    Candidates 3, 4, and 5 may be eliminated from box 8   (blue -- 5 elims)   Candidates 2, 9, and 5 may be eliminated from col 5  (green -- 5 elims)`

Including the above #121, my computer program found 22 puzzles of the top870 that could be advanced by the SueDeCoq ... after basic techniques were stymied. The "basic techniques" did not include forcing chains, except for the xy-wing. The puzzle numbers are:
Code: Select all
`The 22 puzzle numbers are:    17,  20,  62,  64, 121, 182, 238, 244, 307, 335, 372,    408, 489, 501, 589, 616, 628, 633, 651, 673, 710, 717`

In the above sequence, the puzzles are:

..247..58..............1.4.....2...9528.9.4....9...1.........3.3....75..685..2...
.237....68...6.59.9.....7......4.97.3.7.96..2.........5..47.........2....8.......
38.6.......9.......2..3.51......5....3..1..6....4......17.5..8.......9.......7.32
53..2.9...24.3..5...9..........1.827...7.........981.............64....91.2.5.43.
.6....83..25....1.....16.....7.6....8..4.57.3.......8...2...19..5...8...4....7...
.3........7.6.19...4.....2.2..8.6.....9...1......5....4....2..5.....8.1.....1.734
1...7956.8....2.............61..79..3..1.........6.4....2.9....6.9...14....8...2.
1..64.....7.......9.4.....1....6..79.3..7...4.4..583...5....9...........4.629.5..
..7..2...9..3..4..........6.8..7..2..13....4......9.1.7..9.6.....1...8..2....5...
.2.4.5........9.......3.28....3....93..8.47..4....652...1......64.9....27.....3..
...39..72.......95.7...53..............1..2....7.48.1..69....5..3...98..5.87...4.
....6...1.3.2..6..8..49.....5....1.72........7.4....3.6...7.5.......2..8..28.476.
..2...4..9.563.8..........6....6........73.9...8.15...7.63...5.8......4.3...2.1..
.1.........5...9..2...8...5...6...7..4.3...1.5.9..1..8....2.4..1..8.3.6......7...
.7......69..73....8.2.5.9..5........4...96....8.4..2......8.15...35....4...9.....
9...6.1.......1......2...76..23.5....8.9....2.6.....8.4.3........57..6......5.93.
.3....2..9.4.7.5.11...5.3.....4.5.........7.96...3...5.9....6...8...2...4..1.7..3
.3..8...4......36..6...1..2..3..2.......3815...8....2.2...64..9.862......7.......
35.6.......4.......2..7.89......5....7..1..2....4......36.9..1.......5.......7.82
.56...3...7.....8.8.3..1..7.1.78.9.....9............432856.........2...5.6...91..
..3.7..6.8..35..........3.127.9..8...9...8.....5...61..1.....4.........27...641..
.4....37...3.7.49...7.8......16............2359...2...........96.51.........3.51.

Disappointingly, only 2 of the 22 were subsequently solvable with "basic techniques" only. The puzzles advanced to a solution were #20 and #121 (#2 and #5 above).

For a related topic, see Almost locked rules
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

AngusJ & Sue de Coq

In that second grid, would a better example be (r4c8, r4c9) and r4c2, r6c8...lots of eliminations there, I think.
Myth Jellies

Posts: 593
Joined: 19 September 2005

Myth Jellies wrote:In that second grid, would a better example be (r4c8, r4c9) and r4c2, r6c8...lots of eliminations there, I think.

That was also Sue's advice to Angus. BTW, I count 3 eliminations.
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Sue De Coq wrote:iii. The most general form of the pattern is as follows.

Consider the set of unfilled cells C that lies at the intersection of Box B and Row (or Column) R. Suppose |C|>=2. Let V be the set of candidate values to occur in C. Suppose |V|>= |C|+2. The pattern requires that we find |V|-|C| cells in B and R, with at least one cell in each, with candidates drawn entirely from V. Label the sets of cells CB and CR and their candidates VB and VR. Crucially, no candidate is allowed to appear in VB and VR. Then C must contain V\(VB U VR) [possibly empty], |VB|-|CB| elements of VB and |VR|-|CR| elements of VR. The construction allows us to eliminate the candidates V\VR from B\(C U CB) and the candidates V\VB from R\(C U CR).

Very impressive. I wish I'd thought of this extension. Only a minor comment for future visitors, the preconditions are slightly too strict; it is sufficient that C contains no candidates that occur elsewhere in the intersection.
Pep

Posts: 12
Joined: 29 November 2005

Pep wrote:
Sue De Coq wrote:iii. The most general form of the pattern is as follows.

Consider the set of unfilled cells C that lies at the intersection of Box B and Row (or Column) R. Suppose |C|>=2. Let V be the set of candidate values to occur in C. Suppose |V|>= |C|+2. ....).

... the preconditions are slightly too strict; it is sufficient that C contains no candidates that occur elsewhere in the intersection.

Would you please expand upon your point, or preferably even illustrate with an example?

TIA, Ron
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

ronk wrote:
Pep wrote:
Sue De Coq wrote:iii. The most general form of the pattern is as follows.

Consider the set of unfilled cells C that lies at the intersection of Box B and Row (or Column) R. Suppose |C|>=2. Let V be the set of candidate values to occur in C. Suppose |V|>= |C|+2. ....).

... the preconditions are slightly too strict; it is sufficient that C contains no candidates that occur elsewhere in the intersection.

Would you please expand upon your point, or preferably even illustrate with an example?

TIA, Ron

In a normal 9x9 Sudoku, let the three cells at a box/row intersection be labelled X, Y and Z with candidates (1234), (1234) and (89) respectively. Then the set C can consist of just {X,Y}, although Z is not filled. The wording of the original precondition has an (English language) implication that C contains all unfilled cells at the intersection. As I said, a minor comment.

Pep
Pep

Posts: 12
Joined: 29 November 2005

Pep wrote:In a normal 9x9 Sudoku, let the three cells at a box/row intersection be labelled X, Y and Z with candidates (1234), (1234) and (89) respectively. Then the set C can consist of just {X,Y}, although Z is not filled.

I don't think that's correct. I believe Z could also contain one or more of candidates (1234) and those candidates could safely be eliminated.
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

ronk wrote:
Pep wrote:In a normal 9x9 Sudoku, let the three cells at a box/row intersection be labelled X, Y and Z with candidates (1234), (1234) and (89) respectively. Then the set C can consist of just {X,Y}, although Z is not filled.

I don't think that's correct. I believe Z could also contain one or more of candidates (1234) and those candidates could safely be eliminated.

I agree. I think one needs to test for all combinations (C = XY, XZ, YZ and XYZ), and see if any suitable "satellites" (visible tuples made up of candidates from V) can be found.

Hmm, I think I'll just add this to my solver soon and report back any surprises. ;)
vidarino

Posts: 295
Joined: 02 January 2006

vidarino wrote:
ronk wrote:
Pep wrote:In a normal 9x9 Sudoku, let the three cells at a box/row intersection be labelled X, Y and Z with candidates (1234), (1234) and (89) respectively. Then the set C can consist of just {X,Y}, although Z is not filled.

I don't think that's correct. I believe Z could also contain one or more of candidates (1234) and those candidates could safely be eliminated.

I agree. I think one needs to test for all combinations (C = XY, XZ, YZ and XYZ), and see if any suitable "satellites" (visible tuples made up of candidates from V) can be found.

Yes, agreed. I was too hasty in my immediate reaction.

Having thought about it, I now think I understand disjoint sets and have come up with the following characterisation.
Given
0.An integer N > 0.
1.A set C of N cells {C1, C2, ... CN}, all distinct
2.A set V of N candidate values {V1, V2, ... VN}
3.If value v is a candidate in Cj, then v is a member of V; and if v is a member of V, then there is a member Ck in C which has v as a candidate. This ensures that V contains all and only those candidate values that are found in at least one member of C.
4.If Vi is a member of V, then there exists at least one unit (i.e. row|col|box) Ui such that for each Cj in C, if Cj has Vi as a candidate, then Cj is in Ui. This means that all the cells in C which contain the candidate Vi are all in the same unit. (There may be more than one such unit for a given Vi.)
5.If Vi = Vj, then i = j. In other words, all the Vi are distinct. Even this restriction can be relaxed – see later.

Then
For each Ui defined in condition 4, Vi can be eliminated from all cells in Ui which are NOT in C.

Proof
Condition 4 ensures that for each Vi in V, there can be no more than one cell in C which contains Vi in the final solution, since all places where Vi are found are in the same unit.

But if Vi does not exist in C, then, by the pigeonhole principle, some other Vk must exist more than once, which is impossible. Hence there is exactly one of each Vi in C.

So in each Ui, the cells that could contain Vi have been identified, and the others can have Vi eliminated from them.

When N is 1,2 or 3, then it is the same as naked singles, pairs and triples.

Example 1 with 6 cells:
Code: Select all
`+---------------+---------------+---------------+|  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  | ab    .    .  |  .    .    .  |+---------------+---------------+---------------+|  .    .    cd | abcd cdef  .  |  *c   *d   .  ||  .    .    .  | abef  ef   .  |  .    .    .  ||  .    .    .  |  .    .   *ef |  .    .    .  |+---------------+---------------+---------------+|  .    .    .  |  *a   .    .  |  .    .    .  ||  .    .    .  |  *b   .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  |+---------------+---------------+---------------+(ab) are restricted to c4, (cd) to r4, and (ef) to box5, but the six values are in the 6 cells. The starred candidates can be eliminated.`

Example 2 with six cells:
Code: Select all
`+---------------+---------------+---------------+|  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  *4   .   *26 |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  |+---------------+---------------+---------------+|  .    .    .  |  15   .   126 |  *1   .    .  ||  .    .    .  |  45   .    56 |  .    .    .  ||  .    .    .  |  .    *5   .  |  .    .    .  |+---------------+---------------+---------------+|  .    .    .  |  34   .   236 |  *3   .    .  ||  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  |+---------------+---------------+---------------+`

Example 3 with 8 cells:
Code: Select all
`+---------------+---------------+---------------+|  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  |+---------------+---------------+---------------+|  .    .    .  | 178   .   168 |  12   .    .  | <-- 1|  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    45 |  .   578   .  |  .    .    .  | <-- 5+---------------+---------------+---------------+|  .    .    34 |  .    .    36 |  23   .    .  | <-- 3|  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  |+---------------+---------------+---------------+             ^               ^     ^             |               |     |            7 & 8 are in the central box.             4               6     2`

If we examine example 3 again, and do a substitution of 3 in place of 1 in the three cells in row 4, then we can still use the same logic to eliminate 3 from the rest of row 4 because we can consider the two 3s as different candidate values when the corresponding containing units (the Ui) do not intersect.

So we can relax somewhat condition 5 to become:
5'. If i != j, then no corresponding Ui of condition 4 can intersect any corresponding Uj. If there is more than one possible Ui or Uj, then all must comply.

Can anyone see anything wrong with this?

Now all I have to do is find an efficient algorithm to spot these cases....
Pep

Posts: 12
Joined: 29 November 2005

Pep wrote:Example 1 with 6 cells:
Code: Select all
`+---------------+---------------+---------------+|  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  ||  .    .    .  | ab    .    .  |  .    .    .  |+---------------+---------------+---------------+|  .    .    cd | abcd cdef  .  |  *c   *d   .  ||  .    .    .  | abef  ef   .  |  .    .    .  ||  .    .    .  |  .    .   *ef |  .    .    .  |+---------------+---------------+---------------+|  .    .    .  |  *a   .    .  |  .    .    .  ||  .    .    .  |  *b   .    .  |  .    .    .  ||  .    .    .  |  .    .    .  |  .    .    .  |+---------------+---------------+---------------+(ab) are restricted to c4, (cd) to r4, and (ef) to box5, but the six values are in the 6 cells. The starred candidates can be eliminated.`

If 'a' and 'b' are candidates in r3c4, they must necessarily also be candidates elsewhere in row 3 and elsewhere in box 2. Therefore, by "(ab) are restricted to c4", you can't possibly mean 'a' and 'b' are candidates only in col 4 of the entire grid.

However, if one defines sets:
A = {r3c4} = {ab}
B = {r4c4,r4c5,r5c4,r5c5} = {abcdef}
C = {r4c3} = {cd}

... then 'ab' are restricted common values to sets A and B, and 'cd' are restricted common values to sets B and C. Restricted meaning ... if one set contains the value, another cannot. Is that what you mean?

I'm still trying to decide if your overall proposal is correct.
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Next