## Constraint subsets

Advanced methods and approaches for solving Sudoku puzzles

### Constraint subsets

I've been giving a lot of thought to a discussion begun on the programmers' forum, and I've made a rather startling insight. I'd like to present a technique I call constraint subsets, and another, multi-box-multi-line interactions, which appears only in larger sudoku. This method was inspired by skewed swordfish, but it turns out that within a box, you don't have to worry about whether the box has only 3 values in a 3x3 pattern. Constraint subsets are the grandfather of all subsets, including positional subsets like X-wings and swordfish, and also cover all box-line interactions.

I'm using the term "constraint" here to mean what it does in dancing links: That is, the digit within a cell, or the position of a digit in a box, column, or row.

The technique goes like this: If you have a set, A, of N constraints which do not have the same placements, and they share the same N other constraints (set B), you can eliminate all choices in B which do not occur in A.

In clearer language: If for a digit you have N rows and/or boxes, whose values all occur in the same N columns, and those values do not occur twice in the pattern, you can eliminate all other values in the common columns. (The rule that a value may not occur twice is like this: If your pattern has row 2, row 4, box 5, and box 7 all intersecting columns 1, 2, 4, and 6, this technique does not work if one of the candidates in the pattern is at r4c4; it's in both row 4 and box 5.) Allow me to illustrate.
Code: Select all
`. . *|. * .|* . ** . .|* . .|* * *a . .|. . .|b . c-----------------* . *|. * .|. * .d . .|. . .|e . f* * *|. * *|* * *-----------------. . *|* * .|g . h* * .|. . *|i . j. * .|. . *|k . .`

Let's say these are all the possibilities for digit 6. Here the pattern has row 3, row 5, and box 9. These cover columns 1, 7, and 9. No matter where a 6 appears in box 9, it will fill either column 7 or column 9. That leaves an X-wing with rows 3 and 5 in either columns 1 and 7, or 1 and 9. The end result is clear: Columns 1, 7, and 9 can be cleared of all other unlabeled values.
Code: Select all
`. . *|. * .|- . -- . .|* . .|- * -a . .|. . .|b . c------------------ . *|. * .|. * .d . .|. . .|e . f- * *|. * *|- * ------------------. . *|* * .|g . h- * .|. . *|i . j. * .|. . *|k . .`

The important factor in recognizing the proof of this pattern is this: If you pin down a value in any of the boxes in the pattern, one column is filled; eventually this reduces you to a simple X-fish. Any value you pin down in one of the rows from there also fills in one column. Naturally you can do this with columns and boxes intersecting common rows, as well.

Now here's the other cool technique I mentioned: multi-box-multi-line interactions. This only occurs in sudoku, in the absence of lesser techniques, where you can have 4+ boxes lined up. That means it can happen in an 8x8 or anything larger than a 9x9. Consider the pointing pair/triple: If all possibilities for a digit in a given box are all in the same line, the rest of that line can be eliminated. You can use the logic of subsets to say: If all possibilities for a digit in 2 boxes are all in the same 2 lines, the rest of the choices in those 2 lines can be eliminated. So, you can have multiple pointing sets, or multiple box-line intersections. These too are simple constraint subsets: A standard box-line interaction is merely a constraint subset size 1.
Lummox JR

Posts: 125
Joined: 22 September 2005

### Re: Constraint subsets

The constraint subsets technique is cool. I would like to find an actual grid where this technique can be applied. Thanks.

The 'multi-box-multi-line interactions' technique appears to be same as the 'locked candidates' technique in Angus's solver. Is it not?
Jeff

Posts: 708
Joined: 01 August 2005

### Re: Constraint subsets

Lummox JR wrote:The rule that a value may not occur twice is like this: If your pattern has row 2, row 4, box 5, and box 7 all intersecting columns 1, 2, 4, and 6, this technique does not work if one of the candidates in the pattern is at r4c4; it's in both row 4 and box 5.

I agree the constraining sets must be disjoint. Using your illustration:
... and broadening the possibilities in the box, we have ...
Code: Select all
`. . *|. * .|- . -- . .|* . .|- * -a . .|. . .|b . c------------------ . *|. * .|. * .d . .|. . .|e . f- * *|. * *|- * ------------------. . *|* * .|g . h- * .|. . *|i . j. * .|. . *|k . m`

Given the equations for the constraining sets:
a+b+c=1
d+e+f=1
(g+i+k)+(h+j+m)=1 (an ell looks too much like a one)

... the inequalities for the intersecting sets ...
a+d<=1
b+e+(g+i+k)<=1
c+f+(h+j+m)<=1

... can be proved to be ...
a+d=1
b+e+(g+i+k)=1
c+f+(h+j+m)=1

... precisely because the constraining equations are independent, in other words, because the "constraining sets" are disjoint.

BTW, I prefer sequentially lettering the candidates in the box from top to bottom ... so as to end up with sequential letters (g+h+i) and (j+k+m) in the equations.

Is that illustration based on an actual puzzle?
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

### Re: Constraint subsets

Jeff wrote:I would like to find an actual grid where this [ed: constraint subset] technique can be applied.

Puzzle #664 of the top870 is one I presented here. I have a small library of other examples
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Puzzle #833 of the top870 is an example with columns instead of rows. Using simple techniques and "forward implication chains", one can get the original puzzle ...
Code: Select all
` *-----------* |...|.8.|...| |...|9..|...| |9.5|3..|486| |---+---+---| |.1.|...|...| |.4.|.68|..2| |...|1..|69.| |---+---+---| |...|.9.|.67| |6.8|..3|2..| |..2|...|1..| *-----------*`

... to this ...
Code: Select all
` *-----------* |...|.8.|9..| |...|9..|...| |9.5|3.1|486| |---+---+---| |.16|..9|...| |.49|568|.12| |...|1..|69.| |---+---+---| |...|.9.|.67| |6.8|713|2..| |..2|...|1..| *-----------*`

...with candidates ...
Code: Select all
`12347  2367   1347   | 246    8      24567  | 9      2357   135   123478 23678  1347   | 9      2457   24567  | 357    2357   135   9      27     5      | 3      27     1      | 4      8      6     ---------------------+----------------------+--------------------258    1      6      | 24     2347   9      | 3578   3457   3458  37     4      9      | 5      6      8      | 37     1      2     258    258    37     | 1      2347   247    | 6      9      3458  ---------------------+----------------------+--------------------1345   35     134    | 248    9      245    | 358    6      7     6      59     8      | 7      1      3      | 2      45     459   3457   3579   2      | 468    45     456    | 1      35     3589  `

... with pinned and candidate 7 locations at ...
Code: Select all
` *  *  a  | .  .  d  | .  k  .   *  *  b  | .  *  e  | g  h  .   .  7  .  | .  7  .  | .  .  .  ----------+----------+---------- .  .  .  | .  7  .  | 7  7  .   7  .  .  | .  .  .  | 7  .  .   .  .  c  | .  *  f  | .  .  .  ----------+----------+---------- .  .  .  | .  .  .  | .  .  7   .  .  .  | 7  .  .  | .  .  .   7  7  .  | .  .  .  | .  .  .  `

Three sets (columns 3 and 6 plus box 3) constrain digit 7 to three rows (rows 1, 2, and 6). Therefore all other candidate 7s in those rows may be eliminated (the *s).

The "forward implication chain" eliminations to get to the above were r5c1#5, then r5c4#7 and r8c4#4.

**************
Another example is puzzle #204 of the same top870. After several simple steps, pinned and candidate 6s are as follows:
Code: Select all
` .  .  .  | .  6  .  | .  .  .   *  .  6  | .  .  .  | 6  *  .   a  .  .  | .  .  .  | .  b  c  ----------+----------+---------- *  6  6  | .  .  .  | .  g  j   *  .  6  | .  .  .  | .  h  k   .  .  .  | .  .  6  | .  .  .  ----------+----------+---------- .  6  6  | 6  .  .  | 6  *  .   d  .  .  | .  .  .  | .  e  f   *  6  .  | 6  .  .  | 6  *  .  `

The constaining sets are in row 3, row 8, and box 6 ... with the eliminations again shown with the asterisk.
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Jeff, I suspect multi-box-multi-line interaction is not the same as locked candidates in Angus's solver, though I haven't used it. If the terminology matches that of Trebor's Sudoku Susser, then locked candidates are a simple naked subset. Multi-box-multi-line will not appear in a 9x9 because standard box-line interaction appears instead.

And nope, my example isn't from an actual puzzle, but is built in such a way that it's plausible.
Lummox JR

Posts: 125
Joined: 22 September 2005

It occurred to me last night that there's another kind of constraint subset worth mentioning that I did not initially explore. In this kind you have N rows and columns, whose values do not overlap and all occur within N boxes. Observe:
Code: Select all
`. . *|+ . .|* . .* + .|+ * *|. * .+ . +|. + .|. . .-----------------+ . +|. + +|. . .. + .|. * .|. * .. + *|+ . *|* * .-----------------* . *|. . *|. . .. . *|. * .|. . .. . .|. . .|. . .`

Here the +'s represent the choices that are part of the pattern, and *'s represent all others. Columns 2 and 4, rows 3 and 4, have all of their possible values within boxes 1, 2, 4, and 5. Note also that these columns and rows do not have any choices available at their intersections, so we adhere to the no-overlap rule. By the logic of constraint subsets, all other choices in boxes 1, 2, 4, and 5 may be eliminated.

To consider it in more depth: Picture what would happen if r2c1 was a valid choice. Suddenly box 1 has been accounted for. The four columns and rows must now have placements within only three boxes. This is a problem, however, because the fact that they don't overlap means that if you fill one of those columns or rows, you do not fill another. One of the columns or rows from the pattern must then have no choices at all, which is an inconsistency. r2c1, therefore, is not a valid choice. The same logic applies to all the other asterisked cells in those boxes.
Code: Select all
`. . -|* . .|* . .- * .|* - -|. * .* . *|. * .|. . .-----------------* . *|. * *|. . .. * .|. - .|. * .. * -|* . -|* * .-----------------* . *|. . *|. . .. . *|. * .|. . .. . .|. . .|. . .`
Lummox JR

Posts: 125
Joined: 22 September 2005

Lummox JR wrote:there's another kind of constraint subset [ed: where] you have N rows and columns, whose values do not overlap and all occur within N boxes.

Good find. That all looks valid to me ... somewhat like four variables and four independent equations in algebra.

I would be inclined to say N/2 rows and N/2 columns, making N=4 for your illustration ... a "constraint jellyfish" so to speak.
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Interesting, this last one is the opposite of a similar pattern.

Your is: if you have 2 row and 2 columns which are covered 4 boxes and whose 4 intersections are empty, then the other cells in the 4 boxes can be excluded.

The other is: if you have 2 rows and 2 columns which intersect 4 boxes and the other cells in the 4 boxes are empty, then the 4 intersections can be excluded. [edit: the cells in the 2 rows and columns outside of the 4 boxes can be excluded as well]

E.g.

Code: Select all
`. . .|+ . .|* . . . + .|+ . .|. * . + + +|+ + .|+ + . ----------------- + . +|+ + +|. + + . + .|. . .|. * . . + .|+ . .|* * . ----------------- * + *|. . *|. . . . . *|+ * .|. . . . + .|. . .|. . .`

becomes

Code: Select all
`. . .|+ . .|* . . . + .|+ . .|. * . + - +|- + .|- - . ----------------- + . +|- + +|. - - . + .|. . .|. * . . + .|+ . .|* * . ----------------- * - *|. . *|. . . . . *|- * .|. . . . - .|. . .|. . .`
Last edited by Nick70 on Thu Nov 24, 2005 4:21 am, edited 1 time in total.
Nick70

Posts: 156
Joined: 16 June 2005

Another aspect of this technique I didn't explore in my original post is that you can have either set, A or B, be the set that mixes two types of constraints. (In theory you could do both with a constraint problem, but I'm not sure that's possible in sudoku. I don't even think you can mix the "digit in cell" constraints into this unless you're just using simple subsets.) To generalize the entire technique: If you have a set of N constraints A, whose choices only appear in a set of N constraints B, and those choices are found in only one member each of A and B, then you can eliminate all other choices in B.

Consider the following:
Code: Select all
`. . *|* . .|. . .. . .|. * .|. . *. + *|. * *|+ + *-----------------. + .|* . .|+ . *. . *|. * .|. . .* . .|* * *|. . *-----------------. . .|. * .|+ . *. . *|* . .|+ + ** . *|. . .|. + .`

Here all choices in columns 2, 7, and 8 are in row 3, row 4, and box 9. You can therefore eliminate the other choices in row 3, row 4, and box 9.
Code: Select all
`. . *|* . .|. . .. . .|. * .|. . *. * -|. - -|* * ------------------. * .|- . .|* . -. . *|. * .|. . .* . .|* * *|. . *-----------------. . .|. * .|* . -. . *|* . .|* * -* . *|. . .|. * .`

There is also an inverse form of the box-based constraint subset I posted earlier.
Code: Select all
`. + .|. . +|. * .+ . +|+ + .|* * .. + .|. . +|. . .-----------------. + .|. . +|* . .+ . +|+ + .|. * *. + .|. . +|. . .-----------------* * .|. . *|. . .* . .|. . *|. . *. * .|. * .|. * .`

Here all choices for boxes 1, 2, 4, and 5 appear in columns 2 and 6 and rows 2 and 5. No choice appears where the columns and rows intersect, obeying the overlapping rule. You may then eliminate all other values in columns 2 and 6, rows 2 and 5.
Code: Select all
`. + .|. . +|. * .+ . +|+ + .|- - .. + .|. . +|. . .-----------------. + .|. . +|* . .+ . +|+ + .|. - -. + .|. . +|. . .-----------------* - .|. . -|. . .* . .|. . -|. . *. - .|. * .|. * .`
Lummox JR

Posts: 125
Joined: 22 September 2005

Nick70 wrote:The other is: if you have 2 rows and 2 columns which intersect 4 boxes and the other cells in the 4 boxes are empty, then the 4 intersections can be excluded.

I think that works only if the other cells in the 2 columns and 2 rows are also empty. Then the intersections would always violate any attempt to satisfy all 4 boxes and all 4 of the columns and rows. It's an interesting corollary, but it only applies to the case where all of the constraints are otherwise filled; then and only then is it safe to eliminate the overlaps.
Lummox JR

Posts: 125
Joined: 22 September 2005

Lummox JR wrote:I think that works only if the other cells in the 2 columns and 2 rows are also empty.

You mean outside of the 4 boxes? No, they don't have to be empty.
Nick70

Posts: 156
Joined: 16 June 2005

Hrm, I see that the intersections can be eliminated, but not necessarily by a corollary to this technique. For a corollary to apply the choices would have to be constrained both to set A and set B. Otherwise I just don't see any set logic related to this technique that forces out those intersections. If you can find a way to do that though I'd love to hear it, because it would probably open some doors.

What I did realize about the intersections is that they can be eliminated by a hidden pattern that was originally found using templates. I finally realized the similarity when I tried to place one of the intersections in your example. You'll notice the pattern eliminates both the intersections and the target rows and columns outside the four boxes.

With constraint subsets we've come a step closer to understanding that pattern. The only thing missing is whatever you found to remove the intersections, because if it can be applied at all to other constraint subsets then this technique might be a great deal more powerful than I thought.
Lummox JR

Posts: 125
Joined: 22 September 2005

Hello, who can tell me how to make a nice, formated "code".
Thanks.
Guest

Posts: 312
Joined: 25 November 2005

Lummox JR wrote:There is also an inverse form of the box-based constraint subset I posted earlier.

Inverse? In the mathematics of set theory, is this really a corollary to either multiplicative inverse or additive inverse of other mathematical disciplines?

I don't know the answer to that question, because I don't know set theory ... but suspect it's actually a principle of reciprocity that applies here. And it may be that (not yet undefined here) principle of reciprocity that eliminates the candidates at the intersections of the rows and columns in Nick70's illustration.

Keep up the good work. You're definitely on a roll.
ronk
2012 Supporter

Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Next