The current Sudoku solution methods are mainly based on counting and implication chains. The part about implication chains is well understood. Basically it is just propositional calculus. Every basic step in a chain is based on counting, and some known counting arguments can be generalized and simplified a bit.
Every region contains each digit precisely once. This can be used in two directions: On the one hand it means that if some set of cells is covered by N regions, it can contain some digit d at most N times. On the other hand it means that if some set of cells contains all occurrences of some digit d in N pairwise disjoint regions, it must contain the digit d at least N times.
For N=1 the "at most" part leads to weak links, and the "at least" part leads to strong links. Very similar arguments work for N > 1.
Subset Counting example
Myth Jellies pushed me to start a new thread in a follow-up to a post describing such an argument - maybe the example given there is a good illustration. Look at the (partial) situation
- Code: Select all
. . . | . . . | . . .
. . . | . . <9>| . 79 789
. . . | . . . | . . 789
------------+-------------+------------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
------------+-------------+------------
. . . | . . 37 | . . 79
. . . | . . . | . . .
. . . | . . 39 | . . .
The goal is to eliminate a possible 9 in cell (2,6).
Consider the set of six cells of which the candidates are shown. The possible digits are 3,7,8,9 with maximum multiplicities 1,2,1,3, respectively.
(Why? All 3s are in a single box, so at most one 3. The same holds for 8. All 7s are either in row 7 or in box 3, so there are at most two 7s in these six cells. Finally, the 9s occur in three boxes, so there cannot be more than three of them.)
Now 1+2+1+3=7 and we have 6 cells. If some outside choice decreases the maximum by 2, then we do not have enough digits to fill the cells, and the outside choice is impossible. But the choice of 9 in cell (2,6) would diminish the maximum number of 9s from 3 to 1 (all 9s that do not see cell (2,6) are in one column), and this shows (2,6)!9.
Remarks
Most reasoning involving locked sets, almost locked sets, Sue de Coq, X-wing, XY-wing, XYZ-wing etc is a special case of this subset counting technique.
Also some arguments commonly thought of as chain arguments can be phrased as counting arguments. In particular, remote pairs and, more generally, bivalue forcing chains are a special case.
The contradiction obtained is that an upper bound (set covered by few regions) contradicts a lower bound (set has at least as many digits as cells). One can also obtain a lower bound by counting for each digit d the number of pairwise disjoint regions of which all occurrences of d are contained in the chosen set.
Now e.g. swordfish, and all ordinary nice loops are a special case.
(Thus, one has the two equivalent views of a nice loop: as a chain of propositions or as a single object.)
(How? Given the chain x-a=b-c=d-e=f-x, consider the set a,b,c,d,e,f. Counting (digit,region) pairs there is a lower bound of 3 for the three regions containing a=b, c=d, e=f. But if x holds, then there is an upper bound of 2 found by considering b-c, d-e. One can also count true propositions and come to the same conclusion.)
Chains are basically linear, but subsets do not have such a restriction. This makes subset counting in principle a more powerful technique.
Of course, finding the right subset to consider is not necessarily easier than finding the right chain.
If the set of cells is large, it can become nontrivial to determine the maximum number of occurrences of some digit in the set. The extreme case is that where one takes all cells that do not see a given cell. If the maximum number of occurrences of d is less than 8, then the given cell does not have d. This is Nishio.