by denis_berthier » Wed Nov 12, 2008 12:15 am
After the avalanche of posts in the past ten days, it seems necessary to recast the focus on the topic of this thread.
I'll therefore summarise here the formal framework I've defined to model the sudoku problem and the vague notion of a "pure logic" solution.
GENERAL FRAMEWORK
The mathematical backgound is fisrt order logic (FOL). For a logic game, that shouldn't be too surprising.
As there are other approaches to sudoku solving, there could be different frameworks (cover sets in Allan's approach, graph theory and colouring problems, ...) but I think FOL is the most natural one, the one closest to what players are used to do. This was already the inspiring idea in the first edition of my book and it has never changed: keeping the formal concepts as close as possible to the natural ones.
In this framework, there is a very limited number of basic concepts: cell, row, column, block, number, value, candidate. They shouldn't sound too unfamiliar.
All the other concepts are defined from the above basic ones: bivalue, bilocation, naked pair, hidden pair, swordfish, ...., nrc-link*, chain, many specific types of chains, .... This allows to give very clear and non ambiguous definitions (e.g. a degenerated quad consisting of 2 pairs is not called a quad).
A major aspect of the framework is that all the concepts used are purely factual: they describe a situation on a grid in terms of what can be observed. Patterns are observables, they are not defined in terms of what wants to do with them (as is the case with views centred on a supposed "inference level").
A resolution rule has the form "pattern => action",
where "pattern" is defined as a combination of factual concepts as those mentioned above
and "action is the assertion of value(s) and/or the elimination of candidate(s).
A resolution path is the application of a sequence of resolution rules, starting from a given puzzle and ending, hopefully, in a solution (or in a sate where no rule can be applied).
Once more, none of this this should be too unfamiliar for a player.
Of course, any concept, be it simple or complex (quad, nrc-link, nrczt-chain, ...) is always given both an easily understandable natural language definiton and a formal one in FOL.
AXIOMS
There are only the four axioms that have always been the sudoku axioms everywhere in the world since the beginning of Sudoku and that remain the constraints you can see in any newspaper or any puzzles book: only one value in each cell, only one occurrence of each number in each row, column, block.
These axioms satisfy symmetry and super-symmetry relations that can be exploited in the definition of the resolution rules.
Symmetry can also be exploited if one chooses to use the extended Sudoku board I've designed with the four rc, rn, cn and bn spaces.
Adopting rules based on the assumption of uniqueness is an option. For a player, uniqueness can only be an assumption: he has no control on it. Either he accepts this oracle or he doens't.
FAMILIES OF RESOLUTION RULES
Several times, I've mentioned that the rules I've been considering with particular attention can be classified into four large families:
1) the elementary constraints propagation rules or ECP: elimination of candidates via propagation of direct contradictions along rows, columns and blocks,
2) the basic interaction rules between rows and blocks and between columns and blocks: BI
3) the subset rules: Naked, Hidden and Super-Hidden (or fish) versions of Singles, Pairs, Triplets, Quads,
4) the xy-to-nrczt (xy, hxy, xyt, hxyt, xyz, hxyz, xyzt, hxyzt, nrc, nrct, nrcz, nrczt) family of chain rules (including chains, lassos and whips).
Of course, this doesn't imply that only these rules should be used. The framework is completely open to the addition of new rules.
I've already shown the unity of the subset rules through symmetry and super-symmetry. This group of rules can easily be extended with all the variants of fish that have been appearing in the past years.
I've also shown why all the rules in the xy-to-nrczt family can be considered as progressive and natural generalisations of the basic xy-chain rule. This includes nrc-chains, equivalent to the basic NLs or AICs (basic meaning "with no ALSs"). Although nrczt-chains subsume the whole family, the other chains are easier to find and should therefore not be forgotten. This family is thus organised in a pedagogical hierarchy.
The above rules are enough to solve almost any randomly generated puzzle (indeed, in several tens of thousands of randomly generated puzzles, I've met none that they couldn't solve).
From experiments with hundreds of puzzles taken from various forums, it appears that they are enough to solve puzzles upto SER=9.3. In particular, they are used daily by human players on the French sudoku-factory forum to solve puzzles at SER 9.1 to 9.3. They can probably solve any puzzle that any expert human player can solve and even much more.
All this shows that, in spite of the simplicity of the framework, it allows to define simple and very powerful resolution rules.
It is important to recall this, because the sequel will discuss more complex patterns and I want to make it clear that such patterns, especially braids, are not necessary but for exceptionally hard puzzles.
RULES FOR EXCEPTIONALLY HARD PUZZLES
Considering the current interest for exceptionally hard puzzles, I've recently extended my set of resolution rules in two directions. Notice that such powerful extensions don't require any change to the framework.
Firstly, I've generalised my idea of additional z- and t- candidates and I've introduced a very general principle, zt-ing, that allows to define new patterns and associated new chain rules from any family FP of basic patterns: zt-whip(FP). zt-whipping is a general method (more general than the classical almost-ing) for including in chains/whips any pattern having an associated resolution rule.
In essence, the generalisation to nrczt-whip(FP) consists of allowing the right-linking candidates of a whip to be whole patterns instead of mere candidates.
If one takes FP=ECP+NS+HS, one gets the ordinary nrczt-whips.
If one takes FP=ECP+NS+HS+BI, one gets the grouped or hinged nrczt-whips.
If one takes FP = ECP+NS+HS+BI+SubsetRules, one gets a new family of chain patterns, whip(ECP+NS+HS+BI+SubsetRules), more general than nrczt-chains, AICs with ALSs and than their grouped or hinged counterparts. Indeed, these whips contain any AAAALS, AHHHHS and any AAAAAA-Fish, with as many A's or H's as one wants.
These patterns are chains, in exactly the same sense as any nrczt-chain or any AIC. They can be considered as defining a new set of levels in the hierarchy defined by family 4.
Such generalisations of nrczt-chains have also already been used in the sudoku-factory forum (see solutions by Caravail, Abi and Christian Barrucchi) - although in these cases ordinary nrczt-whips were enough.
Secondly, with the introduction of nrczt-braids, I have added a fifth family, composed of a (relatively mild) kind of nets, still with some linear structure.
The above two extensions can be combined, leading to zt-braids(FP) for any family of patterns with associated resolution rules.
I've also shown the close relationship between braids and Trial-and-Error (T&E) - the standard T&E procedure (which is not itself a resolution rule) with no guessing and no recursion: for any family FP, any elimination that can be done by T&E(FP) can be done by some zt-braid(FP). For players who use T&E for hard puzzles, this should soften their pain when they read that T&E is the abomination. The only thing that remains the abomination in this framework is guessing.
All the above chain/whip/braid patterns have two very important properties (defined in detail in the first posts of this thread), helpful in practice for finding them:
- they are non-anticipative, i.e. the validity of a partial whip/braid depends only on the previous candidates (and it can therefore be checked on-the-fly)
- they are composable, i.e. partial whips/braids can be linked together to make longer ones, with obvious compatibility conditions at the junction.
* Added 11/13/08
Definition: two candidates n1r1c1 and n2r2c2 are nrc-linked if they are different and:
- either n1=n2 and the 2 rc-cells r1c1 and r2c2 "see" each other (i.e. are in the same row, column or block)
- or n1 <> n2 and r1c1 = r2c2
"nrc-linked" defines the purely factual situation corresponding to the existence of a direct contradiction between 2 candidates. It is the main super-symmetric auxiliary predicate built on the basic ones.