This is an improved and extended version of my first post in this thread. As everybody is looking for solutions based on "pure logic", I think my Trial and Error theorem should raise some interest.
FIRST ORDER LOGIC (FOL)

First order logic (FOL), also called "predicate calculus", is the background of the framework described here. If you don't know anything about it, an elementary reference is:
http://en.wikipedia.org/wiki/First-order_logic

First order logic should not be confused with common-sense logic. One thing you should keep in mind is that even if something is not illogical in the common sense of the word, it doesn't follow that it can be expressed in FOL.
The reason is that FOL has a strict formalism.
A first order theory is restricted to a well defined topic, which you may discuss only in a well defined language.
In particular, it doesn't include other mathematical theories, such as number theory or set theory, and
FOL doesn't allow arbitrary mathematical operations.
Like Algebra, Differential Geometry and many other topics, Mathematical Logic is a specialised field of mathematics. Most mathematicians are highly specialised in their own field and don't know much about other fields. In particular, they generally know very little about FOL.
The reason is that they work within the framework of "naive set theory", where they can build sets of subsets, sets of functions and so on, they can use quantifiers on functions, subsets, subsets of subsets and so on, and they don't want to be restricted in these possibilities.
But
FOL doesn't allow such constructions.
FOL appears naturally as the proper formalism for Sudoku because everybody is looking for solutions using only "pure logic".
FOL provides the kind of restrictions able to give a precise meaning to the idea that a solution has to be obtained by "pure logic".
RESOLUTION RULES AND RESOLUTION PATHS
Definition: A resolution rule is a formula of first order logic (or predicate calculus), in the language of Sudoku, that satisfies the two conditions:
- it is a logical consequence of the Sudoku axioms,
- it is written in the condition => action form, where "action" can only be the assertion of value(s) and the negation of candidate(s) in some cells associated to the pattern (the target cells).
The condition is also called the pattern of the rule.

The language of Sudoku is built on a very limited number of primary predicates and associated atomic formulæ:
n1 = n2
r1 = r2
c1 = c2
b1 = b2
value(n, r, c)
candidate(n, r, c)
where n, r, c,… are variables (and = is the usual equality sign).
This is a natural modelling choice, justified by the fact that they constitute the minimum common to all players.
According to Occam's razor principle, any addition to these basic predicates should be justified by some necessity.
Auxiliary predicates can of course be defined. Formally, they are just formulæ with free variables.
For instance, share-a-unit(r1, c1, b1, r2, c2, b2) is defined as: r1 = r2 or c1 = c2 or b1 = b2
Other predicates that can be easily defined are:
- bivalue
- bilocation (or conjugate)
- nrc-linked
- nrc-conjugate,…
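To make these definitions concrete, here is a minimal Python sketch of the share-a-unit and bivalue predicates; the `cands` dictionary, mapping (row, column) pairs to candidate sets, is a hypothetical representation of my own and not part of the formalism:

```python
def share_a_unit(r1, c1, b1, r2, c2, b2):
    # share-a-unit(r1,c1,b1, r2,c2,b2) := r1 = r2 or c1 = c2 or b1 = b2
    return r1 == r2 or c1 == c2 or b1 == b2

def block(r, c):
    # block index of cell (r, c) on a standard 9x9 grid, rows/columns 0..8
    return 3 * (r // 3) + (c // 3)

def bivalue(cands, r, c):
    # bivalue(r, c): the cell has exactly two candidates left
    return len(cands[(r, c)]) == 2
```

Note that these are just formulæ with free variables, exactly as stated above; the functions carry no state and make no deductions.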
In the above definition, the condition part (which is itself a FOL formula) describes a possible situation on a grid, e.g. a Naked-Triplets, an xy4-chain, an nrczt5-chain, …
A resolution rule is proven once and for all, and the way it has been proven, by any valid mathematical method, has no impact of any kind on its logical validity. In particular, whether or not reasoning by cases has been used in the proof is totally irrelevant to its validity.
Being logically valid doesn't prevent a rule from being more or less easy to apply, but this has nothing to do with the way it has been proven. The problems
- of proving a rule once and for all (generally these proofs are very straightforward)
- and of spotting the occurrences (instantiations) of its defining pattern on a real grid
are totally independent.
Definition: A resolution path is a sequence of occurrences of resolution rules with their conclusions - with no additional ad hoc pieces of reasoning. A resolution path is thus a mathematical proof of the solution.

Unfortunately, examples given in forums very rarely satisfy this definition, although they often could.
This is mainly due to:
- ambiguous examples with incomplete resolution paths and unsuitable notations,
- a generalised confusion, on this forum and others, between the description of a purely factual situation (e.g. the existence of a general link in my sense, i.e. a unit being shared between two cells, or of a conjugacy link) and the way this can be used in an inference step. In this regard, the notions of strong and weak links are certainly the most confusing ones ever introduced - leading e.g. to surrealistic debates on whether a strong link is also a weak link.
The above definitions don't have to be understood in a fundamentalist sense - which doesn't mean they may be relaxed:
1) A resolution rule can be written formally in FOL, but it can also be written in plain English. In this case, to be sure there is a real resolution rule behind it, the only three things you have to be careful about are:
- respect the condition => action form;
- in the condition part, or when you define new auxiliary predicates from the elementary ones (e.g. a new type of chain), use only boolean combinations and quantifications; always use only precise factual conditions (and not undefined concepts such as weak or strong link);
- in the action part, put only assertion of values and deletion of candidates; there can be no "or".
2) Logical formulæ can use auxiliary predicates, which can be given any convenient intuitive form.
The graphical representations I use for chains ARE logical formulæ - and they are very short.
As an elementary example, here are the formulæ for an xy4, an hxy-rn4 and an hxy-cn4 chain:
rc/- {1 2}* - {2 3} - {3 4} - {4 1}*
rn/- {1 2}* - {2 3} - {3 4} - {4 1}*
cn/- {1 2}* - {2 3} - {3 4} - {4 1}*
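To make the xy4 formula concrete, here is a hedged Python sketch of the corresponding resolution rule; the `cands` dictionary and the function names are my own assumptions, and the reasoning is the standard xy-chain argument: either the first cell or the last cell holds the starred candidate, so that candidate can be deleted from any cell seeing both.

```python
def same_unit(a, b):
    # two cells share a row, a column or a 3x3 block
    (r1, c1), (r2, c2) = a, b
    return r1 == r2 or c1 == c2 or (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3)

def xy4_eliminations(cands, chain, d):
    """Condition: chain = [c1, c2, c3, c4] are bivalue cells with candidates
    {d1,d2}, {d2,d3}, {d3,d4}, {d4,d1}, consecutive cells sharing a unit.
    Action: d1 can be deleted from any other cell seeing both c1 and c4."""
    d1, d2, d3, d4 = d
    expected = [{d1, d2}, {d2, d3}, {d3, d4}, {d4, d1}]
    if any(cands[cell] != s for cell, s in zip(chain, expected)):
        return set()
    if any(not same_unit(a, b) for a, b in zip(chain, chain[1:])):
        return set()
    # either c1 = d1 or (c1 = d2 -> c2 = d3 -> c3 = d4 -> c4 = d1)
    return {cell for cell in cands
            if cell not in chain and d1 in cands[cell]
            and same_unit(cell, chain[0]) and same_unit(cell, chain[3])}
```

Note that the function has exactly the condition => action form required above: it checks a purely factual pattern and then only deletes candidates.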
3) Basic AICs (what I call nrc-chains) are resolution rules.
Of course, xy, xyt, xyzt, xhy, hxyt, hxyzt, nrc, nrct, nrcz and nrczt chains are resolution rules.
4) As soon as you consider that you can shift the rows or columns, you have variables and you are already doing FOL (and not propositional calculus). There is no difference between Math and Logic in the use of variables. A variable in a formula is the same thing as a variable in a polynomial.
5) There is no reason to broaden the definition of a resolution rule (e.g. by accepting "or statements" in the action part; that would radically change the framework).
The problem is not to extend the definition but to find additional, not necessarily formal, conditions a valid resolution rule should satisfy in order to be useful in practice.
RESOLUTION RULE vs RESOLUTION TECHNIQUE vs REPRESENTATION TECHNIQUE

Whereas I have given the phrase "resolution rule" a precise, purely logical definition, this is not yet the case for the word "technique". First, I think two different kinds of techniques should be defined.
A "resolution technique" is a procedure that can help get closer to a solution by eliminating candidates or asserting values.

Some colouring techniques, depending on what one means by this, can be resolution techniques in this sense.
A "representation technique" is a representation that can be the support for several resolution techniques.

Marking conjugate candidates with upper- and lower-case letters is a technique in this sense. It can be used for finding Nice Loops or AICs.
The rn- and cn- spaces I've introduced in HLS are representation techniques in this sense.
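As an illustrative sketch (the `cands` dictionary is again my own hypothetical data structure), the rn- and cn- spaces simply re-index the candidates: for each (row, number) pair, the rn-space records the set of columns where that number is still a candidate, and symmetrically for the cn-space:

```python
def rn_space(cands):
    # rn-space: (row, number) -> set of columns where number is a candidate
    rn = {}
    for (r, c), digits in cands.items():
        for n in digits:
            rn.setdefault((r, n), set()).add(c)
    return rn

def cn_space(cands):
    # cn-space: (column, number) -> set of rows where number is a candidate
    cn = {}
    for (r, c), digits in cands.items():
        for n in digits:
            cn.setdefault((c, n), set()).add(r)
    return cn
```

Nothing is asserted or eliminated here: this is precisely what makes it a representation technique rather than a resolution technique.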
One important notion is that of a resolution technique being associated to or being the implementation of a resolution rule.
A given resolution rule can, in practice, be applied in various ways. Said otherwise, different resolution techniques can sometimes be attached to a single resolution rule.
This distinction is important. Forgetting it may lead to the absurd conflation of an algorithm with a logical formula - two notions that do not belong to the same universe of discourse (computer science vs logic).
A first example of this has been given in the xyt-thread, where my initial technique of drawing arrows from the right linking candidate of a cell to the left linking candidate of the next cell and John's modified xy-colouring technique can both be used to spot xyt-chains.
As another alternative, within each associated technique, the search for an xyt-chain can start from a target cell (e.g. because one would like to eliminate some candidate in this cell) or from the first cell in the chain or from a previously obtained shorter chain.
Another example of this has been given in my post 319 in the "fully supersymmetric chains" thread, where several implementations of the nrc(z)(t) chains have been described.
A formal definition is the following:
A resolution technique is the implementation of a resolution rule if, for any partially filled grid with candidates, applying the technique to this grid has the same final effect (values asserted and candidates deleted) as what can be logically concluded from the resolution rule.

The validity of a resolution rule is based on logic. The validity of a technique is based either on its conformity with the underlying resolution rule, if there is one, or on the algorithm describing it.
For a technique that is not based on a resolution rule, it may be very difficult to guarantee that it doesn't amount to T&E.
Most, but not all, of the usual techniques could be re-written as resolution rules.
TRIAL AND ERROR
Definition: T&E is the following depth-first search procedure: recursively make ad hoc hypotheses on values that you do not know to be true at the time you make them, write all their consequences (candidates eliminated and values added), with the possibility of retracting each of these hypotheses and their consequences if they lead to a dead end.

T&E is a resolution technique in the above sense.
In fact, T&E is a family of resolution techniques, depending on how you write "all the consequences" of a hypothesis, i.e. on which family of resolution rules you adopt for this purpose. This is not really important, because such consequences are only used to prune the tree of possibilities, i.e. to restrict the search.
There may also be variants of T&E depending on how many solutions you want in case the puzzle has several: this will be defined by the exit conditions of the algorithm.
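The procedure just described can be sketched as a recursive depth-first search. In this hypothetical Python version, the only "consequence rule" used for pruning is naked singles; any richer family of resolution rules could be plugged in instead, and the exit condition chosen here is to stop at the first solution found:

```python
import copy

def same_unit(a, b):
    # two cells share a row, a column or a 3x3 block
    (r1, c1), (r2, c2) = a, b
    return r1 == r2 or c1 == c2 or (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3)

def propagate(cands):
    # write the consequences of the current hypotheses (naked singles only);
    # returns False on a contradiction (a cell left with no candidate)
    changed = True
    while changed:
        changed = False
        for cell, digits in cands.items():
            if len(digits) == 1:
                (v,) = digits
                for other in cands:
                    if other != cell and same_unit(cell, other) and v in cands[other]:
                        cands[other].discard(v)
                        changed = True
                        if not cands[other]:
                            return False
    return True

def t_and_e(cands):
    # depth-first Trial and Error on a dict {(row, col): candidate set}
    if not propagate(cands):
        return None                       # dead end: retract the hypothesis
    open_cells = [c for c in cands if len(cands[c]) > 1]
    if not open_cells:
        return {c: next(iter(d)) for c, d in cands.items()}
    cell = min(open_cells, key=lambda c: len(cands[c]))
    for v in sorted(cands[cell]):         # ad hoc hypothesis: cell = v
        trial = copy.deepcopy(cands)      # so the hypothesis can be retracted
        trial[cell] = {v}
        solution = t_and_e(trial)
        if solution is not None:
            return solution
    return None                           # all hypotheses led to dead ends
```

Changing the exit condition (e.g. collecting all solutions instead of returning the first) yields the other variants of the family mentioned above.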
Being an algorithm, Trial and Error can obviously not be a resolution rule: the two pertain to different universes of discourse (algorithmics vs logic). But we have the stronger result:
Theorem: Trial and Error is a resolution technique that cannot be the implementation of any resolution rule.

Proof: since T&E can be applied any time a cell has at least two candidates, if it were expressible as a resolution rule, the condition of this rule could only be:
"if a cell has at least two candidates".
But it is obvious that, from such a condition, no general conclusion can be obtained (no value can be asserted and no candidate can be eliminated).
Recently, I came up with a stronger version and an alternative proof.
T&E theorem (stronger version): Trial and Error is a resolution technique that cannot be the implementation of any set of resolution rules.

Second proof: if a puzzle has more than one solution, T&E is guaranteed to find one (or several, or all, depending on the exit condition we put on the T&E algorithm - again, T&E is a family of algorithms, and the theorem applies to any variant).
By contrast, since a set S of resolution rules can only lead to conclusions that are logical consequences of the axioms and the entries, if a puzzle has several solutions, S cannot find any of them.
E.g. if there are two solutions such that r1c1 is 1 in the first and 2 in the second, S can prove neither r1c1=1 nor r1c1=2. It can therefore find none of these solutions.
q.e.d.