REDUCING THE NUMBER OF STEPS
1) TOPIC OF THIS THREAD
This thread is about finding a resolution path with a reduced number of steps (compared e.g. to a resolution path obtained by the simplest-first strategy).
To make things clear: any well defined resolution rules should be totally independent of the strategy used to put them to work (which order you choose to apply them when several of them can be applied, according to which criteria...). This should be obvious from their definitions (and it should also be obvious to anyone using CSP-Rules and probably also any other interactive solver). Fortunately, this is true of all the resolution rules currently used on this forum.
Given a set of resolution rules T (= a resolution theory), my "standard" simplest-first strategy has the main advantage of computing the rating of a puzzle (wrt to this set of rules) but it tends to produce many steps, especially if various types of rules are activated. This may be seen as a second advantage (as it shows many possibilities a manual player could use) or as a disadvantage, depending on one's goals. It is NOT the purpose of this thread to start a discussion about this point.
Moreover, I've always said that the simplest-first strategy is for a computer program and that a human solver doesn't have to follow it (and in general will not); so please, don't waste our time by trying to push at open doors.
To re-iterate and elaborate, this thread is about the techniques allowing to reduce the number of steps wrt to a fixed, properly chosen set of resolution rules.
There are two special cases that can (more or less) be excluded from our considerations here: 1-step and 2-step solutions. In such cases, a systematic review of all the possible 1or2-step paths is possible (see all the puzzles in this forum where I and others have proposed such solutions; see also CSP-Rules user manual (https://github.com/denis-berthier/CSP-Rules-V2.1 or https://www.researchgate.net/profile/De ... r/research) for how to do it.
Unfortunately, in case no such 1or2-step solution exists, a systematic exploration of all the resolution paths is impossible in practice, due to combinatorial explosion. In this case, any "fewer steps" algorithm has to choose each step "randomly" among the "most promising ones" selected among all the available ones.
Being based on random choices, the final result depends on how many tries one makes and no one can ever be sure that a shorter path doesn't exist.
As a result of the above remarks, the heuristics for defining "promising steps" may in practice be the main topic of this thread. Illustrating how they work on particular examples is also part of this thread.
AFAIK, the idea of reducing the number of steps using some random choices among those selected as "promising" is due to François Defise.
2) THE GENERAL FRAMEWORK
Let me now make all this slightly more formal.
I'll consider two resolution theories T and T0, with T0 ≤ T (T0 included in T).
T is the set of rules we want to use to solve a puzzle.
T0 is the set of rules we consider as no-step.
<>Typical examples for T0:
BRT (basic resolution theory) = Singles + eliminations by direct contradictions between a decided value and a candidate
W1 = BRT + whips[1]
S2 = W1 + Subsets[2] (i.e. + Naked, Hidden and Supper-Hidden Pairs)
S3 = S2 + Subsets[3] (i.e. + Naked, Hidden and Supper-Hidden Triplets)
S= S4 = S3 + Subsets[4] (i.e. + Naked, Hidden and Supper-Hidden Quads)
(Super-Hidden = Fish)
Notice that I consider that only BRT and W1 can rationally be considered as no-step, especially as Subsets of any size tend to eliminate lots of candidates.
<>Typical examples for T:
T will be defined by a set of rule types and their maximal length.
To make it concrete, in CSP-Rules this means the following choices (the semi-colons mean rules that I don't usually consider, but that could be).
- Code: Select all
(defglobal ?*all-chains-max-length* = 8)
(bind ?*Bivalue-Chains* TRUE)
(bind ?*z-Chains* TRUE)
(bind ?*Whips* TRUE)
(bind ?*Typed-Bivalue-Chains* TRUE)
(bind ?*Typed-z-Chains* TRUE)
(bind ?*Typed-Whips* TRUE)
; (bind ?*G-Whips* TRUE)
; (bind ?*Braids* TRUE)
; (bind ?*Typed-Braids* TRUE)
; (bind ?*G-Braids* TRUE)
; (bind ?*Oddagons* TRUE)
In case T0 contains Subsets, T must contain the same ones (this is what T0 ≤ T means)
; (bind ?*Subsets[2]* TRUE)
; (bind ?*Subsets[3]* TRUE)
; (bind ?*Subsets[4]* TRUE)
<> Maximum allowed value for all the chains:
The maximum allowed value for all the chains (?*all-chains-max-length*) must be selected carefully. If a puzzle has a solution using only bivalue-chains[3], it would be totally absurd to look for a shorter resolution path involving any chains of length 10.
3) COMMENTS
One problem about trying to "optimise" full resolution paths is, no rating of a full path has ever been defined in any rational way. Merely adding the lengths of each step (as this is done in some generators/solvers) is total nonsense. As the complexity of a step increases "exponentially" with its size (i.e. the length of a chain or the size of a Subset), it'd be more rational to consider a function such as (sum (exp(length, a)), where a is some parameter to be determined. Anyway, that'd be totally useless, as it is impossible to use it in practice for any path optimisation.
The (much more modest) approach considered here almost totally avoids such problems. By carefully selecting the allowed maximum length of all the chains (based on the simplest-first solution), only the number of steps has to be taken into account.
IMO, this allowed maximum length shouldn't be much more than the maximum length reached in the simplest-first solution (i.e. equal to it in the best case and not more than it +1 or +2 in any case).
4) EXAMPLES
First examples can be found here (one from Defise one from me)
http://forum.enjoysudoku.com/post299359.html?hilit=few%20steps#p299359
http://forum.enjoysudoku.com/pisces-9-0-t39227.html#p307092
More examples forthcoming.