## Pattern-based classification of (hard) puzzles

Advanced methods and approaches for solving Sudoku puzzles

### Re: Pattern-based classification of (hard) puzzles

logel wrote:The current rating is, in the first place, a rating of eliminations.

Yes; a rating of the hardest elimination (with respect to a given set of rules). As I mentioned previously, nobody has ever been able to propose any rating of full paths.

logel wrote:But each elimination consists only of a series of secondary eliminations on the next recursion level.

No, absolutely not. Each elimination step in all my resolution paths is complete in itself. It includes no "secondary eliminations".

logel wrote: On the road to simple resolution rules you already arrived at the bottom, as obviously T&E(2) with only single eliminations can solve any 9×9 Sudoku.

Obviously?
This claim first supposes a clear definition of T&E - which did not exist before I gave one. Some people even confused T&E with guessing.
Even after this, there is not yet any formal proof that T&E(2) is enough for all puzzles. It remains a conjecture (although a very strongly motivated one, especially after my B?B classification results on the hardest known puzzles).
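Since the thread keeps referring to it, a minimal sketch of T&E(1) may help fix ideas: assert a candidate in a copy of the grid, propagate with a singles-only rule set, and eliminate the candidate if a contradiction follows. This is only an illustration under my own naming conventions; the rule set used inside each trial (naked and hidden singles) is a simplifying assumption, not SudoRules.

```python
DIGITS = set(range(1, 10))
CELLS = [(r, c) for r in range(9) for c in range(9)]

def units_of(r, c):
    """The three houses (row, column, box) containing cell (r, c)."""
    row = [(r, j) for j in range(9)]
    col = [(i, c) for i in range(9)]
    br, bc = 3 * (r // 3), 3 * (c // 3)
    box = [(br + i, bc + j) for i in range(3) for j in range(3)]
    return [row, col, box]

def propagate(g):
    """Apply naked/hidden singles to a fixpoint.
    Returns False on contradiction, True otherwise."""
    changed = True
    while changed:
        changed = False
        for (r, c) in CELLS:
            if not g[(r, c)]:
                return False            # emptied cell: contradiction
            if len(g[(r, c)]) == 1:     # naked single: prune its peers
                d = next(iter(g[(r, c)]))
                for unit in units_of(r, c):
                    for cell in unit:
                        if cell != (r, c) and d in g[cell]:
                            g[cell].discard(d)
                            changed = True
        for (r, c) in CELLS:            # hidden singles
            for unit in units_of(r, c):
                for d in DIGITS:
                    places = [cell for cell in unit if d in g[cell]]
                    if not places:
                        return False    # digit has no place in a house
                    if len(places) == 1 and len(g[places[0]]) > 1:
                        g[places[0]] = {d}
                        changed = True
    return True

def te1(g):
    """T&E(1): assert each candidate in a copy; if propagation with
    singles reaches a contradiction, eliminate that candidate."""
    while True:
        if not propagate(g):
            return False                 # puzzle itself is inconsistent
        if all(len(v) == 1 for v in g.values()):
            return True                  # solved
        progress = False
        for cell in CELLS:
            for d in sorted(g[cell]):
                if len(g[cell]) == 1:
                    break
                trial = {k: set(v) for k, v in g.items()}
                trial[cell] = {d}
                if not propagate(trial):
                    g[cell].discard(d)   # eliminated by contradiction
                    progress = True
        if not progress:
            return None                  # beyond T&E(1) with singles

# Demo: a grid that is valid by construction, with one box blanked out;
# propagation alone restores it, so te1 never needs to branch here.
sol = {(r, c): {(3 * r + r // 3 + c) % 9 + 1} for (r, c) in CELLS}
puzzle = {cell: set(DIGITS) if cell[0] in (3, 4, 5) and cell[1] in (3, 4, 5)
          else set(sol[cell]) for cell in CELLS}
```

Harder puzzles exercise the branching loop; the demo only checks that the machinery runs.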

But the goal of the resolution rules is to replace the T&E procedure by pure logic solutions. The concrete effect is that the output of T&E is an illegible text, tens of pages long, whereas the resolution path with e.g. braids is a sequence of fully justified eliminations.

logel wrote: Any other rule set just hides recursion steps inside the rule definitions and therefore shift some Sudoku to T&E(1).

This is not how I see it. Resolution rules are not added to T&E. They replace it.
Adding a new rule to a resolution theory can modify the various ratings and/or classifications of some puzzles and yes, some of them can pass from T&E(2) to T&E(1). But the rule doesn't hide anything.

logel wrote:The pattern ronk displays may have a value for solving. I don't share your view that a T&E(2) pattern is generally more complex than a T&E(1) pattern, merely because it contains one or two extra branches.

Where am I supposed to have expressed such a view? The rough T&E(0) vs T&E(1) vs T&E(2) classification applies to puzzles, not to patterns.
You are here using the word "complex" in an undefined sense, probably with respect to the ease of detection by a player - which is very different from the (relative) sense in which I use it: occupying some well defined place in a well defined hierarchy of rules.
A T&E(2) puzzle is more complex than a T&E(1) puzzle because, in my hierarchy, Bp-braids are, in a natural way, logically more complex than braids.

The discussion about ronk's pattern was not about whether it has any value (AFAIK, it appears very rarely, so it should have limited value) but about whether it should be called an sk-loop. As its proof requires a form of reasoning completely different from the proof of the sk-loop - a proof that can't follow any homogeneous chain/loop structure - my answer is negative. Contrary to ronk, I consider that it is NOT at all in the spirit of the sk-loop.
As for the classification of this pattern (or the sk-loop itself) in my hierarchies, I need no xsudo bling graphics; it can be seen, in an obvious way, as a mere gSubset[16] for 16 CSP variables of type rc. The problem with this approach is that if one wants to consider it as a gSubset[16] instead of some kind of chain structure (e.g. an x2y2-belt), thus losing the chain "spirit" of Steve's initial rule, then, to be consistent, one should also accept all the smaller gSubsets. Considering the mess that gSubsets[4] or [5] are (see the "Ultimate Fish" thread, where they appear as Franken and Mutant Fish and their variants - with nothing at all that could legitimately be called "ultimate"), I wish good luck to anyone trying to classify these beasts. [The case would be still much worse for those who'd use the xsudo approach with its non-o-rank possibilities.]

To show why a pattern can't be classified itself as T&E(1) or T&E(2), consider the simpler case of ordinary (Naked + Hidden + SuperHidden) Subsets: they can appear in any puzzle, the simplest or the most complex ones. (What's interesting here is that they seldom change the W complexity of a puzzle.)

logel wrote:The braid hierarchy B(n)B is fine for defining sets of solvable Sudoku, but questionable as a complexity scale.

Once more, "complexity" has no predefined meaning. The BpB hierarchy is closely related to any intuitive notion of complexity because, on average, computation times grow exponentially with p. But this argument is only an a posteriori justification.
The BpB hierarchy makes it easy to see how the addition of a given rule can change the remaining B?B complexity of a given puzzle.
In ronk's case, it shows that it doesn't change the B?B complexity; it'd be interesting to have more examples of this pattern and to see its impact on them.
Moreover, if we had an unbiased collection of hard - in a broad sense, i.e. T&E(2) - puzzles, we could even compute how, on average, a given rule, such as the sk-loop, changes their B?B complexity.
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris

### Re: Pattern-based classification of (hard) puzzles

denis_berthier wrote:
logel wrote:The current rating is, in the first place, a rating of eliminations.

Yes; a rating of the hardest elimination (with respect to a given set of rules). As I mentioned previously, nobody has ever been able to propose any rating of full paths.

logel wrote:But each elimination consists only of a series of secondary eliminations on the next recursion level.

No, absolutely not. Each elimination step in all my resolution paths is complete in itself. It includes no "secondary eliminations".

I already mentioned that my aim is to find a way to compare solution paths. This requires rating full paths, and I am very much aware that it's a challenge. You seem to think it's impossible; I have at least some hope. There are some fundamental ideas, somewhat connected with the misleading term "secondary eliminations".
-----
- A sudoku with n givens may or may not have solutions.
- There is no other way to know its solvability status than to attempt solving it.
- Solvable and unsolvable sudokus are indistinguishable before applying solver rules.
- The methods used are exactly the same in both cases.
- The only difference is that an unsolvable sudoku runs into a contradiction and a solvable one does not.
- Any (unique) sudoku is related to a set of derived sudokus with one more given replacing one candidate.
- All these derived sudokus can be solved until a contradiction, if
a) the candidate is finally FALSE, and
b) the rule set is powerful enough.
- The same holds for any later knowledge state.

This shows that ANY elimination is equivalent to a full solution path of one of the derived sudokus. The path consists of singles, pairs, etc., and is structurally no different from any other solution path. Eliminations and full solution paths are made of the same elements. So the rating of eliminations and the rating of solution paths must be computed in a consistent way. This is violated by the current approaches. Eliminations seem to live in a different world.
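logel's equivalence can be shown on a toy constraint network rather than a full Sudoku (variables A, B, C and "must differ" links are made up for illustration): eliminating a candidate is exactly refuting the "derived" puzzle in which that candidate is promoted to a given, using the SAME propagation as for ordinary solving.

```python
def propagate(cands, links):
    """Unit propagation: a decided variable removes its value from all
    linked variables. Returns False if some candidate set is emptied."""
    queue = [v for v, s in cands.items() if len(s) == 1]
    while queue:
        v = queue.pop()
        (val,) = cands[v]
        for u in links.get(v, []):
            if val in cands[u]:
                cands[u] = cands[u] - {val}
                if not cands[u]:
                    return False        # contradiction reached
                if len(cands[u]) == 1:
                    queue.append(u)
    return True

def eliminate_by_refutation(cands, links, var, val):
    """Build the 'derived puzzle': add (var = val) as an extra given and
    run the same solver; a contradiction justifies eliminating val."""
    derived = {v: set(s) for v, s in cands.items()}
    derived[var] = {val}
    return not propagate(derived, links)

# Toy instance: A is given as 1; links force A != B and B != C.
cands = {'A': {1}, 'B': {1, 2}, 'C': {2, 3}}
links = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B']}
```

Here refuting "B = 1" succeeds (A = 1 empties B), so B != 1 is a valid elimination, while "C = 3" survives propagation, being part of the actual solution.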

Eliminations have no life of their own; they just represent an intermediate logical statement (VARxy NOT EQUAL n). Their only purpose is that the logical statement is reused in subsequent calculations to make them less redundant. You can of course verify a variable straight away without stopping at eliminations, but this leads to redundant sub-paths. So it's all about redundancy.

At this point it becomes tricky, and I was unable to find a rating concept that works. The main reason is that there are more intermediate logical statements than eliminations that get reused somehow. More thought required.

denis_berthier wrote:
logel wrote: Any other rule set just hides recursion steps inside the rule definitions and therefore shift some Sudoku to T&E(1).

This is not how I see it. Resolution rules are not added to T&E. They replace it.
Adding a new rule to a resolution theory can modify the various ratings and/or classifications of some puzzles and yes, some of them can pass from T&E(2) to T&E(1). But the rule doesn't hide anything.

Uhh, this needs more detailed explanations, but I am running out of time...

I will be on a boat trip to Sweden for the next few weeks. So if you'd like to continue discussing what is already drifting away from your thread's intention, it has to wait.

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

logel wrote:- Any (unique) sudoku is related to a set of derived sudokus with one more given replacing one candidate.
- All these derived sudokus can be solved until a contradiction, if
a) the candidate is finally FALSE, and
b) the rule set is powerful enough.
- The same holds for any later knowledge state.

Your condition b on the set of resolution rules makes your assertions tautological.

logel wrote:This shows that ANY elimination is equivalent to a full solution path of one of the derived sudokus. The path consists of singles, pairs, etc., and is structurally no different from any other solution path.

Any particular resolution path using the set of rules R for a "derived" sudoku AND leading to a contradiction is equivalent to a generalised braid, an R-braid, eliminating a candidate (provided that R has the confluence property).
But this needs to be proven, e.g. as I've done in my latest book, "Constraint Resolution Theories".

logel wrote:Eliminations and full solution paths are made of the same elements.

This is your main error, and it explains why you get blocked: if the solution paths of the derived sudokus use the rules in R, then the corresponding eliminations in the original sudoku have to rely on R-braids, i.e. on much more complex patterns.

logel wrote:I will be on a boat trip to Sweden the next weeks.

Have a nice trip
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris

### Re: Pattern-based classification of (hard) puzzles

Before answering your last reply (in another post), I want to come back to the original theme of this thread. During sailing there is plenty of time to think, so some new ideas were spinning through my head.
The braid rule system is very powerful, so I suspect there are not many other equally powerful rule sets to compare it with.

The following defines pattern classes that yield a rule set conforming to your requirements.

- Any elimination can be seen as a set of candidates divided into two subsets. One subset contains at least two candidates that are directly linked to one or more targets, where at least one candidate of this subset is true for all valid permutations. The other subset is the remaining candidates that form the connection logic. Note: this excludes patterns that rely on uniqueness; I regard them as incorrect and unusable anyway.
- You can assign a number T to each elimination set, where T is the minimal number of baselines that together completely cover the set. The baselines may or may not overlap. A baseline consists of all remaining candidates of a row/col/box/cell constraint.
- The elimination set should be minimal, i.e. all candidates of the set have to be necessary parts.
- The pattern class PT(n) consists of all elimination sets with T = n.
- The combined pattern classes PT(n) with n <= N form a rule set that is:
purely logical;
hierarchical, with N = 0, 1, 2, 3, ...;
invariant under symmetry;
obviously confluent, because reduction of candidates never increases T.
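The number T defined above is a minimum set cover, which can be sketched by brute force. The candidate labels and baseline sets below are purely hypothetical stand-ins for the row/col/box/cell baselines of a real grid.

```python
from itertools import combinations

def min_cover(elim_set, baselines):
    """Smallest number T of baselines whose union covers the
    elimination set; baselines may overlap (brute-force search)."""
    useful = [b for b in baselines if b & elim_set]  # drop useless lines
    for t in range(1, len(useful) + 1):
        for combo in combinations(useful, t):
            if elim_set <= set().union(*combo):
                return t
    return None  # not coverable by the given baselines

# Hypothetical baselines and elimination set, just to exercise min_cover.
baselines = [{"a", "b"}, {"c", "d"}, {"a", "c"}, {"b", "d", "e"}]
elim = {"a", "b", "c", "d"}
```

For this toy data, {a,b} and {c,d} together cover the set, so T = 2. A real implementation would need something far cleverer than this exponential search.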

From analysing some hard sudokus, I have hope that all known sudokus can be solved with N = 4 on the first iteration level.
The idea of finding a rule set that needs only one iteration level is challenging and in some way an extension of your thread's theme.
For now I know of no solver that guarantees to find all patterns of the above defined classes. An implementation cannot be done in a day and is probably a project for the next winter. PT(3) is covered by traditional methods, PT(4) is manageable, PT(>4) hard if necessary.

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

logel wrote:The braid rule system is very powerful, so I suspect there are not many other equally powerful rule sets to compare it with.

And the associated whips are almost as powerful, in spite of their much nicer (continuous chain) structure. This is for me the main result.

logel wrote:The following defines pattern classes that yield a rule set conforming to your requirements.
- Any elimination can be seen as a set of candidates divided into two subsets. One subset contains at least two candidates that are directly linked to one or more targets, where at least one candidate of this subset is true for all valid permutations. The other subset is the remaining candidates that form the connection logic. Note: this excludes patterns that rely on uniqueness; I regard them as incorrect and unusable anyway.
- You can assign a number T to each elimination set, where T is the minimal number of baselines that together completely cover the set. The baselines may or may not overlap. A baseline consists of all remaining candidates of a row/col/box/cell constraint.
- The elimination set should be minimal, i.e. all candidates of the set have to be necessary parts.
- The pattern class PT(n) consists of all elimination sets with T = n.

Your description is not very clear, but if I understand correctly what you have in mind, it can be interpreted in two ways, described in respective chapters 8 and 10 of Constraint Resolution Theories: as Subsets[n] or as gSubsets[n]. And it is obvious that both theories have the confluence property.
Subsets have a very weak resolution power, so let us forget them.
gSubsets have a stronger resolution power, but they face a very big problem: complexity. Moreover, a gSubset[2] can always be replaced by a g-whip[2], but the converse is false (so, even for small n, gSubsets don't seem to be a good deal compared with whips or g-whips).

If you had anything else in mind, please give more detail and forget the following.

There are two ways of dealing with gSubsets in Sudoku:

- the generic way: consider all the combinations of n different CSP variables (or 2D cells, if you prefer this terminology) [C(324, n) possibilities] and all the combinations of n different g-transversal sets (or transversal covers, if you prefer) [multiply by C(324, n) further possibilities]. This number can be somewhat reduced by excluding a few obviously incompatible pairs of combinations, but it remains huge, even for n = 4. So huge that I gave up the idea of implementing it in SudoRules.
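The order of magnitude quoted above is easy to reproduce: the raw count is C(324, n) squared, before any pruning of incompatible pairs.

```python
from math import comb

def gsubset_combinations(n: int) -> int:
    """Raw count of generic gSubset[n] instances: choose n of the 324
    CSP variables and, independently, n of the 324 transversal sets."""
    return comb(324, n) ** 2

print(gsubset_combinations(4))  # 450710001**2, roughly 2.03e17
```

Even generous pruning leaves far too many cases for exhaustive enumeration at n = 4, which is the point being made.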

- the Fish way: try to classify all the possible cases of generalised Fish (because that is what a gSubset is) in order to maximally reduce the above number of combinations. Then you must forget no case. The concrete problem is that tens of people have been playing with this for years in the Ultimate Fish threads, and I've not yet seen anything "ultimate" come out of it.

logel wrote:From analysing some hard sudoku I have hope that all known sudoku will be solved with N = 4 on the first iteration level

If by "on the first iteration level" you mean without T&E, I think this is hopeless with gSubsets. As always in Sudoku, the problem is that something seems true in "almost all cases" and we make conjectures about it, but eventually a rare or extremely rare case appears and refutes them.

logel wrote:For now I know of no solver that guarantees to find all patterns of the above defined classes.

The best way I know of testing your conjecture is with Hodoku. AFAIK, it is said to implement all the known Fish rules. By de-activating everything else (except basic interactions), you should be able to find counter-examples.

logel wrote:PT(4) is managable, PT(>4) hard if necessary.

If you succeed in doing this with reasonable computation times for n = 4 or n = 5, even if gS5 is not enough to solve all the puzzles, I think it will be a noticeable achievement. My above estimation of complexity is just a rough one; you should try to improve it after reviewing more precisely the combinations you really need to take into account.
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris

### Re: Pattern-based classification of (hard) puzzles

denis_berthier wrote:
logel wrote:- Any (unique) sudoku is related to a set of derived sudokus with one more given replacing one candidate.
- All these derived sudokus can be solved until a contradiction, if
a) the candidate is finally FALSE, and
b) the rule set is powerful enough.
- The same holds for any later knowledge state.

Your condition b on the set of resolution rules makes your assertions tautological.

I was not precise enough: the rule set must be powerful enough without using further recursion. This is theoretically also possible for the original sudoku with a rule set like PT(n) for n large enough (see previous post). You could even ignore intermediate elimination steps. This is of course not practical, because the solutions would be extremely complex. It only shows that solving alone is not the point; what matters is solving in a "simple" way.
The first sudoku addicts were naive in suggesting that a handful of static "methods" would solve any sudoku. Step by step these methods were extended to a seafood collection with no chance of being complete. The next step was the invention of patterns of variable size, simple chains and finally your braids. But this introduces another dimension of simplicity: that of pattern size and of construction principle. Most of the annoying and destructive forum discussions stem from the lack of a definition of someone's simplicity concept.
My choice is clearly the simplicity of the solution (i.e. the complete solution path), measured in uniform basic units calculated from the result only. This rating should be completely independent of any methods or rules; those belong to the finding process/algorithm, which is a different thing. Your braids rely on a simple construction principle, but this does not automatically lead to a simplicity definition for solution paths. I do not criticize that; it's just another road.

denis_berthier wrote:
logel wrote:Eliminations and full solution paths are made of the same elements.

This is your main error, and it explains why you get blocked: if the solution paths of the derived sudokus use the rules in R, then the corresponding eliminations in the original sudoku have to rely on R-braids, i.e. on much more complex patterns.

NO. I did not mean that the elements are the same for a particular sudoku instance. They are - in principle - of the same kind. Solving any uniquely solvable sudoku and any unsolvable sudoku is done - in principle - with the same tools, i.e. methods/rules. So the rating of a "derived sudoku" (equivalent to a braid) and the rating of full solution paths must be calculated - in principle - by the same algorithm. Anything else would be inconsistent. This is a key consideration.

I got stuck at a different point. The main idea is to compare solutions (of the same sudoku) by the accumulated effort of all eliminations on the solution path. The solution with the least effort wins and defines the rating. The exact minimum is probably very hard to find in most cases.

It's relatively easy to find cost functions that work for logic networks that eliminate exactly one candidate. For example, you can take a braid (singles only) and count the number of link interactions or logical implications. Then take the next one, and so on.
This is fine as long as all the logical networks are independent of each other. But if the elimination logic for some candidates overlaps (e.g. see ronk's example), the effort calculation for the combined eliminations invites trouble, because each elimination reuses the logic, but each in a different way. And even worse, later eliminations may reuse the network partially. There are other examples with even more problems. The first impulse is to set the reuse cost to zero, but this leads to paradoxical results, because smaller and smaller logical fragments are reused again and again. Therefore reuse cost must somehow be a parameter of the cost function, but there is no natural way to do that, at least I can't see one. Reuse is related to memorizing specific intermediate logical states, but there is no way to express memory usage in units of link interactions or logical implications.

So I was unable to find a cost function for eliminations AND an addition rule that takes account of the redundancy of overlaps and/or reuse.
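The dilemma can be made concrete with a toy model in which each elimination is reduced to the set of implication links its justifying network uses (the cell names are made up for illustration): billing a shared link twice overstates the effort, while billing it once prices reuse at zero, which the text above argues leads to paradoxes.

```python
# Two eliminations, modelled only by the sets of links they use;
# both rely on the same initial link ("r1c1", "r1c5").
net_a = {("r1c1", "r1c5"), ("r1c5", "r3c5"), ("r3c5", "t1")}
net_b = {("r1c1", "r1c5"), ("r1c5", "r7c5"), ("r7c5", "t2")}

cost_separate = len(net_a) + len(net_b)  # shared link billed twice
cost_shared = len(net_a | net_b)         # shared link billed once (reuse free)

print(cost_separate, cost_shared)
```

Neither extreme is satisfactory; a workable cost function would need a reuse parameter somewhere between the two, which is exactly the open problem described.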

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

Hi logel

OK, you're still trying to find a rating of the global path. But then I can only repeat what I said some time ago: I have absolutely no idea how to tackle this.

How would you rate puzzles that can be solved by Singles and ECP? They require lots of basic rule applications, all of the same basic complexity, but not the same number for different puzzles. And, for each of these puzzles, different paths may have very different lengths.
So, probably, your first step should be to find the shortest path in this case.

One more thing is unclear in what you write: what do you want to rate, the path itself or the algorithm that exhibits it?
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris

### Re: Pattern-based classification of (hard) puzzles

denis_berthier wrote:Your description is not very clear, but if I understand ....... as Subsets[n] or as gSubsets[n]. And it is obvious that both theories have the confluence property.

I'll try again. What I explained is nothing new, but just a special view of each and every elimination. Every elimination corresponds to a set of candidates where the elimination target (not contained in the set) is in direct link contradiction for all valid true-false permutations of the candidate values. If the elimination set is minimal (i.e. all members are necessary), it is called a pattern, and there is a number T denoting the maximum number of true nodes in the pattern. You could also say that T is the pattern's capacity for true nodes. The pattern class PT(n) consists of all patterns with T < n. PT(n) has confluence.

It is simply ALL possible eliminations; there can be no different kind of subsets.
Elimination sets also work if they are not minimal, but Sudoku rules and methods should always generate minimal patterns.

denis_berthier wrote:- the generic way: consider all the combinations of n different CSP variables (or 2D cells if you prefer this terminology) [C(324, n) possibilities] and all the combinations of n different g-transversal sets (or transversal covers if you prefer) [multiply by C(324, n) other possibilities]. This number can somehow be reduced by excluding a few obviously incompatible pairs of combinations, but it remains huge, even for n = 4. So huge that I gave up the idea of implementing it in SudoRules.

I was stupid enough to try something like this some years ago. All bottom-up approaches lead into a complexity jungle. The "Ultimate Fish" dictionary is not extensible to T = 4, 5, 6, ... and is hard to prove complete.

denis_berthier wrote:If by "on the first iteration level" you mean without T&E, I think this is hopeless with gSubsets. As always in Sudoku, the problem is, something seems to be true in "almost all the cases" and we make conjectures about it, but there will eventually appear a rare or extremely rare case that will refute them.

"First iteration level" means T&E(1), of course. Or: all "derived sudokus" (see above) can be checked unsolvable using PT(4) patterns only. I agree with your statement about conjectures, so I avoided this term. There is no way to prove such conjectures other than trying all possible sudokus, which is unsatisfying. So it's just an observation.

The implementation idea is different. It is possible to produce PT(1), PT(2), PT(3) completely. There is no need to generate and try all eliminations of PT(n) if one only wants to know the solvability without getting a resulting solution path. This is sufficient for classification. As mentioned earlier, any pattern besides singles can be seen as a substitution for a recursion level. Or the other way round: instead of using a pattern, you could use a fraction of the next recursion(s).
So, after finishing T&E(1) with PT(<=3) and no result, I intend to use a modified T&E(2) that restricts the iterative loop at n = 4 and also walks into all possible branches of the tree. There are some PT(4) patterns that cannot be produced by combining PT(<=3) patterns, but these are only the full jellyfish patterns - no big deal to add them. This looks more promising.
This process will not deliver any solution path, but it ensures that a path with PT(4) exists. If PT(4) is not sufficient, the same plan works with higher n too.
Will take some time ........

denis_berthier wrote:OK, you're still trying to find a rating of the global path.

I am not willing to accept that there is no meaningful way to compare solution paths.
I mean "the path itself", using only properties of the full path. I regard rating the algorithms as even more hopeless.

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

Hi logel,

You are still evading the simplest question: how would you rate the full path of puzzles that can be solved by Singles and ECP?
If you can't answer this, all the rest is hopeless.
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris

### Re: Pattern-based classification of (hard) puzzles

denis_berthier wrote:Hi logel,

You are still evading the simplest question: how would you rate the full path of puzzles that can be solved by Singles and ECP?
If you can't answer this, all the rest is hopeless.

Evading??? Your little provocation activated some energy to reconsider all my half-baked ideas about full path rating. It was clear from the beginning that a radically different view of sudoku is required for such an attempt. I got stuck trying to rate the path by combining elimination ratings along the path. So I scrapped that and started anew.

--------- Current idea:

I named it the temperature model of sudoku. It works for all sudoku solutions based on constraint propagation.

Start with the common view of a sudoku as having 9*9*9 candidates. Some of them are the given numbers; all the rest need more or less logic to prove them true or false, provided the sudoku is unique.
Each candidate is associated with a value, defined as the minimal number of constraints forming a network that proves the candidate true or false. Although this number is not directly determinable, it always exists and has a clear semantic meaning. Note that each candidate is treated separately, without caring about common parts of the logical networks. Candidates that are easy elimination targets get a relatively low number, and others further down the elimination path get higher numbers.

The next step is to arithmetically add all candidate values. The resulting sum is called the absolute temperature of the sudoku. You can always easily calculate an upper bound on the absolute temperature from a full solution path by tracing back the tree of elimination triggers. This way you get the relative temperature of a specific solution path, which is always higher than the absolute temperature.

With the same method you can calculate a (relative) temperature of any knowledge state along the solution path. The only difference is that you first set all already-eliminated candidates to the value zero. The idea is to forget the history of the state. Of course there is also an (unknown) absolute temperature of the knowledge state. It is easy to see that the temperature must decrease with each elimination. The finally solved sudoku has temperature zero. It now becomes clear that the notion of a cooling process motivated the name.

The temperature model has three major benefits:
1) The absolute temperature reflects an aspect of the complexity of the sudoku and serves as a rating value to compare different sudokus. It should (hopefully) somehow match the subjective rating of difficulty.
2) The relative temperature reflects the quality of the solution and compares solution paths of the same sudoku. Any lower relative temperature makes the solution "better".
3) The temperatures of the knowledge states are used to order the eliminations. The optimal order is by decreasing temperature deltas: the cooling should proceed as quickly as possible. Note that this order of eliminations is totally disconnected from any rule hierarchy and reflects only the maximal effect of each elimination.
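The back-tracing of elimination triggers mentioned above can be sketched, assuming the solver records, for each elimination, which earlier results triggered it. The trigger map and candidate names below are hypothetical; per-candidate networks are valued by their size, and sharing between networks is deliberately ignored, as the model prescribes.

```python
def support(cand, trigger, memo):
    """All earlier results in the back-traced elimination tree of cand."""
    if cand not in memo:
        deps = set()
        for p in trigger.get(cand, ()):
            deps |= {p} | support(p, trigger, memo)
        memo[cand] = deps
    return memo[cand]

def relative_temperature(trigger):
    """Sum of per-candidate network sizes over all recorded candidates;
    an upper bound on the (unknown) absolute temperature."""
    memo = {}
    return sum(len(support(c, trigger, memo)) for c in trigger)

# Hypothetical trace: c1 is immediate, c2 needed c1, c3 needed c1 and c2.
trigger = {"c1": [], "c2": ["c1"], "c3": ["c1", "c2"]}
```

For this toy trace the candidate values are 0, 1 and 2, giving a relative temperature of 3; a better path (smaller trigger trees) would give a lower sum.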

Caveat:
This is currently only a paper concept, though a promising one. There are no results yet, and no comparisons with other ratings.
Another problem is that full solutions are rarely published, and pulling solution paths out of existing solvers is laborious.
I will implement the model when I find time.

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

logel wrote:Each candidate is associated with a value, defined as the minimal number of constraints forming a network that proves the candidate true or false. Although this number is not directly determinable, it always exists and has a clear semantic meaning.

Sorry, I did not follow this thread, but what do you mean here? Is there an indirectly calculable number determining how hard it is to eliminate a candidate, based on contradiction-net constraints?
If so, I guess the definitions must be very restrictive - much more so than the experts would like, who have been trying to classify the hardest puzzles for years now.
eleven

Posts: 1816
Joined: 10 February 2008

### Re: Pattern-based classification of (hard) puzzles

logel wrote:
denis_berthier wrote:Hi logel,
You are still evading the simplest question: how would you rate the full path of puzzles that can be solved by Singles and ECP?
If you can't answer this, all the rest is hopeless.

Evading??? Your little provocation activated some energy to reconsider all my half-baked ideas about full path rating. It was clear from the beginning that a radically different view of sudoku is required for such an attempt. I got stuck trying to rate the path by combining elimination ratings along the path. So I scrapped that and started anew.

OK, but your new ideas will face the same problem, and my provocation still holds.
Note that I appreciate that you are trying to get off the beaten path; most of the discussions on forums are so boring. But, generally speaking, when introducing new ideas, they should first be tested on the simplest cases.

logel wrote:... the temperature model of sudoku. It works for all sudoku solutions based on constraint propagation.

There are many ways to use constraint propagation. Some are pattern-based (as in most of what is discussed on this forum), some are not (DFS, BFS, T&E, colouring, tagging, dancing links, general purpose CSP-solvers, ...)

logel wrote:Start with the common view of a sudoku as having 9*9*9 candidates. Some of them are the given numbers; all the rest need more or less logic to prove them true or false, provided the sudoku is unique.
Each candidate is associated with a value, defined as the minimal number of constraints forming a network that proves the candidate true or false. Although this number is not directly determinable, it always exists and has a clear semantic meaning. Note that each candidate is treated separately, without caring about common parts of the logical networks. Candidates that are easy elimination targets get a relatively low number, and others further down the elimination path get higher numbers.

I have no formal proof (which would anyway be difficult with your informal, vague definitions), but I can try to make you understand why this premise is false. Your error may be due to a bad interpretation of the confluence property (you spoke of this property in a previous mail). In my view, this property is so important that I've spent many pages in my last book ("Constraint Resolution Theories") to prove it in detail for all the generalised braid theories I've introduced. But it should not be interpreted beyond what it really means.

What the confluence property says, when a resolution theory T has it, is: given a puzzle P, in any resolution path for P within this theory, the hardest step for P will always have the same T-rating. This is essential, as it allows one to find the rating and the simplest solution of P (according to this T-rating) after a single path has been generated within T, using the simplest-first strategy.
What it does NOT say is: the T-rating of P can be assigned to some well defined candidate. Indeed, in different resolution paths, the hardest step may be due to different candidates. This is not an abstract view; I've observed it in many cases.
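For readers unfamiliar with the property, here is a toy sketch (mine, unrelated to braid theories, and far below Sudoku scale) of the order-independence aspect that confluence guarantees: a one-row elimination system whose fixpoint does not depend on the order in which cells are processed. It illustrates only that aspect, not the rating statement above.

```python
import random

def singles_fixpoint(row, order_seed):
    """One-row toy CSP: each cell holds a candidate set; a singleton
    cell eliminates its digit from every other cell. Cells are
    processed in a seed-dependent random order, to demonstrate that
    the resulting fixpoint is the same for every order."""
    row = [set(s) for s in row]
    rng = random.Random(order_seed)
    changed = True
    while changed:
        changed = False
        idx = list(range(len(row)))
        rng.shuffle(idx)                      # arbitrary processing order
        for i in idx:
            if len(row[i]) == 1:
                d = next(iter(row[i]))
                for j in idx:
                    if j != i and d in row[j]:
                        row[j].discard(d)     # forced elimination
                        changed = True
    return row
```

Running this with different seeds on the same input always yields the same candidate sets, which is exactly what fails for ad hoc rule sets without the confluence property.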

So, even if you consider the universal BB (or B7B) rating, which is undoubtedly a consistent interpretation of the "minimal number of constraints" necessary for the hardest step, you can't use it to assign a rating to each candidate.

The problem with your tentative definition is that some candidates must be eliminated before others. At the start, there is no "network" attached to each candidate proving that it is true or false independently of the presence of other candidates.

Now, you can try to replace braid theories with Subset or g-Subset theories; I have shown that they also have the confluence property. But you will face the same problem, plus a problem of greater complexity (much higher rating) for most of the puzzles.

logel wrote:Caveat:
...Another problem is that full solutions are rarely published and pulling solution paths out of existing solvers is laborious.

There is a very clear reason for this: most of what is now published displays gorgeous graphics for the steps the authors want to show and carefully hides the hideous steps.
As for me, I always publish the full resolution paths. You may use those I've posted on this forum.
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris

### Re: Pattern-based classification of (hard) puzzles

denis_berthier wrote:I have no formal proof (which would anyway be difficult with your informal, vague definitions), but I can try to make you understand why this premise is false. Your error may be due to a bad interpretation of the confluence property (you spoke of this property in a previous mail).

There is nothing vague about the definition. A false candidate is always in contradiction with the full set of constraints - that is the worst case. But usually there are many smaller constraint sets proving the candidate false, and consequently there is a minimal one. The same applies to true candidates. This investigation treats each candidate separately, so it's irrelevant that some candidates may be eliminated before others. The minimal network always includes the accumulated dependencies back to the knowledge state you want to take as the starting point.

I know perfectly well what confluence means, but in this context it has no place. Confluence is a property of rule sets, and each rule set generates a subset of all possible elimination patterns. This subset is always smaller than the set of all possible elimination patterns. Your braid rule sets generate very large elimination pattern sets, but they are nevertheless incomplete. A real full path rating must accept any elimination pattern. There is no unique back reference from an elimination pattern to the rules it stems from. I intend to use only properties that belong to the actual patterns of the solution path. Rules belong to the finding or classification process and live in their own dimension.

The tentative character of my definition is indeed the weak point. It was a shot in the dark. So I don't want to speculate any further until I can present definite results.
From what I know now, the above definition is too simple and leads to disadvantages I don't like, but I am confident that the main direction is OK.

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

This winter I found some time for sudoku programming. So I implemented some ideas that were on my desk for months already.

I can contribute a classification (NOT a rating) to this thread, complying with the terms given on the first page.

It's meant for very hard puzzles only. The majority of randomly generated puzzles will fall into the first two classes defined below.

Motivation:
Starting from T&E(S), T&E(BI), T&E(S+BI+Pair), ... one can see that there are lots of hard puzzles that are not resolved at level 1, but need T&E(S) at level 2. For all puzzles I know, T&E(S) seems to be sufficient at level 2.
To prevent misunderstanding: direct application of the rule set defines T&E level 0, and at level 1 one extra candidate is assumed to be true.
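The level scheme just described can be sketched as a recursive procedure. The following is entirely my reconstruction of the generic T&E(n) loop on a toy "all cells pairwise different" CSP, not logel's actual solver: level 0 is plain propagation of the base rules, and level n assumes one candidate true and tries to refute it at level n-1.

```python
def propagate(doms):
    """Toy base rule set: a singleton cell eliminates its digit from
    all other cells (all cells pairwise distinct). Returns False on
    contradiction (some cell loses all its candidates)."""
    changed = True
    while changed:
        changed = False
        for i, d in enumerate(doms):
            if len(d) == 1:
                v = next(iter(d))
                for j, e in enumerate(doms):
                    if j != i and v in e:
                        e.discard(v)
                        changed = True
    return all(doms)

def t_and_e(doms, level):
    """T&E(level): level 0 is direct rule application; at level n,
    each candidate in turn is assumed true and refuted at level n-1;
    refuted candidates are eliminated."""
    if not propagate(doms):
        return False                       # contradiction at level 0
    if level == 0:
        return True
    for i, d in enumerate(doms):
        for v in sorted(d):
            trial = [set(e) for e in doms]
            trial[i] = {v}                 # assume candidate true
            if not t_and_e(trial, level - 1):
                doms[i].discard(v)         # refuted: eliminate it
                if not propagate(doms):
                    return False
    return True
```

On the pigeonhole instance below, plain propagation finds nothing, but T&E(1) proves the third cell must be 3.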

T&E at level 2 in particular is very hard to interpret in terms of CSP variables. This is mainly because level 2 uses the knowledge states from level 1, but there are no specific connections to the eliminations done at level 1.
Observing the control flow at level 2, one sees lots of redundant calculations until a useful continuation is found. This also results in rapidly increasing computational effort.

So I had the idea to power up T&E level 1 until all known puzzles are solvable at level 1. This was finally successful.
You made some comment that constructing gSubsets[n] leads to madness - and I agree. But the idea of using only gSubsets that are restricted to planes finally produced a new solver implementing these sets. A plane can be a number, row, column or box with one fixed coordinate. Planes or plane sets can be checked as a block for eliminations "relatively" easily by contradiction.

The data basis I used is the "collection of hardest puzzles" assembled by champagne. The current list, dated 10/2012, contains 56448 puzzles with an SER rating of 10.5 or more.

Results:
Class T&E(S): no solutions

Class T&E(S+BI): no solutions, so the attribute "hard" is justified for all puzzles in the list

Class T&E(S+BI+Pair+T5): solving 17159, remaining 39289
T5 is defined as: all Patterns with exactly 5 CSP variables
Because of the basic construction of sudoku, T5 is always a single number pattern with two rows, two columns and a box link.

Class T&E(S+BI+P1[NBRC]): solving all but 10623 puzzles
P1 is defined by all patterns restricted to a single plane of the sudoku grid.
P1[N1] e.g. is the plane of numbers equal 1 containing all fish patterns with number 1
P1[N] denotes all fish patterns of all single number planes ( triples, quads, and all the sea food )
P1[Bx] consequently denotes all patterns restricted to a single box, P1[R], P1[C] likewise for rows and columns

Although this class is very powerful, it's clearly not enough.
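For concreteness, here is one possible reading of the "plane" notion (my sketch; the function name and encoding are mine): in the 9x9x9 candidate cube with coordinates (row, column, number), a plane fixes a number, a row, a column, or a box, giving 4 x 9 = 36 planes of 81 candidates each.

```python
def plane(kind, idx):
    """All (row, col, number) candidates in one plane (idx in 0..8).
    kind 'N' fixes the number, 'R' the row, 'C' the column,
    'B' the box."""
    cube = [(r, c, n) for r in range(9) for c in range(9) for n in range(9)]
    if kind == 'N':
        return {t for t in cube if t[2] == idx}
    if kind == 'R':
        return {t for t in cube if t[0] == idx}
    if kind == 'C':
        return {t for t in cube if t[1] == idx}
    if kind == 'B':                        # box index: 3*(r//3) + c//3
        return {t for t in cube if 3 * (t[0] // 3) + t[1] // 3 == idx}
    raise ValueError(kind)
```

Note that planes of different kinds overlap (e.g. a number plane and a row plane share 9 candidates), which is why restricting a class to non-overlapping plane sets is a real restriction.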

Class T&E(S+BI+P1[NBRC]+P2[NBRC]) : remaining 317 unsolved puzzles, mainly residing in the top 3000 of the list
P2[1,2] e.g. contains all patterns restricted to the combined planes P[1] and P[2].
P2[N] denotes all multi-fish patterns in all the 36 pairings of two number planes.
That means the patterns of this class all work in pairs of CSP planes that do not overlap.

Some experiments with overlapping planes (e.g. SingleNumber + Row) showed little or no effect, in contrast to the huge computational effort. So I dropped these.

Of course I hoped to solve all of them with this class, but reality is stronger.

Class T&E(like 4 + P3[N])
P3[N] contains all patterns living in one of the 84 number plane triples.
Note that P2 and P3 patterns can not only eliminate candidates inside the defining planes, but also in the CSP lines that connect the planes.

The construction of the remaining P3[BRC] plane sets was only needed for 5 puzzles.
There are more choices of non-overlapping plane triples.
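The plane counts quoted above (36 pairings of two number planes, 84 number plane triples) are plain binomial coefficients; a two-line check:

```python
from math import comb

# choose 2 (resp. 3) of the 9 number planes, order irrelevant
pairings_of_two_planes = comb(9, 2)
number_plane_triples = comb(9, 3)
```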

Conclusion: The level 1 attack is finally successful, but pays its price. No magic for hard puzzles.
There is no specific reason I did not define more classes in between the chosen definitions. With P3 especially, there are many more options.
I only wanted to see, if this approach is possible and computation can be done within reasonable time.

Each choice of plane sets can define a class with confluent eliminations. So the final result is always independent of the elimination order.

Computing resources needed for solving one puzzle (2 GHz 32-bit CPU, single thread):
class T&E(S+BI+P1[NBRC]): ~ 200 msec
class T&E(S+BI+P1[NBRC]+P2[NBRC]): ~ 5-8 sec
class T&E(like 4 + P3[N]): ~ 25 sec, memory < 1 GB
There is still room for optimization and more details to work out.

logel

Posts: 55
Joined: 29 April 2012
Location: Germany

### Re: Pattern-based classification of (hard) puzzles

Hi logel,

As I have checked recently, my old T&E(2) conjecture [i.e. all the puzzles are in T&E(0), T&E(1) or T&E(2)] remains valid for all the new puzzles recently found. Moreover, my stronger conjecture that all the puzzles are in B7B also remains valid. This provides a possible answer to your question of finding X such that all the puzzles are in T&E(X), i.e. can be solved by only one level of T&E: X = B7; but, of course, this leaves open the possibility of finding other X's.

T&E(0) puzzles are obvious ones.
T&E(1) ones are completely classified by the B rating.

As for the very small remaining T&E(2) proportion (about 1 in 50,000,000), there are many possible pure-logic and symmetry-respecting sub-classifications based on resolution theories with the confluence property:

- the SpB classification, depending on the maximum size p of subsets allowed as right-linking objects; as I showed long ago, some puzzles fall outside this classification; eleven generated a full set of such cases (several millions), a small part of which he made available; whether replacing Subsets by gSubsets would lead to a full coverage of all the puzzles (and with which maximum size of gSubsets) remains an open question;

- the BpB classification, depending on the maximum length p of braids allowed as right-linking objects; as mentioned above, all the known puzzles are in at most B7B; this is equivalent to being in T&E(B7);

- another possible classification could rely on the B*p-braids introduced in PBCS;

- one could also use the maximum size of bi-braids needed to have a B*-braid[1] solution.

The main problem with sub-classifications of T&E(2) is deciding what exactly we are looking for. That's why I introduced in PBCS the notion of a pattern of proof. From your post, I conclude you are looking for a proof of the solution based on X-braids, where X is one of the classes of patterns you define. So, let me first make a few remarks and ask a few questions to clarify some of your definitions.

First of all, the expression "all patterns" and its variants ("all patterns with 5 CSP variables", "all patterns restricted to a single plane [or number]") are not well defined. When you write:
logel wrote: P1 is defined by all patterns restricted to a single plane of the sudoku grid.
P1[N1] e.g. is the plane of numbers equal 1 containing all fish patterns with number 1
this is somehow contradictory. We don't know whether we should read the first or the second line, i.e. whether P1[N] is the set of standard Fish (2nd line) or the set of g-Fish [Franken and Mutant Fish] (first line).

Secondly, the number of "CSP planes" (of types n, r, c, b) cannot be used to measure the size of a pattern (you don't say so, but it may be implicitly understood in some places). You must count the number of CSP variables (of types rc, rn, cn, bn). Thus, a fish on number 1 doesn't have size 1 but 2, 3 or 4.
This will be important when you need to measure the size of the patterns you use in T&E(X).
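In code, this counting convention might look as follows (a sketch; the helper `fish_size` and the tuple encoding are mine, for illustration only): there are 4 x 81 = 324 CSP variables, and the size of a row-based Fish is the number of base-set rn variables it uses.

```python
from itertools import product

# The four families of CSP variables: rc = cell, rn = row/number,
# cn = column/number, bn = box/number; 81 of each.
csp_variables = [(kind, a, b)
                 for kind in ('rc', 'rn', 'cn', 'bn')
                 for a, b in product(range(9), repeat=2)]

def fish_size(base_rows, number):
    """Size of a row-based Fish on `number`: one rn CSP variable per
    base row, so an X-Wing has size 2, a Swordfish 3, a Jellyfish 4 -
    never size 1, even though the Fish lives in a single number plane."""
    return len({('rn', r, number) for r in base_rows})
```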

logel wrote:Class T&E(S+BI+Pair+T5): solving 17159, remaining 39289
T5 is defined as: all Patterns with exactly 5 CSP variables
Because of the basic construction of sudoku, T5 is always a single number pattern with two rows, two columns and a box link.

Allowing patterns with 1, 2 or 5 but not with 3 or 4 CSP variables seems quite arbitrary.
Moreover, I think "S+BI+Pair+T5" doesn't have the confluence property - and therefore this class is not well defined. When a pattern P built on 5 CSP variables is destroyed by some eliminations before being applied, what remains may have fewer variables; as it is not in your class, it doesn't allow the elimination allowed by P.
However, this doesn't have any impact on the sequel.

logel wrote:Class T&E(S+BI+P1[NBRC]): solving all but 10623 puzzles
P1 is defined by all patterns restricted to a single plane of the sudoku grid.
P1[N1] e.g. is the plane of numbers equal 1 containing all fish patterns with number 1
P1[N] denotes all fish patterns of all single number planes ( triples, quads, and all the sea food )
P1[Bx] consequently denotes all patterns restricted to a single box, P1[R], P1[C] likewise for rows and columns
Although this class is very powerful, it's clearly not enough.

By P1[NBRC], do you mean P1[N]+P1[B]+P1[R]+P1[C] ?
If such is the case, this is merely S=S4, i.e. all the (Naked, Hidden or Super-Hidden) Subsets and your "Class T&E(S+BI+P1[NBRC])" is merely T&E(S). Its elements can be solved by Sp-braids, p <=4.
Or should we adopt a reading with g-Fish? But then, do you stop at size 4 ?

logel wrote:Class T&E(S+BI+P1[NBRC]+P2[NBRC]) : remaining 317 unsolved puzzles, mainly residing in the top 3000 of the list
P2[1,2] e.g. contains all patterns restricted to the combined planes P[1] and P[2].
P2[N] denotes all multi-fish patterns in all the 36 pairings of two number planes.
That means patterns of this class all work in pairs of CSP planes, that do not overlap.

This seems to confirm my above reading: this should be written as T&E(P1[N]+P1[B]+P1[R]+P1[C]+P2[N]+P2[B]+P2[R]+P2[C]).
Can you confirm?
This class should solve all the puzzles classified by Champagne as multi fish. Can you confirm?
Can you say more about the (new ???) multi-row, multi-column and multi-block patterns?
Can you say more about the maximum size of the patterns (e.g. multi-fish) needed to reach this result ("only 317 unsolved")?

logel wrote:Class T&E(like 4 + P3[N])
P3[N] contains all patterns living in one of the 84 number plane triples.
Note that P2 and P3 patterns can not only eliminate candidates inside the defining planes, but also in the CSP lines that connect the planes.
The construction of the remaining P3[BRC] plane sets was only needed for 5 puzzles.

It would be interesting to give these 5 puzzles explicitly.

logel wrote:Conclusion: The level 1 attack is finally successful, but pays its price. No magic for hard puzzles

You seem to have interesting results. It'd be worth clarifying the few points I mentioned above.
One important point for me is the maximum size of the patterns involved in your results (i.e. the number of CSP variables they rely on).
denis_berthier
2010 Supporter

Posts: 1258
Joined: 19 June 2007
Location: Paris
