The New Sudoku Players' Forum

by **logel** » Sat Apr 20, 2013 9:38 pm

denis_berthier wrote:
logel wrote:An Universal Elimination Pattern is a set of candidates along with a set of links connecting the candidates according to basic sudoku constraints, where
(1) exactly one designated candidate is the elimination target ( single target )
(2) the target value is false for all valid permutations of candidate values ( elimination )
(3) all candidates and all links in the set are absolutely necessary for (2) ( minimal set )
(4) there exists no subset that is a separate universal elimination pattern ( no cascading )

(4): subset of what? Of candidates, of links or (more properly) of CSP variables?
(3) and (4) are somehow redundant
(3): "all links" is impossible to guarantee (see the NP example below)

Its certainly not redundant, but I had some problem to find the right wording. Meanwhile I edited a correction (see above).
Without (4) you face the following problem: Assume any two eliminations where the second depends on the result of the first one. Without (4) nothing prevents to unify both and declare the union as elimination of the second target. The condition (3) holds because if you remove any of the candidate, the pattern does not work. I want to exclude this kind of cascaded eliminations and I am fine with any other wording you might propose.

denis_berthier wrote:
logel wrote:In many cases where lines in the set do overlap some of the lines (not their candidates) are dispensable.

This contradicts your point (3) above.

See the example in my reply to champagne. It make more clear what is meant with dispensable lines.

denis_berthier wrote:
logel wrote:This implies: Any (universal) elimination pattern is fully determined by enumerating its base set.
This implies also that the link structure (net or queue) and other properties depend on the base set.
In other words: The base set is defining the elimination, but stop … wait …

This requires the base set to be unique for a particular elimination pattern. That would be really great. But a proof seems not so easy. Ideas welcome.
So I put the uniqueness of the base set of an elimination pattern as a conjecture here.
The conjecture is true for all practical cases, but I have some doubts.

The conjecture is trivially false. Take a Naked Pairs based on two cells in a mini-row. You can consider it both as a Pairs in a row or as Pairs in a block. In the 3rd cell in the mini-row, any NP elimination can be done by any of the two different patterns.

Your example is confusing to me. First I am not familiar with the term mini-row, but this may not be important.
Naked Pairs have always two cells as base lines each containing the two candidates of the associated cell. The universal elimination pattern contains the four candidates plus one dedicated target. This remains true even in the case that the core candidates happen to share the same row and box(block). The base set is unique.
If you meant a hidden pair instead the situation is slightly different. You can name the two lines that make up the base set either row or box line. But a line is defined as set, and the identity of sets is absolutely defined by their members, not by their label. So the base set is also unique. Lines are not bound to CSP variables or two a super-set they are made from. I always suspected that there is a subtle difference between my line and your CSP variable.
If there is any counter example for base set uniqueness is must be very large.
Note that I defined the target to be part of the elimination set. One reason is connected to the base set uniqueness. Any pattern that fits in a Dirichlet matrix (rank-0) has of course two line sets that cover the core after all eliminations are completed. The core network still exists but without target its not a universal elimination any more.

denis_berthier wrote:
logel wrote:I will use the number of lines in the base set as pattern size. Its also equal to the maximal number of true candidates in the pattern core. This is not my favourite size function because it has some disadvantages. But it is easy computable and easy comparable with other results. Discussion of alternatives would overload this post and is saved for later.

See my suggestion for this definition of size in the "Pattern-based classification of (hard) puzzles" thread. This is the definition I use for all "my" patterns: whips, braids, extended whips or braids, Subsets, g-Subsets. So, at least, we have compatibility on this important point.

Yes, bus there is a decisive difference. Lets name the assumed candidate of a level-one sequence start candidate. As my eliminations are candidate sets, I can build the union of all eliminations that appear in that level-one sequence. This large set certainly contains all objects necessary for the elimination of the start candidate, but usually many more. Now its possible to reduce this set to a minimal configuration in a similar way as I reduce planes. The result is a universal elimination pattern with the start candidate as target. I can enumerate the base set and know the size. So the size propagates from eliminations to aggregated eliminations, and the nice part of this is that it works for any size function defined on properties of a universal elimination. The aggregated elimination is of course far from "optimal", but at least I have one.

denis_berthier wrote:My overall conclusion: you should re-formulate your definition with the base set as the primary data.

I will think about this, but its not instantly obvious that it makes anything easier. The base line set is a good descriptor of an elimination, please note that I do not use any link sets like Allan Barker concept. Links are there in the form of common knowledge and used when needed. The view on the pattern as candidate set is also required, because permutations operate on candidate values, not base lines. I am open to any formulation as long as the intention is preserved.

by **eleven** » Sat Apr 20, 2013 9:42 pm

The daily definition

Red Ed wrote:A pattern is a set of holes (eliminated candidates) together with a target (candidate eliminated by those holes).

You eliminate candidates with eliminated candidates ?

logel wrote:But because I see absolutely no reason to modify the problem statement, I reject using uniqueness methods in the solving sequence.

If you like.

I know that other people have a different view, but I never heard a sound reasoning.

Here is my point of view:
Sudoku in first line is a game, which is played with the goal to solve it. The play is unfair, when the puzzle is not unique, because no solution can be found without guessing. When you had spent 5 hours trying to solve a multisolution puzzle, you know, why players insist on the uniqueness of puzzles as part of the "problem statement".
(On the other hand i had fun to solve some puzzles, which were declared to be multisolution puzzles, but this i a different game. )
Since your universal method seems to be non guessing, it is useless for non unique puzzles too, isn't it ?
For a player the goal is neither to analyze the puzzle nor to proof a solution (though both is required), but to find a solution, and in best case an elegant one. For this goal uniqueness methods can be the salt in the soup.

by **Red Ed** » Sat Apr 20, 2013 9:50 pm

eleven wrote:You eliminate candidates with eliminated candidates ?

Yes. Suppose, for example, that there's no 1 in r12c123. Those six holes give rise to six eliminations, namely r3c456789.

by **eleven** » Sat Apr 20, 2013 10:08 pm

ok, we would say, 1 is restricted to r3c123 and this gives the eliminations.
But r3c4 is not an eliminated candidate, so it can't be a hole anyway.
Ah, forget it,i looked at the wrong cell, ...
No, the pattern obviously is not a full swordfish.
I just don't know, what that has to do with logel's concept.

by **eleven** » Sat Apr 20, 2013 10:43 pm

logel,
please give us a better example.
E.g. can you show your first step to solve "Champagne dry" and explain pattern, lines etc. there ?

Code: Select all: 98.7.....7.....6....6.5.....4...5.3...79..5......2...1..85..9......1...4.....3.2.

by **logel** » Sat Apr 20, 2013 10:52 pm

denis_berthier wrote:
logel wrote:Next steps to do:
The level-one sequences need to be reduced to universal elimination patterns on the base level. They contain lots of redundant stuff usually. Only reduced patterns can be sized properly.
More research is needed to understand the relation between level-one and base level pattern size.

Theoretically, P6-braids and the simplest-first strategy will do the job. But P6-braids are very complex animals.

I am well aware about the scary complexity of P6, P7, P8, ...
But whatever you exactly might mean by simplest-first strategy, I am very suspicious about. What counts in the end is the aggregated complexity of a level-one sequence reduced to a minimal elimination.
Therefore I am free to use any Px-animal. The difference in computational effort to use P6 or P9 is meaningless with my approach. That is different to any bottom-up construction plan for pattern.

denis_berthier wrote:If you are consistent in your approach, you should apply your universal pattern without any T&E, just by increasing the size of the base set. Maybe, then, you'll meet the complexity problem.

This is probably not the direction I would like to go, because - as you suppose - it might end in a combinatorial nightmare. I usually like to finish computations within my life time.

My final motivation is to find a reasonable size/complexity function for the aggregated complexity of all elimination pattern of a full solution path, built consistently from its parts. (we discussed that some time ago)
This would assign a quality value to a solution path and make paths comparable. Of course I know that one can never reach the "optimal" solution, only approximations.
By now the ideas are only fragments and do not seem to fit in a complete picture. So I hope for some inspiration through the discussions here.

by **logel** » Sun Apr 21, 2013 12:52 am

denis_berthier wrote:Defining the size of a pattern is one thing. Analysing its relation with complexity is another one.

I know two main broad families of patterns where size can be defined and easily related to complexity:
- whips, braids, g-whips, g-braids, S-whips, ... B-braids: I defined their size as their total length (including the length of the inner patterns, if any);
- Subsets and g-Subsets: their size can trivially be defined as the number of CSP variables they rely on (the cardinal of their "base set").
In the many cases where a set of candidates can be seen as either of these patterns, the two definitions coincide.
The main point is, for both kinds of patterns, checking the validity of any instance in a particular resolution state (PM) is straightforward: they carry their own proof with their definition.
As a result, their complexity is easily related to their size.

In your approach, proving the validity of a pattern once you have written it requires that the reader tries all the truth value assignments for all the candidates in the pattern; this is exponential in the number of candidates. This is a degree of inner complexity that is added to that inherent in size.

I do not agree. Even if the definition relies on permutations, you don't have to check all value assignments. You prove instead that the inverse is false, means that once the target has a value true no valid permutation can exist, because you run into a contradiction. This is of almost similar complexity as checking validity of what you call a g-Subset. The validation effort depends mainly on the number of inevitable branches, the rest grows linear with the number of candidates.

denis_berthier wrote:What I mean is, one can indeed define very general types of patterns. The question is how one can use them.
This brings us back to your introduction.
You were right to start with a discussion of motivations.
But, if having an understandable proof of the solution is the motivation - as it is generally for Sudoku players -, one will not want to use patterns such that each step of the proof (each elimination) requires long additional computations (check all the "permutations") to be proven. This is the advantage of having a properly restricted set of patterns such that seeing one of them provides a direct proof of its eliminations.

Understandable proof is OK. But I am more concerned about the quality of the steps and finally of the step sequence. And quality should be expressed in some quality values assigned to the step and the step sequence. Of course quality and complexity are closely coupled - one could even regard it as the same thing.

Website · by **denis_berthier** » Sun Apr 21, 2013 5:05 am

Hi logel,
I don't have much time this morning. Just a remark on your modified definitions.

logel wrote:An Universal Elimination Pattern is a set of candidates along with a set of links connecting the candidates according to basic sudoku constraints, where
(1) exactly one designated candidate is the elimination target ( single target )
(2) the target value is false for all valid permutations of candidate values ( elimination )
(3) all candidates and all links in the set are absolutely necessary for (2) ( minimal set )
(4) there exists no subset of the defining candidate set that is a separate universal elimination pattern ( no cascading )corrected

logel wrote:Most important is the following statement:The pattern core of a universal elimination pattern is equal to the union of all lines of the pattern core. (discarded)
The pattern core of a universal elimination pattern is equal to the union of all those lines, where all line candidates are also members of the pattern core.corrected
That says that all candidates of the pattern core must be also be member of some line that is fully covered by the pattern core.edited
Consequently the pattern core can be relabelled as a set of lines. This is easy to prove.

One can now wonder why you don't include from the start the core (the base set) in the definition instead of using such a convoluted path to reach the same effect (hopefully - because I'm not convinced that your statement, even in its modified form, is true).
In any case, I think your current definition makes it too difficult to check if something is an elimination pattern.

About vocabulary: any of your patterns is a very specific structure. So, the real topic of the thread is not "Universal Elimination Pattern" but "Universal Definition of an Elimination Pattern".

More remarks later.

by **Red Ed** » Sun Apr 21, 2013 7:13 am

eleven wrote:ok, we would say, 1 is restricted to r3c123 and this gives the eliminations.

That's what a human solver would say. However, when enumerating truths/constraints exhaustively (which is my preferred theoretical approach at the moment), it is the presence of holes in those truths/constraints that cause further eliminations to be made.

I just don't know, what that has to do with logel's concept.

Perhaps relatively little, which is why I qualified it "fwiw", but we seem to be still on the subject of definitions so I'm happy mine is out there for consideration if anyone is interested.

by **eleven** » Sun Apr 21, 2013 6:21 pm

Yes, it's interesting (but maybe the wrong thread).
So both a (fundamental) pair and x-wing have level 14, triple and swordfish 18, a quad 20 and an xy-wing 21 ?
And non fundamental patterns (e.g. because of givens) are "easier" ?

by **Red Ed** » Sun Apr 21, 2013 7:11 pm

By "truth/constraint", I mean one of the 324 axioms of the form "there's exactly one X in Y" (e.g. one 3 in box 4). I think I might have seen these referred to as bases somewhere. Anyway: pairs & x-wings would each be at level 2, swordfish at level 3.

The level is not intended to be a measure of difficulty.

by **eleven** » Sun Apr 21, 2013 7:48 pm

I see, so not that interesting

by **Red Ed** » Sun Apr 21, 2013 8:24 pm

by **blue** » Sun Apr 21, 2013 11:26 pm

Red Ed wrote:By "truth/constraint", I mean one of the 324 axioms of the form "there's exactly one X in Y" (e.g. one 3 in box 4). I think I might have seen these referred to as bases somewhere.

FWIW, this is the way I've always seen things:

1) There are statements like: "there must be at least one X in Y" (e.g. at least one 3 in box 4)
2) There are statements like: "there can be no more than one X in Y" (e.g. no more than one 3 in box 4)
3) There are statements like: "there must be exactly one X in Y" (e.g. exactly one 3 in box 4)

Statements of the first type, are related to "base sets", or "strongly linked (candidate) sets", or "strong links" (for short).
Statements of the second type, are related to "cover sets", or "pairwise weakly linked (candidate) sets", or "weak links" (for short). In Alan Barker's vocabulary, they are just "links", or "link sets".
Statements of the third type, are related to either "CSP variables" (e.g. Denis Berthier's vocabulary), or "Truths" (Alan Barker's vocabulary).

Trivia: When both the 1st and 2nd statements must be true, for the same set of candidates, the combined statement (the "logical and" of the two statements") is equivalent to a statement of the 3rd form (for that same set of candidates).

When someone uses the term "Sudoku constraint", for me at least, there's always the question: "what kind of constraint -- 'at least one of', 'no more than one of', or 'exactly one of' ?". It seems that the term is used in to mean different things at different times.

For myself, I like to use terms like weak link constraint(s), strong link constraint, or "exactly one of" constraint, so that there's no ambiguity.

This is just me (again), but to be honest, I suppose I prefer the word "requirement", rather than "constraint", when it's an "at least one of" statement that's implied. I like to reserve the word "constraint" for cases where a "no more than one of" statement is (at least) implied. When it's both -- which is a rare situation for me -- I like to throw an "exactly one of" in front of "constraint". So, for example, I might talk about "satisfying strong link requirements, subjet to weak link constraints", or "satisfying base (sector) requirements, subjet to cover (sector) constraints". Or, I might say that a (normal) DLX solver, deals with "exactly one of" constraints.

Website · by **denis_berthier** » Mon Apr 22, 2013 6:45 am

Hi Blue,
Nice to see you here again.
I think you raise a good question, but it must be taken the other way round: start from the problem and, only after that, see what kinds of logical relations it involves.

The problem is naturally formulated in terms of:
1) 81 rc-cells and of values (digits) that must be assigned to them;
2) 324 constraints the values of these cells must satisfy; these can be classified into 4 constraint types (rc, rn, cn, bn).

In HLS, I showed that it is better to reformulate the problem in terms of 324 2D-cells (of 4 types: rc, rn, cn and bn) represented by the extended Sudoku board. I keep using this vocabulary of 2D-cells when I concentrate on the Sudoku domain (and on games based on a similar grid structure), although I have introduced (in CRT and PBCS) the name CSP-variable when I wanted to extend my approach to any CSP in which no grid may be available. The correspondence is straightforward: the extended Sudoku Board is a graphical representation of the contents of the 324 2D-cells / CSP-variables.

I'd now say that there is a curse of Sudoku: every constraint arises from a CSP-variable (from the extended set), whence the confusion (and all the inconsistent writings about weak and strong links). That's one of the reasons why I started to be interested in other logic games. One can easily find examples in which there are constraints that do not arise from CSP-variables, e.g. N-Queens - or Kakuro, if anyone also reads the discussions on this topic.

Now, translated into logic formulae:
- type 1 is expressed in terms of EOO (exactly one of);
- type 2 is expressed in terms of NAND (not and).
The other type of logical relation you consider ("at most one of") never appears (neither in Sudoku nor in the general CSP).

One important aspect of these predicates:
- type 1 appears as non-binary (most of the time);
- type 2 always appears as binary.

I haven't forgotten the topic of this thread.
The fact that type 1 is (generally) not binary has concrete implications on complexity:
- Sudoku is not a 2-SAT problem and can't therefore be solved by the very efficient 2-SAT solvers.
- for the same reasons, logel means of checking validity of an "elimination pattern" is also not a 2-SAT problem (even if the target is fixed to True). The complexity of checking can't easily be computed in the general case, whereas it can for patterns such as Subsets, g-Subsets, whips and all the oriented chains fauna - which I express by saying that these special (but very general) patterns carry their own proof with them. In my view, a "pattern" that requires a non-obvious checking procedure (testing all the permutations - or all the remaining ones after the target has been fixed to True) does not really qualify as a pattern in any "pure logic" view of solving.

However, I found logel's results interesting when he first spoke of them in the "Pattern-based classification of (hard) puzzles" thread (http://forum.enjoysudoku.com/pattern-based-classification-of-hard-puzzles-t30493.html). Contrary to many people, he didn't come with only vacuous definitions, he had done something concrete with them. In the present thread, he has presented only part of his results, but more are available there.

Finally, I would in no case use the word requirement here. Requirements appear at another level:
- there is generally a requirement that the solution be obtained by "pure logic" or in a "rule-based" or "pattern-based" way (whatever one may mean by this);
- contrary to eleven, I don't think that uniqueness is a constraint of Sudoku (in the above defined sense of constraint). It is not even a requirement for the player (just read any newspaper: uniqueness is never ever mentioned). Uniqueness is a requirement for the puzzle creator, not for the player (indeed the player has no means of ensuring uniqueness if the puzzle has multiple solutions); whether a player chooses to use techniques based on uniqueness is a matter of personal choice and of trust in his puzzle provider; so that a discussion about uniqueness in this thread would be totally OOT.

The New Sudoku Players' Forum

Universal Elimination Pattern for hard puzzles

Re: Cont: Universal Elimination Pattern

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Cont: Permutation based Solving

Re: Complexity of the Universal Elimination Pattern

Universal Definition of an Elimination Pattern

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

Re: Universal Elimination Pattern for hard puzzles

CSP variable vs constraint