Pattern-based classification of (hard) puzzles

Advanced methods and approaches for solving Sudoku puzzles

Re: B?B classification of puzzles with "nothing special"

Postby denis_berthier » Sat Mar 31, 2012 6:23 pm

champagne wrote:The SKloop is a very simple pattern. Few lines can describe it.

It's just that we don't have the same notion of what a definition is. For you, a vague example is good enough.
I don't have the precise reference, but I remember seeing variations of the SK-loop in which the optional candidates indicated in my above-mentioned thread were effectively missing. When I wrote these definitions, no such example was available and nobody had spoken of optional candidates in the pattern.

champagne wrote:Definitely, this thread is not for me.

We already know that theory or even rigour is not your cup of tea. It's your merit to be aware of it.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: B?B classification of puzzles with "nothing special"

Postby ronk » Mon Apr 02, 2012 11:49 am

denis_berthier wrote:
ronk wrote:You seem to have fun writing "precise definitions", so feel free to write one for an exotic pattern of your choice.
http://forum.enjoysudoku.com/x2y2-belts-t5894.html

I was thinking more along the lines of something new, for example, this exotic which you might call an x3y3-belt or maybe even x1y1-belt. Note there is only one link horizontally and vertically between boxes b5689, but three links within each of these boxes.

As for me, I'll call it an sk-loop, a variant within the spirit of that rule.

Code: Select all
98.7.....7.6...8...54......6..8..3......9..2......4..1.3.6..7......5..9......1..4 # 7658;GP;H1521


____Image

Code: Select all
     16 Truths = {5689N47 47N5689}
     16 Links = {7r4 8r7 3c4 6c7 125b5 459b6 249b8 125b9}
     16 Eliminations --> r4c23<>7, r5c69<>5, r7c13<>8, r8c69<>2, r23c4<>3, r69c5<>2, r13c7<>6,
     r69c8<>5
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Re: Pattern-based classification of (hard) puzzles

Postby denis_berthier » Mon Apr 02, 2012 5:35 pm

ronk wrote:
Code: Select all
98.7.....7.6...8...54......6..8..3......9..2......4..1.3.6..7......5..9......1..4 # 7658;GP;H1521

Sure, this would be worth displaying in a way that better shows its beautiful symmetries.

Code: Select all
+-------+-------+-------+
| 9 8 . | 7 . . | . . . |
| 7 . 6 | . . . | 8 . . |
| . 5 4 | . . . | . . . |
+-------+-------+-------+
| 6 . . | 8 . . | 3 . . |
| . . . | . 9 . | . 2 . |
| . . . | . . 4 | . . 1 |
+-------+-------+-------+
| . 3 . | 6 . . | 7 . . |
| . . . | . 5 . | . 9 . |
| . . . | . . 1 | . . 4 |
+-------+-------+-------+ SER=10.8


In relation to this thread, the most interesting point is, this puzzle is in B3B and it remains in B3B even after the 16 eliminations you mention (i.e. r4c23<>7, r5c69<>5, r7c13<>8, r8c69<>2, r23c4<>3, r69c5<>2, r13c7<>6, r69c8<>5)
Do you have any example of the same type, with SER > 11.6 or, on the contrary, with still smaller SER ?


Needless to say, x2y2-chains do not apply to this puzzle.

ronk wrote:As for me, I'll call it an sk-loop, a variant within the spirit of that rule.

These are two contradictory claims: you can call it either an SK-loop or a variant of an SK-loop, but not both.

My computer knows nought of spirits or souls or ghosts - be it that of a rule.
The same could be said of any formal theory:
- either you already have a precise definition and (hopefully) you can prove something about it,
- or you have only vague ideas about a supposed pattern and you can only choose to look for a clear definition or to merely fantasise about it.


It happens that a formal definition D (say x2y2-belts) can be modified into one (D') applying to objects (such as those that your example intends to exemplify) that do not satisfy D but are in some way similar to those satisfying D.

In some cases, D and D' can be generalised to a third definiton D'' superseding both.
The question is, does D'' keep anything of the "spirit" of D?
In the present case, invoking for D'' the general informal xsudo approach as you do (and I'd say the same of my formal theory of Subsets or g-Subsets), can only dilute the original pattern in something so general that the SK-loop "spirit" is completely lost.

The question is, is there any pattern, close to or slightly more general than x2y2-chains, that would apply to this example? And my answer is, as of now I don't know.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Pattern-based classification of (hard) puzzles

Postby ronk » Mon Apr 02, 2012 9:30 pm

denis_berthier wrote:
ronk wrote:As for me, I'll call it an sk-loop, a variant within the spirit of that rule.
My computer knows nought of spirits or souls or ghosts - be it that of a rule.
...
These are two contradictory claims: you can call it either an SK-loop or a variant of an SK-loop, but not both.

Yes, I already knew your computer program knows only what you tell it.

As to variants, think of it this way: The first could be Variant I, the next could be Variant II, etc.

denis_berthier wrote:The same could be said of any formal theory:
- either you already have a precise definition and (hopefully) you can prove something about it,
- or you have only vague ideas about a supposed pattern and you can only choose to look for a clear definition or to merely fantasise about it.

There is nothing vague about my illustration. If you included suitable illustrations with your definitions, perhaps more of us would understand your writings or even finish the readings.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Re: Pattern-based classification of (hard) puzzles

Postby denis_berthier » Tue Apr 03, 2012 2:48 am

ronk wrote:As to variants, think of it this way: The first could be Variant I, the next could be Variant II, etc.

... Variant3456432228I9, Variant345643222820 ....
How many variants would we have to consider before we can say what an SK-loop is?

For me, there is the SK-loop, formalised by an x2y2-chain. And there may be variants, as yet undefined, e.g. what seems to be in the present case franken or mutant SK-loops (in which some of the row or column constraints are replaced by block constraints), ....
As in the fish case, the franken/mutant case is much more complex than the standard case.

ronk wrote:There is nothing vague about my illustration.

What's vague is not your illustration, but what it intends to represent.
With a single example, there may be many ways to formalise it. See my second post in the above-mentioned thread (http://forum.enjoysudoku.com/x2y2-belts-t5894.html), especially questions Q2 and Q3.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Pattern-based classification of (hard) puzzles

Postby ronk » Tue Apr 03, 2012 4:01 am

denis_berthier wrote:
ronk wrote:As to variants, think of it this way: The first could be Variant I, the next could be Variant II, etc.

... Variant3456432228I9, Variant345643222820 ....
How many variants would we have to consider before we can say what an SK-loop is?

Wow, a handful becomes 345 billion! Oh I see, when presented with something reasonable, counter with something ridiculous. IIRC that was frowned upon in my debate class.

you wrote:For me, there is the SK-loop, formalised by an x2y2-chain. And there may be variants, as yet undefined, e.g. what seems to be in the present case franken or mutant SK-loops (in which some of the row or column constraints are replaced by block constraints), ....
...
See my second post in the above-mentioned thread (http://forum.enjoysudoku.com/x2y2-belts-t5894.html), especially questions Q2 and Q3.

My Variant II above is no more franken or mutant than your Variant I ...
... and relative to your Q2, it also has three candidates in each of the sixteen cells.

As I said earlier, I will refer to my "Variant II" as an sk-loop. You will do as you will, I'm sure.
ronk
2012 Supporter
 
Posts: 4764
Joined: 02 November 2005
Location: Southeastern USA

Re: Pattern-based classification of (hard) puzzles

Postby denis_berthier » Tue Apr 03, 2012 4:32 am

ronk wrote:
you wrote:For me, there is the SK-loop, formalised by an x2y2-chain. And there may be variants, as yet undefined, e.g. what seems to be in the present case franken or mutant SK-loops (in which some of the row or column constraints are replaced by block constraints)

My Variant II above is no more franken or mutant than your Variant I ...

I wouldn't set a war on calling your variant franken or mutant or anything else, especially as I don't know yet what your example points to. Considering your reference to xsudo, do you mean that any xsudo structure based on 16 base sets and 16 links, all somehow related to 4 blocks forming a rectangle, should be called an SK-loop?


ronk wrote: and relative to your Q2, it also has three candidates in each of the sixteen cells.

EasterMonster has only 3 candidates per cell; but in x2y2-belts, there can be 2, 3 or 4 candidates in each of these cells. It is the role of the optional candidates in the x2y2-chain pattern to take account of all these possibilities. How did I come to this formal pattern - more general than the examples we had at that time? By writing a tentative proof of the eliminations and seeing what was really needed.

In your variant, a similar analysis should be done in order to see if there must be exactly 3 candidates per cell or how this can be generalised.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Pattern-based classification of (hard) puzzles

Postby denis_berthier » Thu Apr 05, 2012 6:46 am

denis_berthier wrote:
ronk wrote:
Code: Select all
98.7.....7.6...8...54......6..8..3......9..2......4..1.3.6..7......5..9......1..4 # 7658;GP;H1521

In relation to this thread, the most interesting point is, this puzzle is in B3B and it remains in B3B even after the 16 eliminations you mention (i.e. r4c23<>7, r5c69<>5, r7c13<>8, r8c69<>2, r23c4<>3, r69c5<>2, r13c7<>6, r69c8<>5)


Actually, I've found two more interesting facts:
- B2-braids allow the 16 eliminations mentioned by Ronk and only these eliminations.
- S2-braids allow the 16 eliminations mentioned by Ronk and only these eliminations.

Remember that:
- B2-braids (resp. S2-braids) are generalised braids accepting Whips[2] (resp. Subsest[2]) as their right-linking elements;
- as Subsets[2] are subsumed by Whips[2], S2-braids are subsumed by B2-braids and the second result above is stronger than the first.


This explains why the 16 eliminations don't change the B?B classification of this puzzle.
This also shows that Ronk's pattern is indeed very different from an SK-loop. B2-braids (and a fortiori S2-braids) allow no elimination in Easter Monster.

Finally, as the puzzle is in B3B, i.e. it is solvable by B3-braids, one could wonder if it is solvable by S3-braids; but it isn't.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

An example of an S2-braid

Postby denis_berthier » Sat Apr 07, 2012 5:58 am


An example of an S2-braid


As Ronk's puzzle discussed above has no braid or g-braid elimination at the start, it provides a very convenient example of an S2-braid (the immediately next step of complexity in my hierarchy).

The starting PM is as follows:

Code: Select all
+-------------------------+-------------------------+-------------------------+
|9       8       123      |7       12346   2356     |12456   13456   2356     |
|7       12      6        |123459  1234    2359     |8       1345    2359     |
|123     5       4        |1239    12368   23689    |1269    1367    23679    |
+-------------------------+-------------------------+-------------------------+
|6       12479   12579    |8       127     257      |3       457     579      |
|13458   147     13578    |135     9       3567     |456     2       5678     |
|2358    279     235789   |235     2367    4        |569     5678    1        |
+-------------------------+-------------------------+-------------------------+
|12458   3       12589    |6       248     289      |7       158     258      |
|1248    12467    1278    |234     5       2378     |126     9       2368     |
|258     2679    25789    |239     2378    1        |256     3568    4        |
+-------------------------+-------------------------+-------------------------+


At this point, one has the following S2-braid[14]:

S2-braid[14]: b9n3{r9c8 r8c9} - b9n6{r8c9 r789c7} - b9n8{r8c9 r7c789} - {n8r7c5 NP: b8{r7c5 r8c4}{n2 n4}} - r7c6{n2 n9} - r9c4{n9 n3} - {n3r5c4 HP: b5{r5c6 r6c5}{n3 n6}} - b5n7{r5c6 r4c456} - r4c8{n7 n4} - r5c7{n4 n5} - r4c9{n5 n9} - r6c7{n9 .} ==> r9c8 <> 5

with NP = Naked Pairs, HP = Hidden Pairs

Let me now give some explanatory detail. For this, I usually add some marks in the braid:
- for ease of reference, I put a number between brackets after each right-linking candidate or pattern in the braid;
- I use independent numberings for different branches of the braid, each of them starting after the number from which it branches out; here, there is only one main branch (1, 2, ... 11) plus a small secondary branch (2'); this S2-braid is almost an S2-whip.
- I add z- and t-candidates, with symbol "z" for a z-candidate and with symbol "#n" for a t-candidate (with n = the number of the previous right-linking candidate or pattern to which it is linked).

S2-braid[14]: b9n3{r9c8 r8c9(1)} - b9n6{r8c9 r9c8z r789c7(2')} - b9n8{r8c9 r9c8z r7c789(2)} - {n8r7c5 n3r8c4#1 NP: b8{r7c5 r8c4}{n2 n4}(3)} - r7c6{n2 n8#2 n9(4)} - r9c4{n9 n2#3 n3(5)} - {n3r5c4 n3r6c4#5 HP: b5{r5c6 r6c5}{n3 n6}(6)} - b5n7{r5c6 r6c5#6 r4c456(7)} - r4c8{n7 n5z n4(8)} - r5c7{n4 n6#2' n5(9)} - r4c9{n5 n7#7 n9(10)} - r6c7{n9 n5#9 n6#2' .} ==> r9c8 <> 5

Notice that Sp-braids may include g-candidates as right-linking objects. They appear here in cells (2'), (2) and (7).

Notice that, in my definition of braids with embedded patterns, no notion of any almost-pattern (almost pair, ...) is needed: in the context of the braid, they are the patterns themselves (here NP and HP). And a candidate is linked to an inner pattern (either as a next left-linking candidate or as a t-candidate), if it is a target of this pattern, in the usual sense. Here:
- left-linking candidate n2r7c6 in cell 4 is linked to the previous NP (3)
- left-linking candidates n7r5c6 in cell 7 is linked to the previous HP 6)
- t-candidate n7r6c5 in cell 7 is linked to the previous HP (6)
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: B-braids and S-braids resolution power

Postby denis_berthier » Fri Apr 13, 2012 11:29 am


Compared resolution power of B-braids and S-braids


In this old post http://forum.enjoysudoku.com/abominable-trial-and-error-and-lovely-braids-t6390-43.html, I asked: what is the "simplest" family FP of elementary patterns such that all the puzzles are solvable by FP-whips or FP-braids?
What the present thread shows (for all the known puzzles) is, FP=braid[7].
What couldn't be achieved at that time by S-braids is achieved by B-braids with relatively short inner braids.

In the same post, I also gave the number of puzzles in gsf's famous list of 8152 that can be solved by S-braids.

(At that time, I used a more complicated terminology; but the correspondence is easy:
old new
braids[BI] = g-braids
braids[BI+Subsets] = S-braids)


Let me now complete this with the B-braids results I hadn't computed at that time. As Subsets are limited to size 4, in the following table I've limited the inner braids also to size 4.
The presentation is the same as before: gsf's list is cut into slices of 500 puzzles.
Only the first 6000 are interesting for the comparison: the next ones are already solved by braids or g-braids.

Code: Select all
Nb of gsf puzzles solved by the different types of braids, in each slice:

Braids used..........braids............g-braids..................S4-braids/B4-braids
1-500................0.................187.......................443 / 482
501-1000.............0.................178.......................460 / 496
1001-1500............0.................163.......................486 / 500
1501-2000............0.................168.......................490 / 500
2001-2500............0.................135.......................474 / 500
2501-3000............0.................116.......................479 / 500
3001-3500............1.................120.......................473 / 500
3501-4000............0.................113.......................472 / 500
4001-4500............1.................104.......................448 / 500
4501-5000............0.................231.......................482 / 500
5001-5500............47................348.......................500 / 500
5501-6000............434...............490.......................500 / 500
6001-6500............487...............500
6501-7000............494...............500
7001-7500............500
7501-8000............500
8001-8152............152



As mentioned in previous posts, the few puzzles not solved by B4-braids can be solved by B5, B6 or B7 braids.
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Pattern-based classification of (hard) puzzles

Postby logel » Mon Apr 30, 2012 10:16 pm

Hello Denis

Although my Sudoku activity has almost died, I feel tempted to comment your new thread.

(A) Classification
I can understand very well, that one is bored on "yet another rating/classification".
The limits of the existing ones are known. They depend on heuristic parameters, do not scale and reflect the capabilities of special solvers. Every classification complying your terms is a lot of work to implement. For me its absolutely unclear what kind of insight 10 more classifications should bring. I do not expect them to converge to a final classification. And for "statistics" there exists no reference sample of unbiased problems and never will. Because there can be no method to prove a sample to be unbiased.

(B) Simplicity
Everybody looks for more "simple" solutions pathes, but obviously there is no common understanding what is simple. A precise definition (here it is again) would be a good challenge.

champagne wrote:As I wrote in the hardest puzzle thread, a rating tool putting in the short list some puzzles having one "easy" solution would have a low credibility.

Personal opinions about what is "easy" are fine but are not good for arguments and do not lead anywhere.

denis_berthier wrote:SER is the only rating among these 5 that captures the logical complexity of some of the hardest sudokus. The other 4 ratings completely fail on this point.


To make such a statement you seem to know the logical complexity. If you can calculate the complexity I would like to know how you do it, otherwise the above comment on personal opinions apply.


(C) Semantics of rating
Rating is an attempt to approximate the complexity of a sudoku problem. But there are two completely different kinds of complexity.
a) the minimal complexity of finding a solution path
b) the minimal complexity of verifying a solution path

When looking at the discussions, it seems that (a) is meant and expected allways. But (a) is a property of the solving algorithm. A hard rating of complexity (a) would require that ALL possible solving algorithms cannot solve the problem beyond a certain number of steps. There is no hope to calculate such complexity.
In contrary the complexity (b) is nearer to earth, because it depends on the solution pathes only. As far as I can see, the ratings use properties of the solution path, mainly what is regarded as the most complicated step necessary to solve. Although complicated solution pathes probably also require a more complicated finding process, it is is generally incorrect to assume (a) == (b) or even (a) = fkt(b).
Using only a fraction of the solution path is questionable for solutions of "hard" problems with many complicated solution steps. This may be a reason for "failed" ratings shown in this thread.

(D) T&E layer
I can see also three layers. (1) is OK. I would restrict (2) only for those that can be solved with eliminations of the following type. A candidate is verified to be false by assuming the candidate true and applying subsequent single eliminations. This is repeated until a contradiction is detected or no more candidates to eliminate are found. I do not like the term T&E, because lead to misunderstandings for many people. There is maybe try - but definitely no error. I use to call it A&C for assume and check.
My experimental results from a year ago are confiming yours. A very large number of problems fall into this class. Trying to expand the class by introducing more methods like BoxLine, Pairs or Triples into the contradiction path has surprisingly very limited effect.
Pro: It has the effect of reducing the lenght of the contradiction path to half size in most cases.
Contra: It blurres the difference between (2) and (3)

Looking closely at the remaining cases of class (3) one finds that solving requires branched contradiction logic. The critical network contains at least two three-way connections that block progression of the type (2) procedure. Or in other words, all remaining candidates live in networks of three (or more) coupled loops. I found also that a next recursion level is unnecessary. It is sufficient to branch at constraints that are touched by the steps before and are down to a two-way branch and then apply singles to both branches. I tried various toplists from a year ago and determined no exceptions.
Anyway there is no substancial difference between recursion and branching, so I dont understand many peoples animosity against recursive methods. The main deficit of recursive methods is that they usually lead to awfully long and multiple branched contradiction pathes.
User avatar
logel
 
Posts: 55
Joined: 29 April 2012
Location: Germany

Re: Pattern-based classification of (hard) puzzles

Postby denis_berthier » Tue May 01, 2012 3:01 am

Hi logel,

Many comments in a single post. Thanks for your interest. I'll try to answer.

logel wrote:(A) Classification
Every classification complying your terms is a lot of work to implement.

Yep, but I've implemented them. On my website and in another thread (http://forum.enjoysudoku.com/rating-rules-puzzles-ordering-the-rules-t5995.html), very long ago, I have even given very detailed (unbiased) results for the W and B classifications.
Even before implementation, it's also a lot of more theoretical work: you have to define correctly relevant families of rules and to prove that they have the confluence property.

logel wrote:For me its absolutely unclear what kind of insight 10 more classifications should bring. I do not expect them to converge to a final classification.

This is what I've always said: there can't be any unique classification. Any classification has to rest on a predefined family of well defined resolution rules (or on some more or less obscure procedure, or on some abstract "algorithmic complexity" definition, but these would not be compatible with pattern-based resolution and therefore not "final" for Sudoku addicts).
Each family of resolution rules carries its own classification. Classification results obtained with it give an idea of the resolution power of this family and of each of its elements.
In my view, a given classification is also a scale against which other rules can be measured.
Unfortunately, for many families of rules (e.g. AICs), no study of their resolution power has ever been conducted; as a result, their discussion has been mainly about examples and personal opinions on reversibility, ...

In addition to this, although there is no unique rating, my approach leads (at least me) to a clear understanding of the basic principles on which any rating should be built.
One of the principles I haven't mentioned above - because it is more a common point of all my other rating definitions (W, B, gW, gB, S, SB, BB, ...) than an a priori on rating systems - is, the rating of a rule should be based on the number of CSP-variables (of 2D cells, if you prefer) necessary to formulate it.

logel wrote:And for "statistics" there exists no reference sample of unbiased problems and never will. Because there can be no method to prove a sample to be unbiased.

You are (almost) wrong on this point. I've defined a controlled-bias generator (see my website or here: http://forum.enjoysudoku.com/the-real-distribution-of-minimal-puzzles-t30127.html). A first, quick implementation was done by Eleven as a simple modification of suexg and then it has been progressively improved by (direct or indirect) contributions from many people: Eleven, Paul Isaacsson, gsf, Brian Turner... I've run it on several PCs in parallel (the equivalent of several months of CPU time) and I obtained a collection of ~ 7,000,000 independent puzzles, with known bias. For statistics, "with known bias" is almost as good as "unbiased", provided you use the proper correction formulae.
All these puzzles are in T&E(1).
What's true is, there's no known unbiased (or with known bias) collection for the hardest (in the broad sense of: in T&E(2)). Probably the best approach to this is Eleven's collection of 13,000,000 (or is it 15,000,000?) T&E(2), though it is not proven to be unbiased. AFAIK, it isn't public (?)



logel wrote:(B) Simplicity
Everybody looks for more "simple" solutions pathes, but obviously there is no common understanding what is simple. A precise definition (here it is again) would be a good challenge.

This is just the second side of the previous question. There can't be a unique definition of simplicity.

logel wrote:
denis_berthier wrote:SER is the only rating among these 5 that captures the logical complexity of some of the hardest sudokus. The other 4 ratings completely fail on this point.

To make such a statement you seem to know the logical complexity. If you can calculate the complexity I would like to know how you do it, otherwise the above comment on personal opinions apply.

OK, you caught me. I meant the logical complexity defined by the B?B classification I introduced in this thread. I thought it was clear from the context.



logel wrote:(C) Semantics of rating
Rating is an attempt to approximate the complexity of a sudoku problem. But there are two completely different kinds of complexity.
a) the minimal complexity of finding a solution path
b) the minimal complexity of verifying a solution path

When looking at the discussions, it seems that (a) is meant and expected allways. But (a) is a property of the solving algorithm. A hard rating of complexity (a) would require that ALL possible solving algorithms cannot solve the problem beyond a certain number of steps. There is no hope to calculate such complexity.
In contrary the complexity (b) is nearer to earth, because it depends on the solution pathes only. As far as I can see, the ratings use properties of the solution path, mainly what is regarded as the most complicated step necessary to solve. Although complicated solution pathes probably also require a more complicated finding process, it is is generally incorrect to assume (a) == (b) or even (a) = fkt(b).

It's true that rating has always been the rating of solving (unless I'm missing something, verifying a resolution path is obvious) and it has always been the rating of the hardest step (mainly because nobody knows how to do otherwise).
However, in my approach, it is NOT a property of an algorithm. Although it is explicitly dependent on a family of resolution rules, it is purely logical. And, when it is satisfied by the family of rules, the confluence property guarantees that it does not depend on which particular resolution path using these rules combined with a simplest-first strategy is chosen; otherwise, all the paths would have to be tried.
Even my 3-level T&E classification, apparently based on the T&E procedure, is indeed purely logical, as it corresponds to being solvable respectively by Singles, braids or B-braids.

Every puzzle has an intrinsic rating (possibly infinite) for each family of rules. What's noticeable is, most of the time, these ratings (W, B, gW, gB, W+S, ...) are equal.

logel wrote:Using only a fraction of the solution path is questionable for solutions of "hard" problems with many complicated solution steps. This may be a reason for "failed" ratings shown in this thread.

Clearly, no, it can't be the reason. All the ratings considered in this comparison are based on the same "hardest step" approach.



logel wrote:(D) T&E layer
Trying to expand the class by introducing more methods like BoxLine, Pairs or Triples into the contradiction path has surprisingly very limited effect.

T&E(S), T&E(S+BI), T&E(S+BI+Pairs), T&E(S+BI+Pairs+ Triplets), ... are increasingly more powerful (as shown by my old classification results in this thread: http://forum.enjoysudoku.com/abominable-trial-and-error-and-lovely-braids-t6390.html)
But it's true that it has little effect (statistically) on the rating of puzzles in T&E(1) as shown by the other old results in this thread: http://forum.enjoysudoku.com/rating-rules-puzzles-ordering-the-rules-t5995.html

logel wrote:Looking closely at the remaining cases of class (3) one finds that solving requires branched contradiction logic. The critical network contains at least two three-way connections that block progression of the type (2) procedure. Or in other words, all remaining candidates live in networks of three (or more) coupled loops. I found also that a next recursion level is unnecessary. It is sufficient to branch at constraints that are touched by the steps before and are down to a two-way branch and then apply singles to both branches. I tried various toplists from a year ago and determined no exceptions.
Anyway there is no substancial difference between recursion and branching, so I dont understand many peoples animosity against recursive methods. The main deficit of recursive methods is that they usually lead to awfully long and multiple branched contradiction pathes.

Being in T&E(2) is equivalent to being solvable by B-braids, which means that contradictions arising from the truth of two candidates have to be considered instead of only contradictions arising from a single candidate - whichever method one uses for this.
Some people try to hide a recursion level either by tagging everything or by extracting from xsudo a few more or less readable steps, displaying them with all the bling of grandiose graphics and hiding all the other illegible steps.

I don't consider recursion and branching as being exactly the same thing:
- recursion can only be defined in a procedural way (even so, my conjecture, still verified, that all the puzzles are in one of the 3 T&E classes means that only one level of recursion is necessary).
- various types of branching can be defined in a pure logic way.
Showing that some type of recursion is "equivalent" to some type of branching (i.e. that the result of some procedure can also be the conclusion of an associated resolution rule, e.g. that a T&E(2) solution can always be replaced by B-braid logic solution) may be a lot of work (a very strange thing for a 1st of May!)
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Pattern-based classification of (hard) puzzles

Postby logel » Tue May 08, 2012 11:06 pm

denis_berthier wrote:
logel wrote:For me its absolutely unclear what kind of insight 10 more classifications should bring. I do not expect them to converge to a final classification.

This is what I've always said: there can't be any unique classification. Any classification has to rest on a predefined family of well defined resolution rules (or on some more or less obscure procedure, or on some abstract "algorithmic complexity" definition, but these would not be compatible with pattern-based resolution and therefore not "final" for Sudoku addicts).
Each family of resolution rules carries its own classification. Classification results obtained with it give an idea of the resolution power of this family and of each of its elements.
In my view, a given classification is also a scale against which other rules can be measured.

This was not exactly my point. Its obvious that different classifications yield different result. But what then?
I can can see no good concept to compare them besides some "statistics" on more or less biased samples. That looks not very attracting, at least for me.

denis_berthier wrote:
logel wrote:(C) Semantics of rating
Rating is an attempt to approximate the complexity of a sudoku problem. But there are two completely different kinds of complexity.
a) the minimal complexity of finding a solution path
b) the minimal complexity of verifying a solution path
.....

It's true that rating has always been the rating of solving (unless I'm missing something, verifying a resolution path is obvious) and it has always been the rating of the hardest step (mainly because nobody knows how to do otherwise).
However, in my approach, it is NOT a property of an algorithm. Although it is explicitly dependent on a family of resolution rules, it is purely logical.

Yes, the hardest step is of course not a direct property of the algorithm. Its a property of the resulting solution path. You cannot deduce the complexity of solving/finding from a result property in the general case. E.g. the number of prime factors has absolutely no relation the the complexity of finding them. So if there is a relation it needs some more tough arguments for the sudoku case.

denis_berthier wrote:Every puzzle has an intrinsic rating (possibly infinite) for each family of rules. What's noticeable is, most of the time, these ratings (W, B, gW, gB, W+S, ...) are equal.

As fas as I can see the goes back to the fact that these ratings share the same basic idea. They count the maximum number of true nodes in the patterns, which is a very natural approach.

Altough I have great sympathy with your systematic and theory based concept, rule sets reflect only one dimension of the sudoku universe. You started with nrct-chains and arrived after some extension-extensions at what you call braids. A common part of these concepts is "apply simple rules first". This seems also natural. There had been a discussion on "order of rules" in another thread with seemingly no result. You escape this problem because the braids are ordered by an intrinsic hierarchy. But this hierarchy implies an a-priori concept of simplicity and make the results uncomparable with other results.

Comments frequently mix simplicity of rules and solutions, if simplicity is an argument. Its a common misunderstanding in the sudoku community that simple rules always lead to a simple solution path.
Even with very sloppy definitions on what is "simple" this is not the case. I try to give some arguments.
The level T&E(1) already includes queues of any length. On top of that there is no guarantee that the total number of steps is minimal. T&E just says that there exits a solution without a second recursion. The "simplest-rule-first" approach pays the price of sometimes long and complicated solution pathes. The simplest applicable rule makes no "strategic" eliminations that enables further steps. Its like shooting on easy targets and look whats happening. You often see many useless eliminations that have no continuation.
Optimization of solutions is impossible without a target function. Any target function (or more of them) should not depend on rules, but rate the solutions from their primary properties. This obviously is not the goal of your theory but nevertheless important. I made some unsuccessful attempts to define a comparison function, but if I find some time for sudoku this is what seems most attractive to me.
User avatar
logel
 
Posts: 55
Joined: 29 April 2012
Location: Germany

Re: Pattern-based classification of (hard) puzzles

Postby denis_berthier » Wed May 09, 2012 4:27 am

logel wrote:
denis_berthier wrote:
logel wrote:For me its absolutely unclear what kind of insight 10 more classifications should bring. I do not expect them to converge to a final classification.

This is what I've always said: there can't be any unique classification. Any classification has to rest on a predefined family of well defined resolution rules (or on some more or less obscure procedure, or on some abstract "algorithmic complexity" definition, but these would not be compatible with pattern-based resolution and therefore not "final" for Sudoku addicts).
Each family of resolution rules carries its own classification. Classification results obtained with it give an idea of the resolution power of this family and of each of its elements.
In my view, a given classification is also a scale against which other rules can be measured.

This was not exactly my point. Its obvious that different classifications yield different result. But what then?
I can can see no good concept to compare them besides some "statistics" on more or less biased samples. That looks not very attracting, at least for me.

It seems you haven't understood my previous answer about the controlled-bias generator: it does allow to get totally unbiased results.
As for comparisons that can only be based on statistics, one may like it or not, but this is unavoidable. All these ratings (as all the other known ratings) are meaningful only statistically. But statistics are based on perfectly sound mathematical concepts.

Now, if you want a unique rating, you can use the BB rating. It is finite for all the known puzzles and my T&E(2) conjecture says that it is finite for all the puzzles.
With my strongest B7B conjecture, you can even use the simplest B7B rating.
Personally, I prefer using several ratings.


logel wrote:
denis_berthier wrote:
logel wrote:(C) Semantics of rating
Rating is an attempt to approximate the complexity of a sudoku problem. But there are two completely different kinds of complexity.
a) the minimal complexity of finding a solution path
b) the minimal complexity of verifying a solution path
.....

It's true that rating has always been the rating of solving (unless I'm missing something, verifying a resolution path is obvious) and it has always been the rating of the hardest step (mainly because nobody knows how to do otherwise).
However, in my approach, it is NOT a property of an algorithm. Although it is explicitly dependent on a family of resolution rules, it is purely logical.

Yes, the hardest step is of course not a direct property of the algorithm. Its a property of the resulting solution path. You cannot deduce the complexity of solving/finding from a result property in the general case. E.g. the number of prime factors has absolutely no relation the the complexity of finding them. So if there is a relation it needs some more tough arguments for the sudoku case.

I can't see what you are looking for. AFAIK, the number of its prime factors has never been defined as a measure of complexity of a number.
If you are looking for an intrinsic measure of complexity of a puzzle, I can't follow you on this track; I think it is hopeless. As I said previously, complexity is tied to a set of resolution rules (or to a procedure, but it doesn't change the problem).


logel wrote:
denis_berthier wrote:Every puzzle has an intrinsic rating (possibly infinite) for each family of rules. What's noticeable is, most of the time, these ratings (W, B, gW, gB, W+S, ...) are equal.

As fas as I can see the goes back to the fact that these ratings share the same basic idea. They count the maximum number of true nodes in the patterns, which is a very natural approach.

Natural ways of measuring complexity, sure. But it isn't a priori obvious at all that they should give the same ratings most of the time. There is a large structural gap between whips and braids.


logel wrote:Altough I have great sympathy with your systematic and theory based concept, rule sets reflect only one dimension of the sudoku universe. You started with nrct-chains and arrived after some extension-extensions at what you call braids. A common part of these concepts is "apply simple rules first". This seems also natural. There had been a discussion on "order of rules" in another thread with seemingly no result. You escape this problem because the braids are ordered by an intrinsic hierarchy. But this hierarchy implies an a-priori concept of simplicity and make the results uncomparable with other results.

In and of themselves, the rules are independent of the simplest-first strategy. I wouldn't say I escape the problem of rule ordering; on the contrary, I solve it. The simplest-first strategy doesn't come out of a theoretical vacuum; it is allowed by the confluence property.
My concept of simplicity/complexity is defined a priori in relation to a family of rules, BUT it is relative to this family. I have no concept of absolute/intrinsic simplicity of a puzzle.
However, all this doesn't prevent any comparison with other sets of rules:
- with my general theory of Subsets and g-Subsets (which include all the Franken and Mutant Fish and all that xsudo would display as 0-rank patterns), it is very easy and natural to compare whips/braids with such patterns;
- with my theory of the sk-loop as an x2y2-chain, comparison is also obvious.
If you think a particular pattern couldn't be compared, let me know it and I bet I'll be able to show that it can.

You could make a comparison with SER (which relies on no theoretical principles at all). It is mainly based on a family of contradiction chains of various types (never defined but by their Java code). All the rules at lower levels (Subsets, uniqueness, etc.) are given arbitrary ratings and they have almost no impact on the rating of the hard puzzles. One could make similar comments on any solver able to solve any puzzle.

As for the theoretically a priori aspect of my ratings, it is supported by a posteriori statistics, showing that, in the mean, the number of partial chains that must be considered by the simplest-first strategy (which is undoubtedly an intuitive notion of complexity) increases exponentially with the rating.


logel wrote:Comments frequently mix simplicity of rules and solutions, if simplicity is an argument. Its a common misunderstanding in the sudoku community that simple rules always lead to a simple solution path.
Even with very sloppy definitions on what is "simple" this is not the case. I try to give some arguments.
The level T&E(1) already includes queues of any length. On top of that there is no guarantee that the total number of steps is minimal. T&E just says that there exits a solution without a second recursion. The "simplest-rule-first" approach pays the price of sometimes long and complicated solution pathes. The simplest applicable rule makes no "strategic" eliminations that enables further steps. Its like shooting on easy targets and look whats happening. You often see many useless eliminations that have no continuation.
Optimization of solutions is impossible without a target function. Any target function (or more of them) should not depend on rules, but rate the solutions from their primary properties. This obviously is not the goal of your theory but nevertheless important. I made some unsuccessful attempts to define a comparison function, but if I find some time for sudoku this is what seems most attractive to me.

Our goals are very different.
My goal is mainly to evaluate the scopes of different increasing families of rules. I've never tried to see how a player can do "better" (in any sense you want) than the simplest-first strategy.
Your goal of optimising the full resolution path is more ambitious (IMO unrealistic). Several people have already attempted this, without any success. I think you need some clear starting principles, if you don't want to come out with some arbitrary "target function" or to waste much time for nothing.
Unfortunately, I have strictly no idea on what these principles should be. Basically, I don't see how a path with a single complex pattern and a path with 4 or 5 slightly less complex ones can be compared. Do you have any idea about this?
denis_berthier
2010 Supporter
 
Posts: 1253
Joined: 19 June 2007
Location: Paris

Re: Pattern-based classification of (hard) puzzles

Postby logel » Wed May 16, 2012 10:53 pm

denis_berthier wrote:It's true that rating has always been the rating of solving (unless I'm missing something, verifying a resolution path is obvious) and it has always been the rating of the hardest step (mainly because nobody knows how to do otherwise).
However, in my approach, it is NOT a property of an algorithm. Although it is explicitly dependent on a family of resolution rules, it is purely logical.

I would translate to: Current rating is rating of eliminations in the first place. But each elimination consists only of a series of secondary eliminations on the next recursion level. These eliminations are usually not rated by the hardest step, but by the length of the backbone queue or similar. This seems paradox and inconsistent, although the hardest step is an important property of the Sudoku problem.
logel wrote:For me its absolutely unclear what kind of insight 10 more classifications should bring. I do not expect them to converge to a final classification.

denis_berthier wrote:This is what I've always said: there can't be any unique classification.

logel wrote:This was not exactly my point. Its obvious that different classifications yield different result. But what then?
I can can see no good concept to compare them besides some "statistics" on more or less biased samples. That looks not very attracting, at least for me.
denis_berthier wrote:It seems you haven't understood my previous answer about the controlled-bias generator: it does allow to get totally unbiased results.
As for comparisons that can only be based on statistics, one may like it or not, but this is unavoidable. All these ratings (as all the other known ratings) are meaningful only statistically. But statistics are based on perfectly sound mathematical concepts.

No, I perfectly understand your position. I can compare correlations between statistics, but I have no idea what is to learn from the comparison result.
Another point is that your concept is going for simple rule sets and is not aiming at simple solution paths ( whatever notion of simplicity one might use). On the road to simple resolution rules you already arrived at the bottom, as obviously T&E(2) with only single eliminations can solve any 9-9-Sudoku. Any other rule set just hides recursion steps inside the rule definitions and therefore shift some Sudoku to T&E(1).

The nice example of ronk some posts before can illustrate this.
ronk wrote:As for me, I'll call it an sk-loop, a variant within the spirit of that rule.

Code: Select all
98.7.....7.6...8...54......6..8..3......9..2......4..1.3.6..7......5..9......1..4 # 7658;GP;H1521


____Image


By the way, any side blows with spirits, ghosts and precise definitions are uncalled-for. The only important question is, whether this pattern has any value for a solution.
The xsudo diagrams are pretty but do not help much understanding the logic. At first the ravel will be converted in a more structured but less colourful version.
Code: Select all
     3 125 7 459 6 125 8 249
     c bbb r bbb c bbb r bbb
     4 555 4 666 7 999 7 888

r5c4 X X X
r6c4 X  XX
r4c5   XX  X
r4c6    XX X

r4c8       X XX
r4c9       X  XX
r5c7         XX  X
r6c7          XX X

r8c7             X XX
r9c7             X  XX
r7c9                XX X
r7c8               X X X

r7c5                   X XX
r7c6                   X X X
r9c4 X                   X X
r8c4 X                   XX

You can classify this as an 16*16 Dirichlet matrix containing exactly 16 true candidates and exactly one in each row or column. This justifies all eliminations.
But there is more. Sudoku matrices of that size always have substructures. This one decomposes into 4 blocks of 4*4 smaller matrices connected by 2-rings. So its some kind of loop of loops. This is the only similarity with the (original) SK-loop. If really required, one sure can find a generalization and a rule definition that covers both.
A contradiction path starting from 3s in column 4 (and any other) has no continuation with singles, even another recursion stops. So related to your classification, this pattern is T&E(3) and therefore has no value. You can find other patterns in T&E(2) doing the job. The price to pay is a longer sequence of elimination.
On the other hand there is a contination resulting in contradiction with a 4*4 matrix repeated four times.
If you enhance your rule set with pairs, this pattern will shift to T&E(2), enhancing with 4*4-patterns to T&E(1). This is meant by "hiding" recursion levels. In your terminology its solvable by S2-braids or B2-braids. The matrix shows why and is more visual.

Conclusion: The pattern ronk displays may have a value for solution. I dont share your view that a T&E(2) pattern is generally more complex than a T&E(1) pattern, only because it contains one or two extra branches. The braid hierarchy B(n)B is fine for defining sets of solvable Sudoku, but questionable as a complexity scale.
User avatar
logel
 
Posts: 55
Joined: 29 April 2012
Location: Germany

PreviousNext

Return to Advanced solving techniques