The New Sudoku Players' Forum

by AW » Wed Feb 14, 2007 5:36 am

First off, let me state my question. Then, I will go on at length about the question's context, which will hopefully serve to clarify the question a little. Indeed, I understand that the question is very general and rather vague. It is this : "In sudoku, by what other ways can we find that (P implies Q) ?"

The question follows from a neat little thing that happened last night, after I finally had enough time to finish recoding my little sudoku solver. It came up with two nice little lines of output :

When confronted with :
100007090030020008009600500005300900010080002600004000300000010040000007007000300

Solver answered with :
162857493534129678789643521475312986913586742628794135356478219241935867897261354

And when confronted with :
001002000030040050500600300007000006080000040200000100006009007040070080000500900

Solver answered with :
871352469632948751594617328457231896189765243263894175326489517945173682718526934

Although many of you may consider this somewhat "dépassé", I consider it something of an achievement given the original approach Solver uses. Those of you who know these puzzles by their little names will have recognized "ArtoI's AI Etana" and "Ocean Anemone", and perhaps even noticed that Solver's answer was correct.

Sadly, Solver still fails to make good on tougher puzzles. Hence, my question. Now, to properly answer it, one must know a little more about how Solver goes about his business.

Solver is deceptively stupid. However, one thing Solver is not allowed to do, is "take a guess". Solver may only play moves which he knows, without a shadow of a doubt, to be correct, and Solver is never ever allowed to take back a move he has made.

Solver uses one main routine, and two simple upkeep routines. The first simple upkeep routine is to check for patterns which most if not all of you are very familiar : finding out naked or hidden singles, pairs, triples, quads, and finding out x-wings, swordfish and jellyfish. Solver considers these to be valid (and I daresay he's quite allowed to!), even though most of these patterns are completely alien to his main routine. These patterns, once checked, just make the main routine a little shorter to go through.

Solver's main approach, seeing as he is so deceptively stupid, is nothing less than brute-force. That's because Solver can't really tell whether what he's doing is really useful or not. It doesn't matter much to him how long it takes, as long as he gets there. All he needs to know is that a move is allowed. And even though Solver's approach is definitely brute-force, it is nonetheless purely deductive. And, mind you, Solver does come up with an answer in a sufficiently short amount of time for my tastes (i.e., under a minute).

Before I go into how Solver works, I need to take a little time to explain how Solver sees the problem of sudoku. Many of you may already be familiar with Solver's simplistic vision, which can be summarized by these two main tenets :

(A)
Solver considers the 9x9 puzzle to be a question about 729 "candidates", each "candidate" being a proposition (for lack of a better term) that says "value X placed in the cell (row Y, column Z) is true". Solver's objective is to show which 81 of these propositions are actually true.

(B)
Solver considers that these candidates are under 324 basic "constraints". 81 of these constraints are "row-value" constraints, 81 are "column-value", 81 are "box-value" and 81 are "cell" constraints. Each of these basic constraints governs, at the offset, 9 candidates. And each of these constraints has this to say about its nine candidates : "exactly one of my candidates is true" (for example : of the nine candidates that have the value X along row Y, exactly one is true). Therefore, another way of putting Solver's problem is, to show which candidate is actually true for all 81 constraints.

Now then, here is how Solver goes about his business :

1. After having run through his first simple upkeep routine stated above, Solver's introdution consists in identifying all the implications that are known "directly" from the constraints. These are :

1.a) Each candidate trivially implies itself. Thus, Solver notes for all remaining candidates, (X implies X) and (~X implies ~X).

1.b) Any two candidates that share a constraint exclude each other. This is given directly from the definition of the basic constraints : exactly one of the candidates is true. Therefore, for any two candidates X and Y that share some constraint, (X implies ~Y).

1.c) (edit...) Forgot to mention the very important point that Solver also notes, for any constraint having only two unsolved candidates X and Y remaining, that (~X implies Y).

(Although I may forget to mention it occasionnaly, let it be known that when Solver identifies any implication, he always also take note of the "contrapositive". The contrapositive of (X implies Y) is (~Y implies ~X).)

And now, Solver goes into his main routine :

2. "Collapse" all the implications he's found out about so far. Basically, at this point, Solver simply infers from his known combinations of ((X implies Y) and (Y implies Z)) what necessarily follows, that (X implies Z). Solver also goes through all the inverse/converse/whateververse of this, such as that from ((~X implies ~Y) and (~Y implies Z)), it necessarily follows that (~X implies Z). He does this until he can no longer identify any new relationships.

3. Use the force, Solver : given all the implications Solver has worked out, Solver makes any and all direct deductions and eliminations he can. These are all of the form : from ((X implies Y) and (~X implies Y)) it follows that (Y), and from ((X implies ~Y) and (~X implies ~Y)), it follows that (~Y).

(Those of you familiar with this approach will know that at this point, solver has worked out all "simple" chains. "Simple" in this case involves a lot of somewhat complicated stuff, such as many types of colouring for instance.)

If Solver finds anything worthwhile during step 3, he will execute his simple upkeep routine and then start back at 2. Otherwise, he moves on to...

4. Solver then checks for something unusual, which is actually a very limited generalization of step 3 : since he knows that exactly one candidate in a constraint is true, he can deduce readily that if a candidate X exists, such that X "sees" all the candidates Y of some constraint, then X must be false. "Sees" here simply means Solver has worked out that (X implies ~Y) for all those candidates. Solver also checks whether he can do the same thing with (~X implies ~Y), which will prove that X is true.

(This happens to be a generalization of what is often called "line-box reductions", but of course taking in much more than just lines and boxes. By the time Solver gets here, he may know that for a given Y, (X implies ~Y) only through some arbitrarily long chain of implications.)

Again, if something worthwhile happens, it's upkeep time and back to 2. Otherwise...

Here, Solver does something very unnecessary, but very cute. Which helps him tame the rather large number of combinations he's dealing with. It's the simple second upkeep routine, where Solver gets rid of redundant constraints and candidates. This operation does not add any information to what Solver knows, and therefore doesn't help directly, so I won't go into detail about it. It sets the ground for some serious candidate vs. constraint checking...

5. If Solver gets here, which he does (more than once, even!) for the two puzzles mentioned above, he's very desperate! Solver is now looking for candidate pairs for which he can identify a hitherto unknown relationship. This is actually a variant of 4, in which solver attemps to lock out a full constraint, but this time by using two candidates instead of one. While he's doing this, he will also identify a simpler form which presents itself at the same time. The simpler form is :

5.a) If some candidate X "sees" all but one candidate Y of some constraint, then Solver may infer that (X implies Y). A corresponding inference is also allowed if (~X) "sees" all but one candidate, for then (~X implies Y).

5.b) The more complicated case is when solvers finds that X "sees" more than one candidate in some constraint, but there are also more than one candidate that X does not "see". Solver then looks for another candidate, Y, such that Y "sees" at least all those candidates that X does not see. When he finds such a pair, Solver knows that (X implies ~Y). Solver also checks the other combinations, by trying to cover the constraint with (~X and ~Y), (X and ~Y) and (~X and Y), all of which lead to an inference, respectively : (~X implies Y), (X implies Y) and (~X implies ~Y).

If Solver finds anything, again, it's upkeep and back to 2.

Otherwise, Solver grumpily gives up the match.

But Solver would really really like to find more and different ways about how he can determine that (P implies Q). Could you help him ?

I like to delude myself into thinking that Solver deserves a little help, given what he has managed to achieve with so little!

And I will, of course, be glad to elaborate on what I've already said, rather quickly, if there are any questions about that.

by **ravel** » Wed Feb 14, 2007 12:59 pm

Hi AW,

your description reminds me on Trebor's Tables. Maybe you can point out, what there is the difference to make it clearer.

btw, have you already tried the new "Explainer cracking puzzle" by m_b_metcalf ?

Code: Select all: 500000009020100070008000300040702000000050000000006010003000800060004020900000005

by AW » Thu Feb 15, 2007 1:39 am

Thanks for the link, ravel. You're quite right, Trebor's tables uses the same approach. We're clearly not using the same language, but the general idea is similar. Remarkable thing, that he chose to "expand" where I chose to "collapse"!

Anyhow, the method described seems roughly the same up to step 4, which seems to correspond to "veracities". Step 5(a) may correspond to "Negative assertions" or close to it.

Something Solver does not explicitly do, is the following :

AMcK :
I also tried the suggested "unverity" and "unveracity" algorithm where any deductions common to at least two negative candidates in a cell or possibles in a house must be true. [Proof: since only one candidate can be true, at least one of any pair must be false and so any common implications must be from at least one correctly false assumption].

I had to think about that one for a bit, but it is implicitly checked through equivalent patterns. Taking the three implications alluded to : (~A implies C), (~B implies C) and (A implies ~B), Solver will work out through the contrapositive (B implies ~A) that both (B implies C) and (~B implies C) and therefore just plain (C).

I would say the big difference is in 5(b). I may have missed it, but nothing looks quite like it in that thread. And 5(b) is my first step at attacking Solver's blind side. As was discussed in another thread, (A implies B) needs to go beyond "simple" chains, of the ((A implies ~B) and (~B implies C)) kind, made up of simple "B"s.

I tried the puzzles posted int the tables thread. Solver got through all of them, including the rather difficult one (it took a few iterations) :

000070940070090005300005070087400100463080000000007080800700000700000028050268000

Also, there may be another big difference tonight, as I think I've found an interesting sword-fish like pattern to look for. I"m actually quite enthusiastic about it. It's an extension of 5(b) that uses a second constraint and some free candidates. We'll see if it works out.

Just for kicks, I gave Solver a run at that "Explainer cracking puzzle" by m_b_metcalf. I think it just set a new record for how fast Solver can give up on a grid

by AW » Fri Feb 16, 2007 2:51 am

With the new check, Solver answers...

514673289326189574798425361149732658687951432235846917453217896861594723972368145

I shall elaborate after dinner !

by AW » Fri Feb 16, 2007 4:19 am

The sword-fish-like pattern is an extension of the other patterns.

Basic implication markings (Step 1), only allow solver to cash in on "conjugate pairs" (i.e., constraints that have only two candidates left).

Step 4 checks for one constraint versus one candidate. This is an extension of the "conjugate pair" approach, setting it to work for constraints that have more than two candidates. If all the candidates in some constraint imply the same thing with respect to some candidate, then this thing must be true. This follows directly from the constraint definition, that exactly one must be true, thus exactly one of the implications will be verified.

Step 5a works adds a little more to this by drawing away from a full constraint lock. It checks that, for some constraint {A,B,C,...}, if some candidate X implies all of {B,C,...} are false, then the only way to verify the constraint, given X, is that A is true. Therefore, (X implies A). A corresponding check is also made where ~X implies all of {B,C,...} are false, which proves (~X implies A).

Step 5b goes a little further again. It allows for not just one free candidate, but any number of free candidates. However, there then needs to be a second candidate to lock the free candidates. Say, for constraint {A,B,C,D,E,...} we have X implies {A,B,C} are false, and Y implies {C,D,E,...} are false (X and Y can both cover some of the same candidates). Therefore, (X and Y) is impossible as that would show the constraint to be invalid, thus (X implies ~Y). There are four cases in all to check, from the combinations of X,~X and Y,~Y.

Each step is an extension of the previous one, slowly moving A away from B, as it were.

And now, the new pattern takes this further again, drawing A away from B a little more. What if A and B do not cover the constraint entirely? There will be a third group of candidates, those not covered by either A or B. And how will these be locked? By using a second constraint. Here's the comments I put in my code to jog my memory two weeks from now...

Code: Select all: // The sword-fish-like pattern this looks for is (A and B are candidates, x and y are constraints) : // // x1---x3---x2 // / / / // A / B // / / / // y1---y3---y2 // // Where : (A implies ~x1) and (A implies ~y1) // (B implies ~x2) and (B implies ~y2) // (~x1 and ~x2) implies x3 // (~y1 and ~y2) implies y3 // x3 implies ~y3 // ----------------------------------- // ~x3 implies (x1 or x2) // (x1 xor x2), actually // (x1 or x2) implies (~A or ~B) // ----------------------------------- // x3 implies ~y3 // ~y3 implies (y1 or y2) // (y1 xor y2), actually // (y1 or y2) implies (~A or ~B) // ----------------------------------- // ~A or ~B // ----------------------------------- // A implies ~B // B implies ~A // // x1, x2, x3, y1, y2, y3 can be groups of candidates, or just one candidate. // Permutations : A can imply ~x1 either by TF (A implies ~x1) or FF (~A implies ~x1), likewise for B. //

I wrote doen the derivation, just in case I forget it and start to think it's invalid some day. As can be seen, the pattern reduces to 5b when x3 is empty, and there is no longer any need for y. It reduces to 5a when x3 is empty and x2 contains only one candidate (B is then unnecessary). And to 4 when both x2 and x3 are empty.

I shall try a few more toughies to see how well Solver fares.

Then I will try to make Solver's data dumps comprehensible, so I can understand how it came to its conclusions. Sadly, being the nasty theoretical programmer that I am, user-friendliness tends to come last when I'm left to my own devices.

by AW » Fri Feb 16, 2007 4:53 am

Sovler comes through for the following grids, amongst others :

dml161
Ocean's Christmas present for gsf

And this one, "another from m_b_metcalf at SE >= 11.4" :
300000004080200070006000500010908000000060000000007020005000600090001080400000003

However, it fails with these two :

Ocean's New Year's present for RW
dml 1/07

I'm definitely going to try and find out why. They must have something very special about them!

by **ravel** » Fri Feb 16, 2007 9:29 am

AW wrote:Ocean's New Year's present for RW
dml 1/07
...They must have something very special about them!

Yes, they are the leaders in my current top list

Seriously, i would have expected, that Mike's puzzle would be the hardest for this approach, because from the chaining point of view it needs a harder step than those two (but is easier solved after that hard step). So i am curious, if you can find an explanation for that.

by AW » Sat Feb 17, 2007 4:48 am

{first paragraph inspired by broken program removed

}

Seriously, i would have expected, that Mike's puzzle would be the hardest for this approach, because from the chaining point of view it needs a harder step than those two (but is easier solved after that hard step). So i am curious, if you can find an explanation for that.

Interestingly, harder for chaining doesn't make much of a difference for Solver. Although Solver is definitely at home with simple chains, it goes well beyond those. The sword-fish-like pattern, for instance, involves two candidates, two constraints, and 5 or more implication lines (where each line can be a chain of indeterminate length). Since the program loops back and builds off previous discoveries, it could theoretically build a fish using a fish using a fish etc. Which is equivalent to... well, I'm not really sure, a rather complicated forcing net of some kind I suppose.

I worked a bit on Solver's output, and I get this for the "Explainer cracker" :

Code: Select all: Starting up Solver versus ... 500000009020100070008000300040702000000050000000006010003000800060004020900000005 Report : MakeSubstitutions : 238 candidates, : 244 constraints : result = 22 Report : AddSplitImplications : 230 candidates, : 230 constraints : result = 137 Report : MakeSubstitutions : 230 candidates, : 230 constraints : result = 6 Report : AddSplitImplications : 228 candidates, : 226 constraints : result = 4 Report : AddSplitImplications : 228 candidates, : 226 constraints : result = 2 Report : AddSplitImplications : 228 candidates, : 226 constraints : result = 1 Report : AddSplitImplications : 228 candidates, : 226 constraints : result = 1 Report : AddFishImplications : 228 candidates, : 226 constraints : result = 207 Report : CheckImplications : 228 candidates, : 226 constraints : result = 3 Report : SolveBasics : 228 candidates, : 226 constraints : result = 226 Report : AddSplitImplications : 225 candidates, : 226 constraints : result = 1322 Report : CheckImplications : 225 candidates, : 226 constraints : result = 3 Report : SolveBasics : 225 candidates, : 226 constraints : result = 226 Report : AddSplitImplications : 222 candidates, : 226 constraints : result = 11 Report : AddFishImplications : 222 candidates, : 226 constraints : result = 132 Report : CheckFullImplications : 222 candidates, : 226 constraints : result = 3 Report : SolveBasics : 222 candidates, : 226 constraints : result = 226 Report : MakeSubstitutions : 219 candidates, : 226 constraints : result = 1 Report : AddSplitImplications : 218 candidates, : 226 constraints : result = 550 Report : CheckImplications : 218 candidates, : 226 constraints : result = 1 Report : SolveBasics : 218 candidates, : 226 constraints : result = 226 Report : MakeSubstitutions : 217 candidates, : 226 constraints : result = 13 Report : AddSplitImplications : 210 candidates, : 220 constraints : result = 24 Report : CheckImplications : 210 candidates, : 220 constraints : result = 1 Report : SolveBasics : 210 candidates, : 220 constraints : result = 220 Report : AddFishImplications : 209 candidates, : 220 constraints : result = 527 Report : CheckImplications : 209 candidates, : 220 constraints : result = 3 Report : SolveBasics : 209 candidates, : 220 constraints : result = 220 Report : CheckFullImplications : 206 candidates, : 220 constraints : result = 3 Report : SolveBasics : 206 candidates, : 220 constraints : result = 220 Report : MakeSubstitutions : 203 candidates, : 220 constraints : result = 6 Report : AddSplitImplications : 199 candidates, : 218 constraints : result = 2323 Report : CheckImplications : 199 candidates, : 218 constraints : result = 6 Report : SolveBasics : 199 candidates, : 218 constraints : result = 218 Report : MakeSubstitutions : 193 candidates, : 218 constraints : result = 33 Report : AddSplitImplications : 170 candidates, : 208 constraints : result = 91 Report : MakeSubstitutions : 170 candidates, : 208 constraints : result = 16 Report : AddSplitImplications : 163 candidates, : 199 constraints : result = 2 Report : AddFishImplications : 163 candidates, : 199 constraints : result = 10388 Report : CheckImplications : 163 candidates, : 199 constraints : result = 156 Report : SolveBasics : 163 candidates, : 199 constraints : result = 0 End of story ... 514673289326189574798425361149732658687951432235846917453217896861594723972368145

Solver is making very slow progress, save for two little spikes, until it hits the big break : all of a sudden the fish pattern comes up with 10388 distinct new implications, which when checked immediately solve 156 of the 163 remaining candidates. I can only imagine how complicated the real grid-breaker was to come by : it's the fourth time the whole grid was scanned for fish patterns.

Solver comes up with a correct answer for every puzzle I've tried on your list so far, ravel, except the two previously mentioned. I can't even begin to imagine what kind of complexity the two remaining puzzle are hiding. Here's what comes out of Solver for "Ocean's New Year's present for RW" :

Code: Select all: Starting up Solver versus ... 000001020300040500000600007002000001080090030400000800500002000090030400006700000 Report : MakeSubstitutions : 232 candidates, : 240 constraints : result = 23 Report : AddSplitImplications : 224 candidates, : 225 constraints : result = 142 Report : CheckFullImplications : 224 candidates, : 225 constraints : result = 1 Report : SolveBasics : 224 candidates, : 225 constraints : result = 225 Report : MakeSubstitutions : 223 candidates, : 225 constraints : result = 9 Report : AddSplitImplications : 220 candidates, : 219 constraints : result = 51 Report : MakeSubstitutions : 220 candidates, : 219 constraints : result = 6 Report : AddSplitImplications : 218 candidates, : 215 constraints : result = 20 Report : AddSplitImplications : 218 candidates, : 215 constraints : result = 7 Report : AddFishImplications : 218 candidates, : 215 constraints : result = 269 Report : CheckFullImplications : 218 candidates, : 215 constraints : result = 1 Report : SolveBasics : 218 candidates, : 215 constraints : result = 215 Report : AddSplitImplications : 217 candidates, : 215 constraints : result = 68 Report : MakeSubstitutions : 217 candidates, : 215 constraints : result = 6 Report : AddSplitImplications : 215 candidates, : 211 constraints : result = 69 Report : CheckFullImplications : 215 candidates, : 211 constraints : result = 1 Report : SolveBasics : 215 candidates, : 211 constraints : result = 211 Report : AddSplitImplications : 214 candidates, : 211 constraints : result = 10 Report : AddSplitImplications : 214 candidates, : 211 constraints : result = 14 Report : AddSplitImplications : 214 candidates, : 211 constraints : result = 293 Report : CheckFullImplications : 214 candidates, : 211 constraints : result = 1 Report : SolveBasics : 214 candidates, : 211 constraints : result = 211 Report : AddFishImplications : 213 candidates, : 211 constraints : result = 160 Report : CheckFullImplications : 213 candidates, : 211 constraints : result = 2 Report : SolveBasics : 213 candidates, : 211 constraints : result = 211 Report : AddSplitImplications : 211 candidates, : 211 constraints : result = 53 Report : CheckFullImplications : 211 candidates, : 211 constraints : result = 1 Report : SolveBasics : 211 candidates, : 211 constraints : result = 211 Report : AddSplitImplications : 210 candidates, : 211 constraints : result = 48 Report : AddSplitImplications : 210 candidates, : 211 constraints : result = 26 Report : CheckImplications : 210 candidates, : 211 constraints : result = 1 Report : SolveBasics : 210 candidates, : 211 constraints : result = 211 Report : AddSplitImplications : 209 candidates, : 211 constraints : result = 1 Report : MakeSubstitutions : 209 candidates, : 211 constraints : result = 3 Report : AddFishImplications : 208 candidates, : 209 constraints : result = 24 Report : AddSplitImplications : 208 candidates, : 209 constraints : result = 10 Report : AddFishImplications : 208 candidates, : 209 constraints : result = 14 Report : AddSplitImplications : 208 candidates, : 209 constraints : result = 11 Report : AddFishImplications : 208 candidates, : 209 constraints : result = 21 Report : AddSplitImplications : 208 candidates, : 209 constraints : result = 1 Report : AddFishImplications : 208 candidates, : 209 constraints : result = 26 Report : AddSplitImplications : 208 candidates, : 209 constraints : result = 6 Report : AddSplitImplications : 208 candidates, : 209 constraints : result = 1 Report : AddFishImplications : 208 candidates, : 209 constraints : result = 7 Report : AddSplitImplications : 208 candidates, : 209 constraints : result = 4 Report : AddFishImplications : 208 candidates, : 209 constraints : result = 0 End of story ... 000001020300040500000600007002000001080090030400000800500002000090030400006700000

Solver embarrassingly sputters along until it gives up after trying the fish pattern for the eighth time. Of the 232 candidates left after kick-off, it has only found out 8 in all (substitutions don't count). "dml 1/07" is apparently even harder, as it manages to get even fewer candidates and implications worked out, and gives up after the sixth fishing expedition.

{edit : Incidentally, after fixing the broken back door finder, I did notice dml 1-07 takes the cake in this category, having only 2 whereas Mike's has 3 and Ocean's has a bunch (27).}

According to Solver, your rating system is spot on, since you give these two nasties the best marks so far!

{edit : removed the useless lines with zero results in the output to make it slightly more readable.}

by **Myth Jellies** » Sat Feb 17, 2007 7:04 am

AW, your logic sounds an awful lot like the Pattern Exclusion Rule and some of the algebraic logic I have for POM. http://www.sudoku.org.uk/discus/messages/2/298.html?1144239282

by AW » Sat Feb 24, 2007 5:54 pm

From what I gather, Myth, POM uses much more extended patterns than Solver. It was also suggested elsewhere that I try to work Braid Analysis into the program. Both of these methods are, however, beyond the limits I had set. Basically, Solver is limited to "pair analysis". Very thorough pair analysis.

Solver actually cheats a bit by checking for straightforward locked sets. But since those are well known constructs, and realtively easy to code, I figured it wasn't too much of a leap. From a logical point of view, whereas chains are very simple to express, something as apparently simple as a naked quad turns out to be very complicated.

Managing triplets (as in Braid Analysis) and even bigger tuples (as in POM) isn't allowed in this version. And neither is what I used in Solver's previous incarnation (constraint creation). Incidentally, after having worked through extending the sword-fish-like pattern to a rough jelly-fish form, I realize that it would probably have been a better idea to use this as a simpler, though less general, approach.

There are different ways to analyse sudoku in logical form, and different ways to go about solving a grid. And although these different ways may use different logical patterns, all have the same objective : finding the solution through a series of logical deductions.

For roughly the same reasons, extended contradiction nets were also out of the picture, in so far as they need anything more than pairs. Although Solver uses the "positive" expression : ((P implies Q) and (~P implies Q)) therefore (Q). The corresponding "negative" expression shows the contradiction and is completely equivalent logically (since it just expresses the same thing with the contrapositives) : ((P implies Q) and (P implies ~Q)) therefore (~P).

A complete contradiction net is easy enough to express in logical form (for instance, ~(P and Q and R and S and T)). But Solver was required to arrive at the "net" through a "complete deduction". This was just a very personal requirement, as I find it somewhat unsatisfying to be given some complex proposition without all the premises that lead to it. Mind you, they come down to the same thing in the end. I just find it impossible to "see" the contradiction when it goes beyond two or three candidates (at best). Thus, I gave Solver the additional ability to identify all the pressures that make each complex proposition true, in terms of a combination of two sets of basic propositions : one set containing any number of implications (simple "P implies Q" expressions), and the other set containing any number of constraint expressions (simple "exactly one of these is true" expressions).

The rough jelly-fish form now allows Solver to crack the two remaing hurdles, "dml 1/07" and "Ocean's New Year's Present for RW". In my next post, I'll explain what, to Solver's mind (and, I should think, to anyone's mind

), makes those two puzzles "a cut above the rest".

by AW » Sat Feb 24, 2007 7:02 pm

A little clarification is needed about how Solver deduces things, over and above the well know methods of "straightforward locked sets" and "chaining". As a very simple explanation, Solver just extends "chaining" to be able to cope with constraints covering more than two candidates. Many of you probably know much of what follows already, and may want to skip directly to the "bottom line".

The basic chaining rules require some "conjugate pairs". The main advantage of conjugate pairs is that they are easy to check and work with. You extend chains by collapsing propositions around conjugate pairs. Usually, this works as : given the conjugate pair (P xor Q) and the fact that Q shares a constraint with R (Q can "see" R), we get (P xor Q) and (R implies ~Q) and therefore conclude (R implies P). In this way, we build "paths" linking candidates together well beyond basic constraints.

To make use of these chains, we look for a "path" from one end of a conjugate pair (P) to a given candidate (R), and another "path" from the other end of the pair (Q) to that same candidate (R). These two paths usually show that (P implies ~R) and (Q implies ~R). Since P and Q are a conjugate pair, we know that (P xor Q), and therefore, (~R).

When Solver has exhausted this line of reasoning, he starts checking for paths from constraints that have more than two candidates. This is just an extension of the conjugate pair principle : if aome constraint tells us that (P or Q or R), and we have the paths (P implies ~S) and (Q implies ~S) and (R implies ~S), then we conclude (~S).

When pressed beyond this, Solver stops trying to conclude directly that some candidate is true or false. He starts searching for implications that he hadn't identified yet by checking more complex patterns. The first is an extension of the way conjugate pairs extend chains : given any constraint, say (P or Q or R), if it is known through some path that (S implies ~Q) and through some other path that (S implies ~R), then it is also known that (S implies P).

When this fails, Solver starts doing something that is much like an ALS search, except that he relies on chaining paths rather than just weak constraints. The "sword-fish-like" pattern is much the same as a "3 weak over 2 strong" ALS. This allows him to identify some odd (P implies ~Q) relationships, where P and Q are a pair of candidates that would exclude, through some combination of chaining paths, enough of the 2 "strong" constraints to make solving both impossible. Solver pretty much exhausts any of the possible "3 over 2" constructs (even taking care of cases when the strong constraints have more than three candidates each).

At this point, Solver could apparently master all but the two aforementioned puzzles. What makes these two exceptional is that they require checking for the "jelly-fish-like" pattern. This pattern is roughly the same as the previous one, except that it considers the equivalent of a "4 weak over 3 strong" ALS, using chaining paths. Solver's implementation doesn't work out all possible patterns of this type, but those he does check for are plentiful enough and are hopefully sufficient for any puzzle.

-- Bottom line -----------------------------------------------------------------

Even by iterating (constructing "3 by 2" patterns over each other a few times) Ocean's and dml's masterful grids could not be cracked. (Whereas, other very tough ones like Mike's required quite a few iterations, but eventually did cave in.) They big two seem to require finding at least one net of the "4 over 3" kind. This is really incredible when you think about it. The puzzles cannot be solved through extensive pair-checking, without resorting to looking at relationships involving groups of at least ten candidates (or thereabouts... not entirely sure what the bare minimum is here). That means the nets are spread really, really thin.

I also worked a lot on cleaning up Solver's output. I'll see if I can post some useful information about the hardest puzzles later. Solver does give an interesting description that helps to understand just why the tough ones are so tough.

by **ravel** » Sat Feb 24, 2007 9:42 pm

Are Solver's solutions (principally) the same for scrambled (isomorphic) puzzles (i think it is the case, when all of the possible implications for a level are gathered, before going back to lower levels) ? In this case, this could provide a wonderful new rating system

by AW » Sun Feb 25, 2007 12:43 am

ravel wrote:Are Solver's solutions (principally) the same for scrambled (isomorphic) puzzles (i think it is the case, when all of the possible implications for a level are gathered, before going back to lower levels) ? In this case, this could provide a wonderful new rating system

Very curious myself as to how it will treat isomorphic puzzles. I did correct something this week that should help in this regard. The program no longer marks its deductions as soon as it comes by them; instead it waits for a given search to finish before marking. That way, it no longer takes into account conclusions reached in the same pass. This makes for a more rigorous count of the steps required for the solution.

I would also think that, under a complete enumeration of possibilities, isomorphs should give the same results. But that's just intuition, as I have absolutely no proof of this. I do think that perhaps the "locks", being a positional pattern, might have an effect on this, so I added an option to disable the basic locked set search.

Would you have a few good isomorphs to check this ? I'm not very familiar with isomorph theory

, and I expect, if there is some difference, then some puzzles might show this more clearly than others.

by **ravel** » Sun Feb 25, 2007 2:22 pm

AW wrote:Would you have a few good isomorphs to check this ?

It is very easy to construct isomorphic puzzles. The equivalence operations described by Gordon Royle are:

Permutations of the 9 symbols,
Transposing the matrix (that is, exchanging rows and columns),
Permuting (ie. rearranging) rows within a single block,
Permuting (ie. rearranging) columns within a single block,
Permuting the blocks row-wise,
Permuting the blocks column-wise.

Blocks are bands or stacks, ie 3 boxes in a line (horizontally or vertically).

For your purpose most interesting might be to mirror the puzzle at the '/' diagonal and renumber the digits 123456789 to 987654321, because i suppose, you are scanning it from top left to right bottom and from 1 to 9.

by AW » Mon Feb 26, 2007 5:01 am

I tried to morph some puzzles as you suggested (diagonal and renumbering). Seems to work. I get exactly the same solution path so far. Obviously, the candidate references are different, but all the rest is identicial. At least for these three pairs :

Code: Select all: 000001020300040500000600007002000001080090030400000800500002000090030400006700000 Ocean's RW 000009300000070008060200050008000009070010060300000400400008000010020000005600070 Ocean's RW (morphed) 003000009400000020080600100200004000090800007005030000000900800000005030070010006 dml 1/07 400030001070000080002000900050006000900700000001020400000500007300010200000008060 dml 1/07 (morphed) 500000009020100070008000300040702000000050000000006010003000800060004020900000005 Explainer cracking 500000001080900030002000700060408000000050000000003090007000200040006080100000005 Explainer cracking (morphed)

As for using a similar solving method as a basis for a rating system, I suppose it could be helpful. Mind you, Solver is not very fast, when it has to repeatedly check for the sword-fish patterns. But then, I'm sure it could be optimized some, and having to run it only once probably makes up for everything else.

There would certainly be a good bit of work involved, though, if has to be applied to less difficult puzzles. Currently, being built for "simplicity", it doesn't make many of the usual pattern distinctions. It treats chains of all kinds in the same step, on the same level of "difficulty" (y-wings, "coloring", "super-coloring", etc.). And it considers simple "line-box reductions" as "full constraint exclusions", which it tries only after chaining fails.

One can definitely draw some good statistics from its solution path, though. Getting a full dump of all deductions would be pointless, seeing as it identifies thousands of implications for any puzzle. However, there's plenty to see just by looking at the operations and the number of results found on each pass.

Here are some agregated stats from the "dml 1/07" run :

Code: Select all: Operation Calls GoodCalls Deductions Implications Avg/Call Avg/GoodCall Givens 1 1 21 0 21 21 SetupCandidateTables 1 1 0 1259 1259 1259 SolveBasics 192 3 553 0 2.88 184.33 UpdateImplications 191 106 0 10773 56.4 101.63 CheckImplications 191 14 42 0 0.22 3 MakeSubstitutions 177 14 63 0 0.36 4.5 FullExclusionTrue 163 21 50 0 0.31 2.38 NearExclusionTrue 142 102 0 1877 13.22 18.4 NearExclusionFalse 40 1 0 1 0.03 1 SplitTrueTrue 39 21 0 198 5.08 9.43 SwordFishTrueTrue 18 17 0 519 28.83 30.53 JellyFishTrueTrue 1 1 0 218 218 218

And from the "Ocean's... RW" run :

Code: Select all: Operation Calls GoodCalls Deductions Implications Avg/Call Avg/GoodCall Givens 1 1 21 0 21 21 SetupCandidateTables 1 1 0 1287 1287 1287 SolveBasics 138 5 597 0 4.33 119.4 UpdateImplications 137 64 0 8035 58.65 125.55 CheckImplications 137 14 44 0 0.32 3.14 MakeSubstitutions 123 16 33 0 0.27 2.06 FullExclusionTrue 106 14 34 0 0.32 2.43 NearExclusionTrue 92 60 0 1400 15.22 23.33 SplitTrueTrue 32 18 0 255 7.97 14.17 SwordFishTrueTrue 14 13 0 374 26.71 28.77 JellyFishTrueTrue 1 1 0 167 167 167

And from the "Explainer cracker" :

Code: Select all: Operation Calls GoodCalls Deductions Implications Avg/Call Avg/GoodCall Givens 1 1 20 0 20 20 SetupCandidateTables 1 1 0 1317 1317 1317 SolveBasics 67 3 534 0 7.97 178 UpdateImplications 66 25 0 8252 125.03 330.08 CheckImplications 66 6 10 0 0.15 1.67 MakeSubstitutions 60 11 115 0 1.92 10.45 FullExclusionTrue 48 11 50 0 1.04 4.55 NearExclusionTrue 37 23 0 1384 37.41 60.17 SplitTrueTrue 14 10 0 349 24.93 34.9 SwordFishTrueTrue 4 4 0 489 122.25 122.25

Solver agrees with your assesment that the "Explainer cracker" should be harder if one is relying on chaining, as he only got 10 chain hits in all (CheckImplications line), whereas there were over 40 in the others. But it has an unusually large number of substitutions, which may come from a bunch incomplete, short links.

The agregate numbers only tell part of the story. The history of calls is very interesting too. After jelly-fishing, Solver breaks "Ocean's... RW" down quickly by simple operations. However, for "dml 1/07", even after the jelly-fish pass, there are a few more sword-fishing steps before completing. Fortunately it never goes jelly-fishing twice for either puzzle. And "dml 1/07" gets points for originality because it has a hit with "NearExclusionFalse" (Being a sucker for "competion", I left a few routines hanging around in the code, that are rarely if ever useful; I was amused to see one crop up here). There's also the fact that agregate numbers are inflated at the end of the solution. There are always large number of deductions and implications once the puzzle has "caved in".

The New Sudoku Players' Forum

Looking for inspiration on implication

Looking for inspiration on implication