Hard puzzle in Eppstein's paper

Advanced methods and approaches for solving Sudoku puzzles

Hard puzzle in Eppstein's paper

Postby Max Beran » Thu Aug 25, 2005 10:50 pm

In the "Exercise on xyz-wing and xy-chain" topic Scott H draws attention to David Eppstein’s paper “ Nonrepetitive Paths and Cycles in Graphs with Application to Sudoku “. In Figure 11 the author shows "one of the more difficult" puzzles. Nothing daunted, I thought I’d give it a go and succeded only in reaching the following point


after which there is quite a lot of tidying up possible through couples, locked rows and columns such that the cell with the highest number of candidates is r2c1 with 1, 2, 4, 7 and 9. But finally I reached an impasse. I’ve filtered for x-wings, swordfishes and xy chains, searched for hidden triples and quads but nothing.

According to the paper it is solvable using the techniques given there (but in very mathematical terms so not for ornery mortals). However it does imply that trial and error is not one of their techniques.

So is there anyone out there who can see how to solve this puzzle? Presumably Eppstein's so-called repetitive and non-repetitve cyles might play a part but their operation seems horrendous - are they part of the serious player's toolbag?
Max Beran
Posts: 57
Joined: 17 August 2005

Postby Scott H » Fri Aug 26, 2005 12:29 am

I've searched by hand (and a couple of of annotated grids) for awhile, but haven't seen anything yet. I trust there are forcing chains that crack this one. If they're pure chains of one of Eppstein's two flavors, they're very long. I suspect the cracking chain will be some combination of his two flavors (which he doesn't discuss much), and it still must be fairly long.

By the way, short "nonrepetive cycles" include xy-wings, which should be part of every ambitious solver's repertoire. And short "repetitive cycles" are a simple generalization of naked and hidden groups that's not restricted to a single row/column/box, which should also be in the basic repertoire (I'm not sure if it is yet). So Eppstein is just talking about certain forcing chains, wrapped in some math and algorithms terminology.
Scott H
Posts: 73
Joined: 28 July 2005

Postby Max Beran » Fri Aug 26, 2005 1:04 am

Except that the chains I've met and used up to now have all involved pairs whereas Eppstein's non-repetitive cycles can include cells with more than 2 candidates - the sole criterion is that there must be no more than 2 of the digit in a domain (like x-wings). I've had a look for them armed with the bilocation graph which starts looking like Spaghetti but slowly entire chunks are removed as they are dead ends or else adjacent edges have the same label.

Interestingly Angus' program handles the repetitive chain example using his "Colours" rule which to be honest I haven't really got to grips with (not sure if it's another name for the Jeff's xy-chain that I do understand.
Max Beran
Posts: 57
Joined: 17 August 2005

Postby tso » Fri Aug 26, 2005 2:47 am

The paper describes graphs that will solve puzzles but no *procedures* for finding these graphs. Though they talk about human solving techniques, it seems more aimed at programmers.

BIVALUE graphs -- which are what xy-type forcing chains are -- *are* often fairly straightforward to construct -- just draw edges between every two 2-candidate cells that share at least one cadidate that are in the same row, box or column. Once labeled, they often show a solution, or at least clearly that this method will not work in a particular case. I tried it for the given puzzle, and this method is no help *except* to prove that an xy typ forcing chain does not exist.

Depending on the puzzle, simply drawing the connections and labeling them may be easiest, in other cases, identifying the binary groups as I've described in other posts may be much simpler. In this case, binary groups didn't help at all.

The procedures for finding bivalue graphs aren't perfect, but they aren't that difficult and are fairly useful. I find far more straightforward than finding things like swordfish and hidden trips.


Most of the the paper is about BILOCATION graphs. These can use cells with ANY number of candidates. Cells are connected if they are the only two in a column, row or box that a particular candidate can go. As far as I can tell, it would be very, very tedious to create a complete bilocation graph of most puzzles. I don't even want to try for this one. Even if you cheat and use Simple Sudoku to filter, first the 1's, then the 2's, etc -- and then what? You'll have a rats nest of labled edges. We need a procedure to guide the solver in the logical creation of a *partial* graph. I mean, forget the last puzzle, see if you find it easy to find the bilocation graph for the example 7:

Code: Select all

{24}  {24}  {3}   {1}   {8}   {6}   {7}   {9}   {5}   
{5}   {9}   {6}   {4}   {7}   {2}   {3}   {1}   {8}   
{1}   {8}   {7}   {9}   {5}   {3}   {6}   {4}   {2}   
{9}   {7}   {58}  {3}   {2}   {1}   {58}  {6}   {4}   
{24}  {1}   {245} {8}   {6}   {49}  {59}  {3}   {7}   
{3}   {6}   {48}  {5}   {49}  {7}   {89}  {2}   {1}   
{6}   {34}  {1}   {7}   {49}  {5}   {2}   {8}   {39} 
{7}   {23}  {29}  {6}   {1}   {8}   {4}   {5}   {39} 
{8}   {5}   {49}  {2}   {3}   {49}  {1}   {7}   {6}

Does the graph jump out at you? It doesn't for me -- and I know where it is.

The paper describes the treasure -- now we need the map.
Posts: 798
Joined: 22 June 2005

Postby Scott H » Fri Aug 26, 2005 3:32 am

tso, I agree with you partly. Eppstein takes two steps toward human procedures for finding chains (and I'd say you've taken 3 steps, the last being the biggest).

Eppstein's two steps (both significant before you know to do them are):
1. identify a relevant or interesting set of edges to search, and
2. identify local rules for telling when two adjacent edges can be part of a forcing chain.

I'd say step 2 (which I generalize significantly here) is the key idea in Eppstein's paper, and it's a useful advance.

You're right that the the bilocation (strong edge) graph is a rats nest compared to the bivalue (xy) graph. That's because there are many more strong edges than xy edges. Harder to see and search, but more possibilities for deductions too.

Eppstein delegates step 3 (procedures for searching the graphs) to known graph algorithms. You and I are interested in a manual step 3.

I'll describe my attempt at a manual step 3 for strong edge graphs. Because the graph is so thick, I don't draw full edges between nodes. Instead I draw ticks at each edge end pointing to the other end of the edge. Since edges connect on candidates, I put the ticks on candidates, not just anywhere in the cell. Putting ticks on the candidates also shows the edge label, so it's easy to see legal pairs of inbound and outbound edges at each node.

My first attempt was to print a puzzle showing pencil marks. But I found it was still too cluttered, because I needed more space around the pencil marks for the ticks. So next I printed a large grid with blanks in unsolved cells, and manually transcribed into the grid only the most interesting candidates (usually just two per node). Since most ticks are horizontal or vertical, it helps to write the two candidates diagonally in a cell. I still use the first grid with pencil marks printed. That's where I note all the strong edge candidates (and I use filtering software to see those).

So my experience suggests that the bilocation (strong edge) graph is manually visualizable and searchable with a bit of effort and ingenuity. This puzzle is still hard though!:)
Last edited by Scott H on Fri Aug 26, 2005 3:25 pm, edited 1 time in total.
Scott H
Posts: 73
Joined: 28 July 2005

Postby 737 driver » Fri Aug 26, 2005 3:48 am

I've been playing with a spreadsheet to assist with sodokus, and anything wildly complex is way beyond what I can put into it, although it recognizes obvious solutions (naked singles & hidden singles) and can point out to me naked & hidden pairs & triples, although not entirely elegantly. One nice thing about it, though, is that with the simple methods, it iterates about 4 cycles deep, showing the consequences of the consequences of the consequences of your last move... and if one of those consequences is that a cell has zero legal moves left, you can see that immediately.

SO... it took me not all that long to solve the puzzle presented here, although at several points I was pretty clearly using what looks about like Nishio... if a cel has 1,2,3, and 4 as remaining candidates, plug in 1 & see if the spreadsheet flags any errors; if it does, I've eliminated 1 as a candidate, and if it doesn't, then I've haven't. Likewise with 2,3, and 4. It doesn't take me very many guesses to start narrowing things down to where I've got enough information to take over again with the usual methods and get a solution.

So, my (rhetorical?) question is this: if you're getting burried in forcing chains that are complex beyond mortal understanding AND require computations by the truckload, is that really any better than eliminating a candidates with a few well-placed guesses to see if they yield contradictions? I certainly don't like the thought of "solving" a puzzle entirely by guessing at it until I stumble upon the right answer, but looking at the immediate consequences of a few guesses at the point other methods offer nothing, seems more economical in both my time and the computing power required than some of the more inscrutable methods that have been offered. Any thoughts?

P.S. Thanks to all for many great discussions on this forum -- most educational to read!
737 driver
Posts: 4
Joined: 30 July 2005

Postby Scott H » Fri Aug 26, 2005 6:57 am

737, your spreadsheet sounds like a fun tool. There's no question a little trial and error is often more efficient, for both humans and computers, than some logic approaches. I think the tantalizing (for some) challenge is to understand the logic that gets the same result, then see if you can apply the logic in other instances without blind trail and error.

tso, when you say the bilocation chain in your simpler example above is hard, is that a trick question (the graph is actually a bit simpler than the bivalue graph)? Or is your point that the bilocation graph is hard to see for humans just looking at a sudoku? If the latter, you're certainly right. Practically speaking you need filtering software and a printed grid to mark up to see the bilocation graph.
Scott H
Posts: 73
Joined: 28 July 2005

It took me 5 hours..

Postby Big Blue » Fri Aug 26, 2005 7:48 am

...but eventually I have solved it by hand applying each of the three bilocation-graph methods exactly once. E.g. it took me ages to find the correct path to apply the conflicting paths rule (eventually I found it, it went over the whole Sudoku).
The rest was possible with standard advanced techniques.

The problem is that I am absolutely not used to this method, so when I drew all the bilocation edges and vertices I had a maze rather than a clear collection of graphs - the problem in this particular puzzle is not the lack of bilocation vertices, but their abundance. Also, I did not spot all the crucial bilocation vertices at once. Once I finally saw the three crucial paths I was very pleased with the puzzle:)

I think with a bit of practice one can certainly do better - I wasted e.g. nearly 3 hours looking for the wrong kind of graphs and getting contradictions, because I forgot that the paper explicitly states that each number may appear only once (with the exception of one instance) in order for the rules to apply. It is obvious, of course, but while being lost in the maze it wasn't...

Example where it works: if you have a closed path with labels 77486 then this path is useful and you know where to place the 7.

Example where it does NOT work: if you have, say, 5522..., then, whatever ... is, you cannot use such a path to deduce anything useful.

In practice I think it is advisable to draw the bilocation graph as late as possible, i.e., to exploit all other methods first (at least if you solve by hand; for a programme it might be more effective to start with the bilocation graph from the very beginning).

So in conclusion, I think the bilocation-graph rules are very powerful and will help to solve puzzles where otherwise you would have to guess - which is certainly very satisfying.

P.S.: The "bivalue graph" rules seem to be well-known to all advanced Sudoku players - they include x-wing, swordfish and forcing chains. But as far as I know the bilocation graph methods are rarely implemented in construction of Sudokus, solving by hand or even solving by computer programmes - so there is certainly a lot of unused potential to be tapped.
Big Blue
Posts: 28
Joined: 01 August 2005

Postby Max Beran » Fri Aug 26, 2005 10:12 am

Actually the bilocation diagram was not as totally horrendous as Scott H and Tso are making out. As I suggested in my email, I did do it (greatly assisted by Angus filtering), and certainly it began like a snakes and ladders board gone crazy. But whole chunks quickly dropped out due to dead ends and identically labelled edges to the same vertex.

While the relaxation from bivalue to bilocation increases the initial number of edges, the requirement for a closed cycle more than compensates. So the problem really boils down to a sensible approach to drawing in candidate edges in such a way as you can erase them and colour them up if they look halfway hopeful. I made liberal use of highlight pens.

Of course if the puzzle is solvable by Eppstein's published rules and neither bivalue nor bilocation work then that leaves his third rule which seems to be something of a catch-all for trial and error by another name. If anything it sounded like Nishio (sensu SadMan). But wiser heads contributing to this thread may elucidate.

As to the trial and error solution, this follows immediately and completely by making the right choice at r9c3 (which incidentally is a total irrelevance as part of any of the chains or cycles considered above).
Max Beran
Posts: 57
Joined: 17 August 2005

Postby Max Beran » Fri Aug 26, 2005 10:54 am

Sorry folks - it looks like you'd already addressed the substantive points from my last posting which I submitted before noticing the new stuff. Thanks especially to you Scott H for suggesting a much better way for identifying cycles on the ground as it were.

When you say, Big Blue, that you've solved the puzzle, are you referring to Tso's or the original one from Eppstein? If the latter , how about a little hint, eg which type of cycle you did first (presumably bilocation seeing as Tso confirms my impression that there aren't any bivalue ones) and just one cell on each of the cycles? Having written this surely a bivalue cycle is a subset of a bilocation one so that is the one to eliminate first.
Max Beran
Posts: 57
Joined: 17 August 2005

Weak hints

Postby Big Blue » Fri Aug 26, 2005 11:10 am

@Max Beran: I was referring to Eppstein's original puzzle.

I first used a closed cycle that determined the number in the first cell in box two, if I remember it correctly. This was sufficiently easy, at least by comparison to the other two path-searches (although, as I said, it took me still some time to find it, but I attribute this to the novelty of the method for me, and not to its "objective difficulty", if there is such a thing).


Another hint is that at one instance I had to invoke the conflicting path rule (that was the path that went around the whole Sudoku). Both ends of the conflicting paths had a square number as label (I am not saying which one, in order not to spoil the fun too much).

Finally, it should be noted that if you have different paths reaching from one vertex to another you are of course free to choose which one to go. This applies in particular to paths with two labels on an edge.

P.S.: All of the above refers to paths in the bilocation graph. I have not used the bivalue graph explicitly because this just corresponds to "well-known" advanced methods, IMHO.
Last edited by Big Blue on Tue Aug 30, 2005 4:13 am, edited 1 time in total.
Big Blue
Posts: 28
Joined: 01 August 2005

Postby Jeff » Fri Aug 26, 2005 8:11 pm

post cancelled
Last edited by Jeff on Sat Aug 27, 2005 7:03 am, edited 1 time in total.
Posts: 708
Joined: 01 August 2005

Postby angusj » Fri Aug 26, 2005 10:53 pm

Jeff wrote:Scott, Is this simplified notation clear enough. It's shorter to type. Please comment.

Personally I find it too brief to follow the logic.
Also, in your image above, why isn't the edge between cells r4c1 & r4c3 labeled 3 instead of 7?
Posts: 306
Joined: 12 June 2005

Postby Max Beran » Sat Aug 27, 2005 2:16 am

Big Blue (nice kind generous Big Blue)

So presumably I should be looking particularly at edges occupying r1, c4 or box 2 in order to perform an elimination in r1c4. Presumably also I should be looking for a cycle with a pattern of edges of the form ...a4a... or ...a7a.... as per Eppstein's paper. Well I've drawn diagrams galore and been back to the filter in Angus' program time and time again to check if I've missed anything. There's a 2 and a 7 edge in r1 and a 6, 7 and 8 labelled edges in c4, and a 7, 8 and 9 in box 2 but none have matching edges at their ends.

In fact I haven't found any edges of the required patterns anywhere on the diagram other than the trivial ones connecting naked doubles in c5 and r5.

So any chance of another hint?
Max Beran
Posts: 57
Joined: 17 August 2005

Postby Jeff » Sat Aug 27, 2005 2:28 am

post cancelled
Last edited by Jeff on Sat Aug 27, 2005 7:04 am, edited 1 time in total.
Posts: 708
Joined: 01 August 2005


Return to Advanced solving techniques