The New Sudoku Players' Forum

Website · by **denis_berthier** » Thu Jun 04, 2015 5:24 am

Mathimagics wrote:That was my gut feeling. Trouble is my solver (home grown using classical coding, ie. pretty basic) has far more difficuly with P103, so much that I was convinced I had a bug (or failure to adequately shave my domains).

My results do NOT mean in any way that your solver has a bug. Rating depends on the resolution model. If yours is different, then it's quite normal that you get different relative ratings.

Mathimagics wrote:You mentioned g-whips .... please don't. Is that another SAT or CSP package?

All my results are based on my own CSP solving approach, which I call pattern-based. I can't summarise it here, but you can have an idea by skimming my last book, in which there's also a chapter on Kakuro solving.
The software I developed to implement my approach is CSP-Rules. It is a generic "pattern-based" CSP solver. I have generic rules for whips, braids, g-whips, g-braids, Subsets, ..., all the patterns I introduced in my books.
I have implemented specific interfaces for Sudoku, Futoshiki, Kakuro, Numbrix, Hidato, map colouring and Slitherlink. I have also a few application specific rules for each of these applications (indeed, a lot of rules for Slitherlink).
The ratings I mentioned above are universal, in the sense that they are based on resolution rules meaningful and valid in any finite CSP.

Mathimagics wrote: Every SAT or CSP package I try to install (I'm on Windoze) creates a new set of nightmares for me. MiniSAT, Copris/Scala, and the latest thing I'm trying to decide on is Choco....

All these softwares are based on general CSP-solving strategies, the adaptation to CSP problems of DFS or BFS with additional treatment for constraint propagation.

The main conceptual difference with my approach is, mine is "pattern-based", i.e. each elimination step is based on a well-defined and meaningful resolution rule. As I always use my default simplest-first strategy, I also get the "simplest" solution - in the sense of simplest hardest-step.
The practical difference is, general solvers, like DFS, BFS, SAT, ... or yours (probably), are much faster. The problem they solve is "exponentially" simpler than the problem I solve (simultaneous solving + rating). I don't mean my approach is better: the goals are totally different. When you develop a generator, you need a very fast solver. Using my solver within a generator would be very inefficient.

by **Mathimagics** » Thu Jun 04, 2015 6:11 am

Oh, you've got a book, I'm sorry but I live in a dream world - my specialty for the past 5-7 years since retiring has been number theory (especially the generalised Pell equtaion) and developing fast algorithms for multi-precision computation of the standard functions (cos, exp, log, etc).

So I would love to get hold of your chapter on Kakuro, if that's at all possible! Maybe that'd suck me in to buying your book! 8-)

And I'd overlook your misspelling of my handle!

Well, anyway, I've learnt a few things in the week since I began this exercise, and one of them is just what you said :

denis_berthier wrote:When you develop a generator, you need a very fast solver. Using my solver within a generator would be very inefficient.

This became clear when I ran the Copris solver - if the puzzle is well-formed it takes a couple of seconds (99.9999% of that is cranking up the package) but give it an ambiguous definition, and ask it how many solutions there are, it dies in the derriere.

It is ok for simply testing for ambiguity, ie stop at 2, but that's hardly of any use.

It uses a fairly naive "all-different" and "all-addup-to" model, and feeds that into it's generalised puzzle engine. So I can modify the front end, but that's of limited use.

My results do NOT mean in any way that your solver has a bug. Rating depends on the resolution model. If yours is different, then it's quite normal that you get different relative ratings.

No, not a bug, just a sadly inadequate feature! 8-)

Still, early days ....

Website · by **denis_berthier** » Thu Jun 04, 2015 6:19 am

Mathimagics wrote:So I would love to get hold of your chapter on Kakuro, if that's at all possible! Maybe that'd suck me in to buying your book!
And I'd overlook your misspelling of my handle!

Apologies for the misspelling.
My last book is available in print form there: http://www.lulu.com/shop/denis-berthier/pattern-based-constraint-satisfaction-and-logic-puzzles/paperback/product-20516172.html
But you don't have to buy it: I made it fully available in pdf there: http://arxiv.org/abs/1304.1628

Mathimagics wrote:
My results do NOT mean in any way that your solver has a bug. Rating depends on the resolution model. If yours is different, then it's quite normal that you get different relative ratings.

No, not a bug, just a sadly inadequate feature!

I wouldn't say so. You can't hope from a solver that it be at the same time fast (as required by a generator) and able to produce ratings compatible with a pattern-based approach of solving: the two goals are incompatible.

by **Mathimagics** » Thu Jun 04, 2015 6:45 am

Thanks Denis!

Hmmm ... 500 pages, that'll take some digesting!

As far as the desired marriage of incompatible aims is concerned, well the generator and solver will just have to learn to get along! In other words, we'll have to compromise ...

Website · by **denis_berthier** » Fri Jun 05, 2015 6:21 am

Mathimagics wrote:my specialty for the past 5-7 years since retiring has been number theory (especially the generalised Pell equtaion) [...]

Is there any potential application to Kakuro ?

by **Mathimagics** » Fri Jun 05, 2015 7:58 am

Well I guess number theory is always relevant, but the Pell equation, I doubt it!

by **Mathimagics** » Sat Jun 06, 2015 5:49 am

My solver has finally solved P103 !!

After previous runs struggled for hours without finding a solution, it now completes in < 0.15s. What a relief!

I used better shot selection (which cell to attack next), and implemented a fast stack unwind procedure.

The shot selector picks

(1) any move which is implied (ie: mustbe(cell, value) or mustgo(value, cell) by the current state

(2) if no implied moves,it picks a free cell in the block (aka "run") which has the smallest number of free cells, and in which at least one cell has already been set. This tends to focus the solver on those moves which are likely to have the most impact.

When a bad move is detected, the stack unwinds directly to the last move that was not implied, which saves time because it doesn't have to restore the intermediate states.

by **Mathimagics** » Sat Jun 06, 2015 7:35 pm

Armed with an efficient solver, I decided to chance my arm with a generator based on the Markov chain idea.

The idea is simply to start with a known solution (ie. a unique solution, which I call state U), then perturb it minimally until another US is achieved, and so on until some desired is reached (eg. a US significantly different to the original). Intermediate states correspond to ambiguous solutions, aka state A).

My initial testing were somewhat discouraging, since I allowed it to pursue intermediate A states, ie U > A > A .. but I found that from a state A[k] the next state was generally worse.

So I then changed it to disallow intermediate A states, it would reject them, only proceeding when a U[k+1] can be formed by a single perturbation on U[k]. To my astonishment that worked a treat!

I ran it on atk H3712, and within a few minutes had a well-formed puzzle on the same grid, but with over 65% of the solution diferent to the original, you'd hardly recognise it.

Now I appreciate the fact that the puzzles are of unknown complexity, but based on the number of Solver steps required, (ie nodes visited in the search tree), the new puzzles are at generally of harder than the original.

I will include a sample image here. It would be interesting to have your system evaluate them - if you could specify a text format suitable for batch processing, that would be useful. I'll send you my own email address via PM.

by **Mathimagics** » Sat Jun 06, 2015 8:38 pm

For H3712, the average number of retries between each U state was only 3, which suggests that the solution space {U} for any given grid is remarkably well-connected.

Hopefully I can try and form some inferences on common properties of {U} with further experiment, which will assist with the explicit generation problem.

By the way, I recall from the Conceptis puzzles that one distinguishing feature of Level 4 compared to Level 3 was the progression from a general trend towards block-lengths of 7-8 (in L3) to 9 (in L4). This is very noticable when you finish the book of L3 and turn to L4. Block lengths of 7-8 carry so much more information.

by **Mathimagics** » Sat Jun 06, 2015 10:39 pm

For the record, most of the ideas about domain shaving in my Solver were inspired by this paper

Kakuro as a Constraint Problem

It's one of the few useful papers in the public domain (perhaps the only one). His demonstration that most well-formed puzzles can be machine-solved without doing a great deal of searching, given the right sort of constraint propagation and domain shaving. Even without a CSP, I think I've demonstrated that the same technique can be used in a classical programming environment.

Website · by **denis_berthier** » Sun Jun 07, 2015 6:04 am

Mathimagics wrote:a generator based on the Markov chain idea.
The idea is simply to start with a known solution (ie. a unique solution, which I call state U), then perturb it minimally until another US is achieved, and so on until some desired is reached (eg. a US significantly different to the original).
[...]
I ran it on atk H3712, and within a few minutes had a well-formed puzzle on the same grid, but with over 65% of the solution diferent to the original, you'd hardly recognise it.

Can you give some information on what can be changed in U and on the transition probabilities ?

Mathimagics wrote:based on the number of Solver steps required, (ie nodes visited in the search tree), the new puzzles are at generally of harder than the original.

Usoln_Pertn_6847 is harder than the original H3712. Indeed H3712 was in W6, while the new one can't be solved by whips of any length.

Mathimagics wrote:if you could specify a text format suitable for batch processing, that would be useful.

The format I'm currently using was developed because I needed to copy puzzles manually from graphical representations on various websites. In this context, I found it easier (less error prone) to separate horiz and verti data and to add some syntactic sugar (/, //). Moreover, this format allowed easy consistency checks. I display it on several lines for better readability, but in my Lispian syntax, it can be written on a single line as well; and several puzzles can thus be stored in a single file, one per line.
14 is grid size (non-square grids are currently completed by useless black cells but this could easily be changed).
First block of lines is horiz data, second block is verti data.

Usoln_Pertn_6847 is represented as:

Code: Select all: 14 11 . . . 20 . . . 34 . . . . . / 21 . . . 45 . . . . . . . . . / B 16 . . . . . 16 . . . . . B / 25 . . . . . 24 . . . B 5 . . / 18 . . . 45 . . . . . . . . . / 26 . . . . 8 . . 6 . . . B B / 28 . . . . 20 . . . 24 . . . . / B B 23 . . . 6 . . 11 . . . . / 45 . . . . . . . . . 13 . . . / 14 . . B 7 . . . 15 . . . . . / B 17 . . . . . 33 . . . . . B / 45 . . . . . . . . . 13 . . . / 33 . . . . . 7 . . . 14 . . . // 14 . . 16 . . . . 17 . . 15 . . / 39 . . . . . . . 25 . . . . . / 45 . . . . . . . . . 22 . . . / B B 15 . . 29 . . . . 6 . . . / 32 . . . . . B 28 . . . . . . / 12 . . . 9 . . . 23 . . . . B / 16 . . 33 . . . . . . . 4 . . / B 17 . . . . 13 . . . 20 . . . / 35 . . . . . . B 20 . . . . . / 22 . . . 14 . . . . 6 . . B B / 7 . . . 45 . . . . . . . . . / 17 . . . . . 40 . . . . . . . / 11 . . 4 . . 10 . . . . 12 . . //

Website · by **denis_berthier** » Sun Jun 07, 2015 6:17 am

Mathimagics wrote:For the record, most of the ideas about domain shaving in my Solver were inspired by this paper
Kakuro as a Constraint Problem
It's one of the few useful papers in the public domain (perhaps the only one).

It's also the only paper I know. There is much more information available on websites.

Mathimagics wrote:His demonstration that most well-formed puzzles can be machine-solved without doing a great deal of searching, given the right sort of constraint propagation and domain shaving.

Constraint propagation and progressive domain restriction are standard methods in the CSP world.
I don't think his "demonstration" bears on "most well-formed puzzles". He refers to a very small collection of puzzles, many of which come from Nikoli, a website known for having only very easy puzzles.

Mathimagics wrote:Even without a CSP, I think I've demonstrated that the same technique can be used in a classical programming environment.

The general idea of pruning a search tree (which constraint propagation amounts to) is a classical one.

Website · by **denis_berthier** » Sun Jun 07, 2015 6:32 am

I don't know what your generation goals are: trying to produce (hard but still) human-solvable puzzles or trying to produce the hardest puzzles.

One teaching from Sudoku[n] is, the larger the grid size, the harder the puzzles (in the mean).
For Sudoku[9], only one puzzle in about 3,000,000 cannot be solved with only one level of T&E. It means that almost all the puzzles can be solved with whips.
But, as n increases, this proportion not in T&E(1) increases very fast. For Sudoku [25], it is close to 0. It means that, apart from a few puzzles with some special application-specific pattern, almost no puzzle can be solved in T&E(1).
There is no formal definition of what can be solved by a human; but in Sudoku (apart from the above special cases), T&E(1) seems to be an upper boundary.

For Kakuro[14], one is probably closer to the Sudoku[16] or Sudoku[25] case than to the Sudoku[9].

by **Mathimagics** » Sun Jun 07, 2015 6:42 am

I take it that the one line per puzzle format omits the "\" and "\\"? So the line would have exactly 196 tokens?

I'll provide details of the perturbation types and probabilities soon.

Website · by **denis_berthier** » Sun Jun 07, 2015 6:54 am

Mathimagics wrote:I take it that the one line per puzzle format omits the "\" and "\\"? So the line would have exactly 196 tokens?

Indeed, no. There is no difference between 1 line or multi-line format. It was not "\" and "\\" but "/" and "//", but I could change if necessary, or even omit them.
For a Kakuro[n] puzzle, there are currently 1 + 2*(n-1)*(n+1) tokens (the first horizontal fully black line and vertical fully black column are omitted).

The New Sudoku Players' Forum

New Kakuro Generator/Site

Re: Relative Difficuly

Constraint Models

Re: Constraint Models

Relative Difficulty

Re: Constraint Models

Kakuro vs Pell

My Solver vs P103

Markov Chain Generator

Kakuro Properties

Simonis' Paper

Re: Markov Chain Generator

Re: Simonis' Paper

About valid Kakuro puzzles

Kakuro Generator

Re: Kakuro Generator