The New Sudoku Players' Forum

by **Mathimagics** » Fri Aug 19, 2016 12:18 am

I've come up with a reasonably reliable rating system for Kakuro puzzles. It gives a profile of a puzzle with some vital statistics.

Here's an example for an ATK "Easy" puzzle:

Code: Select all: PID: ATK_E9831 (9 x 9) MRL = 8 ACRL = 8 NCELL = 52 (81%) fixed = 20 (USI) implied = 32 (domain test) total = 52 (100%) Rating = 1.0000 (avg cell NPV)

MRL is the maximum run length

ACRL is the average cell run length (for each cell the lengths of the horizontal and vertical run are added together)

NCELL is the number of white cells (with % of total cells). A 9x9 puzzle (on my scale) has 64 interior cells, so 52 is approx 81% of that)

fixed: this number indicates the number of cells whose values are fixed by way of having a unique sum intersection (USI). Thus a 4-cell run with sum 30 (S30, L4) intersecting with an (S9, L3) fixes a value of 6 at the intersection.
This number also includes cells for which fixed values in other cells leave just one free cell in any run, which means that cell can also be fixed.

implied: when all the fixed values have all been identified, we are left with multiple choices for the remaining cells. Some of these choices which might have been valid at the beginning can be eliminated by a simple process. Suppose D is an option for a given cell - is it still possible to form the corresponding H and V sums with a D in this position? The answer is "no" surprisingly often, it's just that computers are generally better at it. Any cells which become fixed after this domain shaving is repeated as often as possible are implied values.

Rating: this score is a simple average taken on the number of possible values (NPV) for each cell at the end of domain shaving. If we have fixed + implied = NCELLS (100%) then this value will be 1.

Here is a comparison of some samples of the 3 puzzle grades used at ATK (Easy, Medium, and Hard):

Code: Select all: (ATK samples) E1 E2 M1 M2 H1 H2 H3 Size = 9 11 13 10 14 13 13 NCELL = 52 82 104 63 139 116 118 fixed = 20 33 25 5 23 2 8 implied = 32 49 79 58 78 105 76 Rating = 1.0 1.0 1.0 1.0 1.7 1.15 1.45

ATK's "Easy" puzzles tend to have smaller grids together with a high number of fixed cells, which are then easily finished off by the domain shaving process. "Medium" puzzles have larger grids and/or less fixed values. The domain shaving is typically a bit more involved, but normally is sufficient to complete the puzzle without T&E. "Hard"puzzles have larger grids, less fixed values, and the simple domain shaving does not lead to completion, so the rating is typically > 1.

This rating method doesn't attempt to take into account available strategies such as hidden/naked pairs, implied sums etc., all of which can be used to further reduce cell domains. It does give a reasonable guide to the relative computational complexity of the puzzle.

The rating value of 1 does not alone make a puzzle easy (or even medium). The size of the grid has a major impact on the difficulty of doing the elementary domain-shaving. Consider this example from Conceptis:

Code: Select all: PID: CB049 (24 x 14) MRL = 9 ACRL = 11 NCELL = 251 (84%) fixed = 38 (USI) implied = 213 (domain test) total = 251 (100%) Rating = 1.0000 (avg cell NPV)

Despite the 1 rating, there are some key indicators that this is not an easy puzzle for the human solver. With just 38 fixed cells, that leaves 213 cells whose implied values have yet to be deduced, and which are less likely to be obvious to you, even if your computer has no problems with it.

In fact this puzzle is taken from the Conceptis "Absolutely Nasty Kakuro - Level 4" book, and all of these puzzles are generally acknowledged as in the "Very Hard" category (to which I can also attest). So large grids + small % of fixed cells usually indicates a challenge.

The usefulness of such a rating/profiling system is of course less for puzzle-solvers than it is for puzzle-generators. Kakuro puzzle generation is a tricky business, and this rating system is a reasonable predictor of just where on the difficulty scale any generated puzzle is likely to fall.

Still, if anyone would like a profile done on any puzzle, just point me to it and I'll oblige.

by **Mathimagics** » Fri Aug 19, 2016 12:38 am

A rating of 1 doesn't necessarily mean easy, as demonstrated above. But typically ratings will be < 2 for even the hardest puzzles one is likely to see in published form. My rating value tends to be like the Richter scale, ie. a 2 is likely to be 4 times as hard as a 1 on the same grid layout.

With this in mind, I present what is possibly the most diabolical Kakuro puzzle yet devised.

: Kakuro_Rating_Example.jpg (49.17 KiB) Viewed 10081 times

Here are the profiles for these 2 puzzles:

Code: Select all: PID: KT0707_MF_Easy (7 x 7) MRL = 6 ACRL = 10 NCELL = 34 (94%) fixed = 14 (USI) implied = 20 (domain test) total = 34 (100%) Rating = 1.0000 (avg cell NPV) PID: KT0707_MF_Hard (7 x 7) MRL = 6 ACRL = 10 NCELL = 34 (95%) fixed = 0 (USI) implied = 0 (domain test) total = 0 (0%) Rating = 6.2059 (avg NPV)

As you can see, the one on the left is a doddle, while the one on the right is, for all practical purposes, impossible. And yet it is a genuine puzzle with a unique solution. It did take several hours (days in fact) of CPU time to find.

A rating of 6.2, it just staggers the imagination!

And that's just a small grid - I can well imagine even more horrendous cases on larger grids. The point here is that there is no such thing as the "World's Hardest Kakuro Puzzle"!

By the way, this grid layout is, I believe, the absolute maximum density (NCELL) for its size. It has the bare minimum of black cells (well, I show them as grey, I do believe in saving toner!) that allow uniqueness of puzzle solutions.

by **joemcm** » Tue Feb 12, 2019 10:33 pm

Saw your post. Frankly, the math was over my head. But, I enjoy a challenge and saw your "diabolical". Worked it manually. Took a day working on and off. Although much of the work was trial and error, I was able to limit the possibilities with deduction, solving the lower half. That lead to more eliminations and the upper half. I'm sure you have the solution.
241735
169523
327 16
98 567
496378
571649
Regards

by **Mathimagics** » Fri Feb 22, 2019 3:56 am

joemcm wrote: I enjoy a challenge and saw your "diabolical".

Hi there!

It is not often we have visitors here in the Kakuro area, as you can see!

Well done for solving this by hand - and demonstrating that perhaps I should think carefully before applying labels like "impossible"!

If you can give some description of what particular deduction let you complete the puzzle, that would be nice.

Cheers!

by **morl** » Fri Feb 22, 2019 10:15 am

Ok, I got it, but wouldn't a puzzle that requires surface sums, like this one, get a rating <1?

Code: Select all: 00\00 05\00 09\00 00\00 00\00 00\11 ----- ----- 11\00 15\00 00\20 ----- ----- ----- ----- 00\00 00\00 00\09 ----- -----

by **Mathimagics** » Fri Feb 22, 2019 12:54 pm

Hi morl,

I had to search around a bit to find out what "surface sums" are:

denis_berthier wrote:"Surface sum" is a name I found on several Kakuro websites. I adopted it in my book and here, because it seems quite appropriate: it deals with the difference between the horizontal and the vertical sums of the white cells making a surface separated from the rest of the puzzle by one (or more) white cell(s).

Well, I would beg to differ with the word "appropriate" here! Surface doesn't make any sense at all, perhaps "area" or "region" is what he meant. You look for regions that are connected to the rest of the grid via a single cell, or maybe 2 adjacent cells, then you can use H/V sums to determine the value of the cell (single-cell disconnection case) or the sum of the cells (for a pair).

Google search "Kakuro surface sums" seems to find references only in this forum, and one for DB's book, so where are the "several Kakuro websites" he mentions?

Anyway, that's neither here nor there! Now that we have sorted that out, my comments on your little example are these:

(1) Why do say it requires "surface sums"? One can easily see that each cell has only 2 possible values, and indeed that each 2x2 area can be solved in 2 ways, so this example is not a "valid" puzzle, as it has 4 solutions.

(2) Having multiple solutions doesn't prevent the rating method from being applied. This method (see the definition above) uses a fixed set of "Simple Solving Techniques". It can't give a rating < 1 by definition, since each fixed/implied cell (we can call these "singles") has NPV = 1, and each unsolved cell at the end of the "singles" detection loop, has NPV of 2 or more. We simply take the average, which must be 1 or more. In your example, there are no singles, all cells have NPV = 2, and so the rating is 2.

Cheers
MM

by **joemcm** » Sat Feb 23, 2019 6:34 pm

Start with the obvious. Da:Db = 89 or 98. If Da = 8, col a = 1,2,3,4,6,8. If Da =9, col a = 1,2,3,4,5,9.
Row E is either 2,5,6,7,8,9 or 3,4,6,7,8,9.
The sum of rows DEF is 104. This requires the max values for all columns. Therefore, Da:Fa = 4,5,9 or 4,6,8 (both 18 total). Db:Fb = 7,8,9. Ec:Fc sum is 7. Dd: Fd sum is 14. De:Fe = 4,5,8 or 4,6,7 (both total 17). Df:Ff = 7,8,9.
I used the first two statements above to create four combinations. I then used T&E to test if De:Fe was 4,5,8. As 4s and 5s are common solo values for Ea and Fa, many possibilities are dismissed. It is also helpful that both 4,5,8 and 4,6,7 include a 4. There are other shortcuts that become obvious. The result is that it is not 4,5,8 and therefore must be 4,6,7.
So, the T&E begins again adding 4,6,7 as candidates. Within each combination I looked for values that could only be found in two cells. It progressed fairly quickly. Certain possibilities were eliminated because they did not produce unique solutions. For what ever reason, no tests resulted inconclusively. I suppose it is because none of the cells in the top three rows have any influence. After some grunt work you find the correct values
98_567
496378
571649
Then it's off to solving the top 3 rows. Row A = 1,2,3,4,5,7. Col a = 1,2,3. Col e = 1,2,3. Col f = 3,5,6. A few options get eliminated in the crossfire and Ce:Cf = 16 or 25. Col b is 1,5,6 or 2,4,6 or 3,4,5. Since either Ae or Be = 3 and either Af or BF = 3, Aa nor Ba can be 3. Therefore Ca = 3. More T&E. Trying 1,5,6 for col b has the advantage of no 4. This makes Ac:Ad = 47 or 74. Unfortunately, neither works. 642 is a good next choice as the 6 can only be Bb. Of the options for the remaining 2 and 4, one yields the correct answer. So, the top 3 rows are
241735
169523
327_16
And, that's it.
I don't like puzzles that require T&E. I much prefer using logic. But, this one left me no options. Logic still greatly reduced the number of options to try. Hope this is clear enough. I couldn't find a way to attach an image and I don't store images on the web.

by **joemcm** » Sat Feb 23, 2019 8:24 pm

I am attempting to upload an image. Hope it works.

by **Mathimagics** » Sat Feb 23, 2019 9:23 pm

joemcm wrote:Hope this is clear enough. I couldn't find a way to attach an image and I don't store images on the web.

It looks like you found the "Upload attachment" function!

Thank you very much for that walkthrough, I will examine it later, when I have a little spare time.

joemcm wrote:I don't like puzzles that require T&E. I much prefer using logic. But, this one left me no options. Logic still greatly reduced the number of options to try.

Indeed, I agree entirely! Most Kakuro's, especially on larger grids, are challenging enough without any need for T&E. On 24x16 grids, Conceptis's hardest puzzles are in fact mostly "singles only" according to my solver (rating 1!!), but they take hours to complete.

The "Diabolical 6x6" puzzle that you solved was never intended for P&P solvers, it came out of my research into the computational complexity of Kakuro's. Kakuro puzzles have far greater potential for extreme difficulty, even for solver programs, than can be found with Sudoku puzzles. And of course if it's true for computer-solvers, it's also true for human solvers, as a rule.

In an earlier thread (see here) I give some "diabolical" examples that were particularly challenging for my solver, some of which required several hours of CPU time before I introduced a "look-ahead" capability. And also consider that, as far as I can tell, nobody has come up with a faster solver than mine (this is not a boast, I have searched in vain long and hard for any trace of a competitive solver). No luck with that …

I was wrong to call this puzzle "impossible" for P&P solvers, but it does demonstrate the point - imagine if you will the same kind of puzzle on a 9x9 grid. There are examples of this size in my list on that thread, and I am fairly certain no human in their right mind would seriously consider taking them up. They are almost entirely T&E cases ... and the expected time goes up exponentially as you increase the grid size.

But they are all valid! Only one solution ...

by **joemcm** » Sat Feb 23, 2019 10:42 pm

Actually, once I adopted my 4 primary options, they became manageable. The key that allowed it to be solved was that the last 3 rows had to be maximum values. Then once completed, much of the top was also very limited. Increase those totals a bit and I doubt I could have ever completed it.

by **Smythe Dakota** » Sun Feb 24, 2019 12:00 pm

Mathimagics wrote: .... I had to search around a bit to find out what "surface sums" are:
denis_berthier wrote: .... it deals with the difference between the horizontal and the vertical sums of the white cells making a surface separated from the rest of the puzzle by one (or more) white cell(s).
Well, I would beg to differ with the word "appropriate" here! Surface doesn't make any sense at all ....

I have never understood the phrase "surface sums" either. I prefer to call those things "singularities" if there is a single cell doing the separating, "doubularities" if there are two, etc.

A singularity can also be called a gimme, since it immediately produces the value in the single cell. A doubularity can be either a "sum doubularity" or a "difference doubularity", depending on whether it produces the sum or the difference of the two cells. If the two cells are together, it's almost certainly a sum doubularity, but be careful if the two cells are at opposite corners of the small area being separated! More often, those are difference doubularities, but occasionally they are sum doubularities.

Of course, a tripularity can also be either a "sum tripularity" (producing the sum of the three cells) or a "difference tripularity" (p;roducing the difference between one of the cells and the sum of the other two). Etc.

Bill Smythe

by **Mathimagics** » Sun Feb 24, 2019 1:11 pm

Thank you Bill,

"Singularity" is good, "doubularity" is, well ... Ouch! 8-)

That's just my opinion, but in any case the issue here is not about terminology for particular instances or categories of the "solving technique", but a sensible name for the technique itself. It's not rocket science, but useful. Given a zone (or "region") that can be isolated by removing one or two white cells, then there is a technique by which these cells can be solved (single cell disconnection), or have their relationship determined (sum/diff) which can lead to actual candidate eliminations.

"Disconnection" is something any mathematician can identify with - the white cells in the grid are all connected (unless the grid design is sloppy!). In graph theory a vertex whose removal creates two disconnected graphs is called a critical vertex. There is a direct analogy between vertices and cells in a Kakuro grid (any grid really) and edges correspond to adjacency.

My suggestion is therefore "Disconnection Zones", a strategy that involves identifying these zones that have the disconnection property wrt 1 or 2 cells, and for which we can then describe a method by which the properties of the critical cell(s) are identified (ie taking the horizontal/vertical sums that apply to the zone and doing some basic arithmetic).

DZ can also be demilitarised zone, and that's kind of cute …

I do recall somebody actually describing this in a list of "Kakuro Solving Techniques", AND giving the technique a sensible name, but I can no longer find that link anywhere ...

by **why** » Mon Mar 25, 2024 1:05 pm

Hello! I'm a student from China, and I'm very interested in your code about rating system! Could you please share your code more? I would appreciate it if you can show it!

by **m_b_metcalf** » Tue Mar 26, 2024 8:45 am

why wrote:Hello! I'm a student from China, and I'm very interested in your code about rating system! Could you please share your code more? I would appreciate it if you can show it!

Dear Why,
Welcome to the forum, but please note that, very sadly, Mathimagics is no longer with us.

Mike

The New Sudoku Players' Forum

Kakuro rating system

Kakuro rating system

Small doesn't necessarily mean easy

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system

Re: Kakuro rating system