## Puzzles Usage in a Presentation of a New Rating of Puzzles


### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Thank you, Blue.

So, the anomaly is that involving subsets increases the rating, right?

The second and the third are subsets of the fourth puzzle. They are all solvable by 4 * ([singles +] guess + singles). Fewer clues tend toward a higher rating (as expected?).

The "m" is Monte Carlo, else all listed puzzles should have minimal backdoor of size m, right?
EDIT: Not if only bi-values/bi-locals are used for guessing.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

dobrichev wrote:So, the anomaly is that involving subsets increases the rating, right?

It's the only anomaly that I'm aware of -- I assumed it was what you were asking about.

dobrichev wrote:The "m" is Monte Carlo, else all listed puzzles should have minimal backdoor of size m, right?
EDIT: Not if only bi-values/bi-locals are used for guessing.

You're right about both the connection with backdoor size, and the likely disconnect due to the "bi-xxx" restriction.
These are the actual ratings, though -- not Monte Carlo.
They were for the case where bi-locals are ignored, and the focus is on bi-value cells.

I've just checked, and they were easy enough to rate without resorting to Monte Carlo, for methods involving bi-locals.
With no special filtering, the results are:

```
      bi-values only
--------+---------+--------
95.7187 | 80.7856 | 93.1054 (m = 5,5,5)
20.3089 | 18.5035 | 13.5272
 9.7053 |  8.5848 |  8.1214
 6.7648 |  6.9055 |  6.8470

      bi-locals only
--------+---------+--------
97.9355 | 88.9861 | 87.9484 (m = 4,4,4)
17.6770 | 16.0562 | 12.8515
 9.8759 |  8.6638 |  8.0583
 7.1846 |  7.1060 |  6.6835

   bi-value/bi-local mix
--------+---------+--------
96.8746 | 88.3173 | 88.6827 (m = 4,4,4)
19.0067 | 17.2621 | 13.0576
 9.8419 |  8.6433 |  8.0806
 6.9908 |  7.0335 |  6.7358
```

This isn't as bad as what I was seeing for the puzzles at the top of tknottet's list.
With subsets, the ratings were actually lower using bi-locals. For me, that was a surprise.
(For the 2nd puzzle, the locked candidates rating also dropped when bi-locals were added).
blue

Posts: 702
Joined: 11 March 2013

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

blue,

Thanks for your sample ratings in the ER 7/8 area.

When I looked at some of them, I saw that the rating is not as helpful as I had hoped. Because we are in an area where puzzles can be solved manually, it matters more that a solving step does not need a big network, which can hardly be evaluated without writing it out. An average rating of only 2 does not mean much: such a puzzle can still be much harder than one rated 6.

So I think that tknottet's rating is most interesting for the very hard puzzles. E.g. it can show well how much harder they are than newspaper puzzles (if you restrict it to basic techniques).

Btw., the SER 7.7 puzzle is extremely overrated: SE does not know empty rectangles (or grouped strong links), so it rates the step as Nishio.
eleven

Posts: 1907
Joined: 10 February 2008

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

eleven wrote:btw. an alternative to trying 3-candidate cells would have been to use the strong links (bilocation candidates - one of them must be true) instead (the minimum number of strong links in a puzzle i found was 4).
In fact it seems to be a better tactic, because if one is shown to be true, also other candidates in the cell are eliminated.

Fsss2 solver uses singles and T&E. The trial strategy is bi-value cell, if not found then bi-location candidate, if not found then 3-location candidate, etc. -- as proposed.
Experimentally I changed the precedence so that a bi-location candidate, if found, is tried before a bi-value cell. I tested the 800K hardest puzzles, searching up to the second solution. The execution time increased from 65 to 115 seconds and the number of T&E from 102M to 170M. I can't measure a closer equivalent of "steps".
Having no explanation for why this happens, I measured 17-given puzzles, which are easy to solve. The result is 0.2" and 190K T&E versus 0.3" and 315K T&E. Again a losing strategy.
Does it mean that, to a first approximation, the "pencilmarks" approach is more effective than the "templates" approach?
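To make the two precedence orders concrete, here is a minimal sketch of the selection step only. It is not the Fsss2 code: the `Cands` layout, the `Trial` struct and the names `unitCells`, `findCell`, `findLocation` and `chooseTrial` are assumptions made for illustration, and no propagation or recursion is shown.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical representation (not Fsss2's): cand[cell] holds the digits still
// possible in a cell (bit 0 = digit 1); cells are numbered 0..80 row by row.
using Cands = std::array<std::bitset<9>, 81>;

struct Trial {
    bool isCell = false;      // true: branch on the digits of one cell
    int cell = -1;            // set when isCell
    int digit = -1;           // 0-based digit, set when !isCell
    std::vector<int> places;  // cells where that digit can go, set when !isCell
};

// Cell indices of unit u: 0..8 rows, 9..17 columns, 18..26 boxes.
inline std::array<int, 9> unitCells(int u) {
    std::array<int, 9> cells{};
    for (int i = 0; i < 9; i++) {
        if (u < 9) {
            cells[i] = 9 * u + i;
        } else if (u < 18) {
            cells[i] = 9 * i + (u - 9);
        } else {
            int b = u - 18;
            cells[i] = 9 * (3 * (b / 3) + i / 3) + 3 * (b % 3) + i % 3;
        }
    }
    return cells;
}

// First cell holding exactly n candidates (n = 2: bi-value cell).
inline bool findCell(const Cands& cand, std::size_t n, Trial& out) {
    for (int c = 0; c < 81; c++) {
        if (cand[c].count() == n) {
            out = Trial{};
            out.isCell = true;
            out.cell = c;
            return true;
        }
    }
    return false;
}

// First (unit, digit) where the digit has exactly n places (n = 2: bi-location).
inline bool findLocation(const Cands& cand, std::size_t n, Trial& out) {
    for (int u = 0; u < 27; u++) {
        const std::array<int, 9> cells = unitCells(u);
        for (int d = 0; d < 9; d++) {
            std::vector<int> places;
            for (int c : cells)
                if (cand[c][d]) places.push_back(c);
            if (places.size() == n) {
                out = Trial{};
                out.digit = d;
                out.places = places;
                return true;
            }
        }
    }
    return false;
}

// biLocalFirst = false: bi-value cell, then bi-location, then 3-location, ...
// biLocalFirst = true : a bi-location candidate is tried before a bi-value cell.
inline bool chooseTrial(const Cands& cand, bool biLocalFirst, Trial& out) {
    if (biLocalFirst && findLocation(cand, 2, out)) return true;
    if (findCell(cand, 2, out)) return true;
    for (std::size_t n = 2; n <= 9; n++)
        if (findLocation(cand, n, out)) return true;
    return false;  // nothing left to branch on
}
```

The only difference between the two runs compared above would be the `biLocalFirst` flag. Counting calls to `chooseTrial` that return true would give a T&E count in the spirit of the numbers quoted, under the assumptions stated.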
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

dobrichev wrote:Experimentally I changed the precedence so that a bi-location candidate, if found, is tried before a bi-value cell. I tested the 800K hardest puzzles, searching up to the second solution. The execution time increased from 65 to 115 seconds and the number of T&E from 102M to 170M. I can't measure a closer equivalent of "steps".
Having no explanation for why this happens, I measured 17-given puzzles, which are easy to solve. The result is 0.2" and 190K T&E versus 0.3" and 315K T&E. Again a losing strategy.

Would you try the same puzzle sets with a version that only looks at bi-value cells first, tri-value second, etc., etc. ?
There's a replacement for your FindHiddenCells() routine below, that would probably do the trick.
It should always return a non-empty cell mask. (It's untested).

The "T&E" numbers would be the most useful comparison, since the code below would increase the times for puzzles where there are always bi-value cells to be found.

```cpp
void fsss2::findBiValueCells(bm128& all) const {
   // sum3..sum0 are the bit planes of a per-cell 4-bit counter:
   // how many of the nine grid[d] masks have that cell's bit set.
   bm128 sum0, sum1, sum2, sum3;
   sum0 = grid[0];
   sum1.clear();
   sum2 = sum1;
   sum3 = sum1;
   for(int d = 1; d < 9; d++) {
      // add grid[d] to the counter with a ripple carry
      bm128 carry = grid[d];
      bm128 tmp0 = sum0;
      bm128 tmp1 = sum1;
      bm128 tmp2 = sum2;
      sum0 ^= carry;
      carry &= tmp0;
      sum1 ^= carry;
      carry &= tmp1;
      sum2 ^= carry;
      carry &= tmp2;
      sum3 ^= carry;
   }
   bm128 _all = mask81;
   _all.clearBits(solved);
   // From the highest bit plane down, drop the cells that have the bit set,
   // unless every remaining cell has it set. What is left has the minimal count.
   if (!_all.isSubsetOf(sum3))
      _all.clearBits(sum3);
   if (!_all.isSubsetOf(sum2))
      _all.clearBits(sum2);
   if (!_all.isSubsetOf(sum1))
      _all.clearBits(sum1);
   if (!_all.isSubsetOf(sum0))
      _all.clearBits(sum0);
   all = _all;
}
```

If the situation improves, that code could be optimized somewhat, by not using the higher order 'sum' bits, until they're necessary.
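For readers who want to check the counting trick in isolation, here is a toy sketch of the same idea on a plain 32-bit mask instead of `bm128`. The type `Mask` and the function `minCandidateCells` are made up for this illustration; they are not part of fsss2, and the sketch assumes that each per-digit mask has bit i set when that digit is still a candidate in cell i.

```cpp
#include <array>
#include <cstdint>
#include <initializer_list>

// Toy stand-in for bm128: one bit per cell, at most 32 cells.
using Mask = std::uint32_t;

// digitMask[d] has bit i set when digit d+1 is still a candidate in cell i.
// Returns the cells of `unsolved` that have the fewest candidates.
inline Mask minCandidateCells(const std::array<Mask, 9>& digitMask, Mask unsolved) {
    // Bit planes of a 4-bit counter per cell (counts 0..9 fit in 4 bits).
    Mask sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;
    for (Mask m : digitMask) {
        // Add one bit per cell with a ripple carry through the planes.
        Mask carry = m;
        Mask t0 = sum0, t1 = sum1, t2 = sum2;
        sum0 ^= carry; carry &= t0;
        sum1 ^= carry; carry &= t1;
        sum2 ^= carry; carry &= t2;
        sum3 ^= carry;
    }
    Mask all = unsolved;
    // Most significant plane first: if some remaining cell has a 0 there,
    // the cells with a 1 cannot be minimal and are dropped.
    for (Mask plane : {sum3, sum2, sum1, sum0})
        if (all & ~plane) all &= ~plane;
    return all;
}
```

With three cells holding 2, 3 and 2 candidates, the function keeps the first and third cell. With cells holding 3 and 4 candidates and no bi-value cell, it falls through to the 3-candidate cell, which is the "bi-value first, tri-value second" order asked for above.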
blue

Posts: 702
Joined: 11 March 2013

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

When limited to "sum2", your code gives almost the same results as the bi-location check. The differences are a few percent in both directions for different puzzle categories.
For sum1 and sum3 the results are much worse.
EDIT: When limited to sum1, or extended to sum3, the results are much worse.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

What does this mean now ?
eleven

Posts: 1907
Joined: 10 February 2008

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Finally, guessing bi-values then bi-locals is approximately equal to guessing bi-values then 3-values, and both approaches are good. Guessing bi-locals then bi-values is bad.
dobrichev
2016 Supporter

Posts: 1622
Joined: 24 May 2010

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Thanks, I wasn't sure.
eleven

Posts: 1907
Joined: 10 February 2008

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

I've made public the new sudoku site, which includes my hardest sudoku list.
The new URL of the list is:
http://www.tknottet.sakura.ne.jp/sudoku/Difficulty.cgi?List=All
To gfroyle, if you're reading this, please allow me to use your minimum sudokus in this manner.

Here I would like to describe the improvement plan for my rating method.
The discussion in this thread has been very useful to me. I thank all members who posted in it. After reading the discussion, I decided to improve my rating method as follows.
(The purpose of the improvement is to shorten the calculation time for rating.)
[A] When there are no bi-value cells to choose from, the pair for trial is chosen from the bi-local candidates.
[B] As the most extreme filtering, only the one pair with the highest score is selected for trial, in both the bi-value and the bi-local case.
The score is first defined for a single candidate. The score of a candidate C0 is the number of candidates that are "directly eliminated" when C0 is asserted. Ci is "directly eliminated" when either "Ci is in the same cell as C0" or "Ci is the same digit as C0, and both are in the same unit (row/column/box)".
The score of a pair is the sum of the scores of its two candidates. (In the bi-local case, the same candidate(s) may be directly eliminated by both assertions. Each such candidate is counted twice.)
I expect that the scores of naked pairs are low.
From a manual solver's point of view, carrying out this procedure completely is too complicated. But I believe that even for a manual solver it is better to select a pair that seems to eliminate more candidates than to select one at random.
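The score definition can be sketched in a few lines. This is only an illustration of the definition as worded in this post, not the code behind the published ratings: the `Cands` layout and the names `sameUnit`, `candidateScore`, `biValueScore` and `biLocalScore` are assumptions.

```cpp
#include <array>
#include <bitset>

// Hypothetical representation: cand[cell] holds the digits still possible in
// a cell (bit 0 = digit 1); cells are numbered 0..80 row by row.
using Cands = std::array<std::bitset<9>, 81>;

// True when two different cells share a row, a column or a box.
inline bool sameUnit(int a, int b) {
    int ra = a / 9, ca = a % 9, rb = b / 9, cb = b % 9;
    return ra == rb || ca == cb || (ra / 3 == rb / 3 && ca / 3 == cb / 3);
}

// Score of candidate C0 = (cell, digit): the number of candidates directly
// eliminated when C0 is asserted. Assumes cand[cell][digit] is set.
inline int candidateScore(const Cands& cand, int cell, int digit) {
    // the other candidates in the same cell
    int score = static_cast<int>(cand[cell].count()) - 1;
    // the same digit in every other cell that shares a unit with this cell
    for (int other = 0; other < 81; other++)
        if (other != cell && sameUnit(cell, other) && cand[other][digit])
            score++;
    return score;
}

// Pair score for a bi-value cell: sum over its two candidates.
inline int biValueScore(const Cands& cand, int cell) {
    int score = 0;
    for (int d = 0; d < 9; d++)
        if (cand[cell][d]) score += candidateScore(cand, cell, d);
    return score;
}

// Pair score for a bi-local digit in cells a and b. A candidate seen by both
// cells is counted twice, as described in the text.
inline int biLocalScore(const Cands& cand, int digit, int a, int b) {
    return candidateScore(cand, a, digit) + candidateScore(cand, b, digit);
}
```

Rule [B] would then amount to evaluating `biValueScore` (or `biLocalScore` when no bi-value cell exists) for every available pair and keeping the maximum.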
tknottet

Posts: 24
Joined: 15 February 2015

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

tknottet wrote:From a manual solver's point of view, the complete practice of this procedure is too complicated. But I believe that even for a manual solver it is better than random selection, to select a pair which seems to eliminate more candidates.

At least I would not choose a pair if I cannot see some progress for both sides.

I had a look at suexrat9-10364. To me it seems easier to solve it by dividing it into the puzzles given by the possible permutations for line 4 (20) or line 2 (50 puzzles, all but one without solution). (In the second case, e.g., there is a bilocal for 5 in r48c3 if r2c3=129, which seems to be a good choice to proceed.)

I would be interested in the overall ratings given by the ratings of these partial puzzles.
eleven

Posts: 1907
Joined: 10 February 2008

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

eleven wrote:I had a look at suexrat9-10364. For me it seems to be easier to solve it with dividing it into the puzzles given by the possible permutations for line 4 (20) or 2 (50 puzzles - all but one without solution). (In the second case e.g. there is a bilocal for 5 in r48c3, if r2c3=129, which seems to be a good choice to proceed.)

I would be interested in overall ratings given by the ratings of these partial puzzles.

Hi eleven,

Below are the partial puzzle ratings, and then overall "line" ratings, similar to the "cell" ratings from before.
The ratings are with T = singles+locked candidates+subsets, with no special filtering.

```
   line 4 |   rating    sigma  m    M
----------+--------------------------
142568973 |  24.6631  16.8042  4  248
142865973 |  33.4254  37.7813  4 1814
149568273 |  10.3043   7.7587  2  160
149865273 |  13.1033   7.9028  4  176
241568973 |  24.5888  27.9426  2  524
241865973 |  12.0006  15.4766  2  666
245168973 |  36.6162  22.0275  4  410
245861973 |  32.6613  21.6666  4  538
249165873 |   6.7915   3.7853  2   46
249561873 |   4.1906   3.4178  1   46  *good
542168973 |  21.2867  16.5755  2  258
542861973 |  23.2987  25.5508  2  890
549168273 |  13.3861  13.7768  2  356
549861273 |  32.7921  25.7210  2  396
842165973 |  23.5256  14.2104  2  254
842561973 |  32.9327  15.5852  4  292
849165273 |  10.9635   6.0101  4  104
849561273 |  14.6184   8.0241  2  116
941568273 |   9.1234   7.1769  2  104
941865273 |  10.7258   8.6178  2  152
942165873 |   4.5163   3.4149  2   38
942561873 |   4.5163   3.4149  2   38
945168273 |  18.6104  16.1378  2  346
945861273 |  69.7074  38.6665  4  692
----------+--------------------------
overall   | 258.7696  76.3552  2 8688
```

```
   line 2 |   rating    sigma  m    M
----------+--------------------------
412587936 |   0.0000   0.0000  0    0
412589736 |  26.7344  19.2088  2  344
412785936 |   0.0000   0.0000  0    0
415287936 |   0.0000   0.0000  0    0
415289736 |  20.7900  20.4506  2  660
415789236 |   0.0000   0.0000  0    0
417285936 |   6.1566   4.8290  2   78
417589236 |   3.9344   3.8097  2  120
419285736 |  12.5355   9.9054  2  204
419587236 |   0.0000   0.0000  0    0
419785236 |   0.0000   0.0000  0    0
421587936 |   0.0000   0.0000  0    0
421589736 |  64.8955  31.1426  4  676
421785936 |   0.0000   0.0000  0    0
425187936 |   0.0000   0.0000  0    0
425189736 |  69.8813  30.8376  6  702
425781936 |   0.0000   0.0000  0    0
427185936 |   6.4383   4.1433  2  172
427581936 |   6.1893   3.9973  2  174
429185736 |  16.7586  14.8537  2  342
429581736 |  16.5963  14.7227  2  356
451287936 |   0.0000   0.0000  0    0
451289736 |  17.1935  18.0174  2  400
451789236 |   0.0000   0.0000  0    0
452187936 |   0.0000   0.0000  0    0
452189736 |  24.8765  17.5985  2  322
452781936 |   0.0000   0.0000  0    0
457189236 |   4.5862   4.9364  1  158  *good
457281936 |   8.4392   7.5300  2  136
459187236 |   0.0000   0.0000  0    0
459281736 |  25.8601  18.6667  2  246
459781236 |   0.0000   0.0000  0    0
471285936 |   5.7085   4.2495  2   86
471589236 |   8.0607   6.4280  2  104
472185936 |   3.2152   2.5589  2   34
472581936 |   3.2152   2.5589  2   34
475189236 |   9.2306   8.2694  2  146
475281936 |   6.8497   5.8937  2   90
479185236 |   3.3425   2.5380  2   32
479581236 |   3.3425   2.5380  2   32
491285736 |  22.2053  15.9235  2  522
491587236 |   0.0000   0.0000  0    0
491785236 |   0.0000   0.0000  0    0
492185736 |  11.9162  11.4511  2  244
492581736 |  11.8903  11.4246  2  240
495187236 |   0.0000   0.0000  0    0
495281736 |  42.5412  26.1660  4  602
495781236 |   0.0000   0.0000  0    0
497185236 |   6.8462   6.4550  2  394
497581236 |   7.4156   6.6183  2  388
----------+--------------------------
overall   | 266.6158  72.6415  2 8088
```

For comparison, with that T, the normal puzzle rating was:

```
rating : 678.4785
sigma  : 545.3484
min    : 3
Max    : 19531
```

BTW: For line 4, I had 24 possible fills, not 20. Did I miss something ?
blue

Posts: 702
Joined: 11 March 2013

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

eleven, thank you very much for showing me another way to shorten calculation time for rating.
eleven wrote:I had a look at suexrat9-10364. For me it seems to be easier to solve it with dividing it into the puzzles given by the possible permutations for line 4 (20) or 2 (50 puzzles - all but one without solution). (In the second case e.g. there is a bilocal for 5 in r48c3, if r2c3=129, which seems to be a good choice to proceed.)

I would be interested in overall ratings given by the ratings of these partial puzzles.

blue, thank you very much for fast calculation.
By my rating of the line 4 partial puzzles, 22 of the 24 partial puzzles are either solved or shown to be contradictory by the first application of T.
The rating of 142865973 is 2.6667 (it took a few minutes), and the rating of 245861973 is 4.7932 (it took about 2 hours).
It seems that rating the partial puzzles after dividing by a template method takes less calculation time than the current rating.
(As for suexrat9-10364, my selection would be not r4 but digit "6", because five 6s appear in the clues and fewer than 24 permutations are valid for digit "6". eleven, do you have any reason for selecting r4 or r2 rather than digit "6"?)

My plan of the 12th to shorten the calculation time for rating does not seem to be good. The change did shorten the calculation time, but the resulting ratings are not good.
Below is a comparison between the average rating over all bi-value cells and the rating of the highest-score cell. (The bi-local rating is not yet implemented.)
I expected that selecting the highest-score cell would make the rating smaller than the average, but a considerable increase is seen for some puzzles.
```
Average|HghScore|HS-Ave |puzzle
-------+--------+-------+--------
23.2402| BiLocal|       |GoldenNugget
22.6372| BiLocal|       |Kolk
22.340*| BiLocal|       |Patience
22.1194| 20.1000|-2.0194|Imam_bayildi
21.585*| BiLocal|       |Second_flush
21.067*| BiLocal|       |champagne_dry
20.1075| BiLocal|       |eleven212
19.9646| 21.8333| 1.8687|Discrepancy
15.6924| BiLocal|       |Red_Dwarf
15.5665| 10.5000|-5.0665|AI_WorldHardest:Everest(2012)
 9.9724| 10.0000| 0.0276|AI_WorldHardest2006Escargot
 7.3786|  7.5000| 0.1214|AI_WorldHardest2010
 4.6867|  7.0000| 2.3133|17Hints35410
 4.2386|  5.0000| 0.7614|17Hints35409
 3.4250|  4.0000| 0.5750|17Hints2919
 3.2352|  3.0000|-0.2352|17Hints41826
 3.1609|  1.5000|-1.6609|17Hints48126
 2.7083|  1.5000|-1.2083|JapaneseSuperComputer322
 2.7000|  1.5000|-1.2000|17Hints35043
 2.5700|  2.7500| 0.1800|17Hints41164
 2.2500|  1.5000|-0.7500|17Hints35045
 2.0222|  2.0000|-0.0222|17Hints24147
 2.0167|  2.0000|-0.0167|17Hints35044
 2.0000|  3.0000| 1.0000|17Hints35042
 1.9286|  1.5000|-0.4286|17Hints28653
 1.8500|  1.5000|-0.3500|17Hints24153
 1.8000|  1.5000|-0.3000|17Hints18573
 1.7727|  1.5000|-0.2727|17Hints42464
 1.7619|  1.5000|-0.2619|17Hints41972
 1.6875|  1.5000|-0.1875|17Hints32733
 1.6667|  1.5000|-0.1667|17Hints33052
 1.6667|  1.5000|-0.1667|17Hints4934
 1.6500|  3.0000| 1.3500|JapaneseSuperComputer313
 1.6351|  1.5000|-0.1351|17Hints41750
 1.6000|  1.5000|-0.1000|17Hints12538
```
tknottet

Posts: 24
Joined: 15 February 2015

### Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

blue,

thank you for the quick evaluation again.

Since I wondered why you got so many guesses on average in the line 2 list, I went through the puzzles with a simple solver.
Here bilocal guesses can be more effective. Similar to naked pairs, the bilocals should not share a box plus a line, which naturally makes them "weaker".

I found (since I did it manually there might be a mistake, but that would not change much) that with the following 4 rules, trying one pair of bilocals/bivalues is sufficient in all puzzles, with one exception.
1. If there is a bilocal for 1 in r4, take it
2. If there is a bilocal for 5 in c3, take it
3. Look at r4c467. If there is an ALS with a single 5 in a bivalue cell, and the 5 directly leads to another one, choose this cell (exception 427581936, where you should choose 18)
4. Take the bivalue cell r7c7

This way, in the worst case 50+2*29=108 tries are needed (or 110 with the 2 extra guesses for the exception).

412589736 5r48c3
415289736 29r3c3
417285936 23r7c7
417589236 39r7c7
419285736 5r48c3
421589736 5r48c3
425189736 1r4c36
427185936 58r4c4
427581936 18r4c4
429185736 1r4c36
429581736 1r4c34
451289736 29r3c3
452189736 1r4c36
457189236 39r7c7 (9 solves)
457281936 23r7c7
459281736 1r4c36
471285936 5r48c3
471589236 5r48c3
472581936 5r48c3
475189236 39r7c7
475281936 23r7c7
479185236 5r48c3
479581236 5r48c3
491285736 5r48c3
492185736 5r48c3
492581736 5r48c3
495281736 1r4c34
497185236 5r48c3
497581236 58r4c6

blue wrote:BTW: For line 4, I had 24 possible fills, not 20. Did I miss something ?

No, obviously my count was wrong.

PS: I now expect that I can solve every puzzle with pencil and paper on a weekend (note that uniqueness methods are allowed here for the invalid puzzles). A copier would make this a bit less boring. And I hate the thought that I could make a mistake in the "good" puzzle.
eleven

Posts: 1907
Joined: 10 February 2008
