Puzzles Usage in a Presentation of a New Rating of Puzzles

Everything about Sudoku that doesn't fit in one of the other sections

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby dobrichev » Tue Mar 03, 2015 1:34 pm

Thank you, Blue.

So, the anomaly is that involving subsets increases the rating, right?

The second and the third are subsets of the fourth puzzle. All they are solvable by 4 * ([singles +] guess + singles). Less clues trend to higher rating (as expected?).

The "m" is Monte Carlo, else all listed puzzles should have minimal backdoor of size m, right?
EDIT: Not if only bi-values/bi-locals are used for guessing.
dobrichev
2016 Supporter
 
Posts: 1863
Joined: 24 May 2010

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Tue Mar 03, 2015 5:33 pm

dobrichev wrote:So, the anomaly is that involving subsets increases the rating, right?

It's the only anomaly that i'm aware of -- I assumed it was what you were asking about.

dobrichev wrote:The "m" is Monte Carlo, else all listed puzzles should have minimal backdoor of size m, right?
EDIT: Not if only bi-values/bi-locals are used for guessing.

You're right about both the connection with backdoor size, and the likely disconnect due to the "bi-xxx" restriction.
These are the actual ratings, though -- not Monte Carlo.
They were for the case where bi-locals are ignored, and the focus is on bi-value cells.

I've just checked, and they were easy enough to rate without resorting to Monte Carlo, for methods involving bi-locals.
With no special filtering, the results are:

Code: Select all
      bi-values only
--------+---------+--------
95.7187 | 80.7856 | 93.1054 (m = 5,5,5)
20.3089 | 18.5035 | 13.5272
 9.7053 |  8.5848 |  8.1214
 6.7648 |  6.9055 |  6.8470

      bi-locals only
--------+---------+--------
97.9355 | 88.9861 | 87.9484 (m = 4,4,4)
17.6770 | 16.0562 | 12.8515
 9.8759 |  8.6638 |  8.0583
 7.1846 |  7.1060 |  6.6835

   bi-value/bi-local mix
--------+---------+--------
96.8746 | 88.3173 | 88.6827 (m = 4,4,4)
19.0067 | 17.2621 | 13.0576
 9.8419 |  8.6433 |  8.0806
 6.9908 |  7.0335 |  6.7358

This isn't as bad as what I was seeing for the puzzles at the top of tknottet's list.
With subsets, the ratings were actually lower using bi-locals. For me, that was a surprise.
(For the 2nd puzzle, the locked candidates rating also dropped when bi-locals were added).
blue
 
Posts: 1052
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Wed Mar 04, 2015 5:08 pm

blue,

thanx for your sample ratings in the ER 7/8 area.

When i looked at some of them, i saw, that the rating is not that helpful as a had hoped. Because we are in an area here, where puzzles can be solved manually, it is more important, that a solving step does not need a big network, which hardly can be evaluated without writing it out. It does not mean much, that a puzzle has an average rating of only 2, it still can be much harder than one with 6.

So i think that tknottet's rating is most interesting for the very hard puzzles. E.g. it can give a good relation, how much harder they are than newspaper puzzles (if you restrict it to basic techniques).


btw. the SER 7.7 puzzle is extremely overrated, SE does not know empty rectangles (or grouped strong links), so it is rated as nishio.
eleven
 
Posts: 3173
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby dobrichev » Wed Mar 04, 2015 8:16 pm

eleven wrote:btw. an alternative to trying 3-candidate cells would have been to use the strong links (bilocation candidates - one of them must be true) instead (the minimum number of strong links in a puzzle i found was 4).
In fact it seems to be a better tactic, because if one is shown to be true, also other candidates in the cell are eliminated.

Fsss2 solver uses singles and T&E. The trial strategy is bi-value cell, if not found then bi-location candidate, if not found then 3-location candidate, etc. -- as proposed.
Experimentally I changed the precedence so that a bi-location candidate, if found, is tried before a bi-value cell. Tested 800K hardest puzzles up to the second solution. The is execution time increased from 65 to 115 seconds and the number of T&E from 102M to 170M. I can't measure closer equivalent of "steps".
Having no explanation why this happens, I measured 17-given puzzles which are easy to solve. The result is 0.2" and 190K T&E versus 0.3" and 315K T&E. Again losing strategy.
Does it mean that at first approximation the "pencilmarks" approach is more effective than "templates" approach?
dobrichev
2016 Supporter
 
Posts: 1863
Joined: 24 May 2010

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Thu Mar 05, 2015 5:32 am

dobrichev wrote:Experimentally I changed the precedence so that a bi-location candidate, if found, is tried before a bi-value cell. Tested 800K hardest puzzles up to the second solution. The is execution time increased from 65 to 115 seconds and the number of T&E from 102M to 170M. I can't measure closer equivalent of "steps".
Having no explanation why this happens, I measured 17-given puzzles which are easy to solve. The result is 0.2" and 190K T&E versus 0.3" and 315K T&E. Again losing strategy.

Would you try the same puzzle sets with a version that only looks at bi-value cells first, tri-value second, etc., etc. ?
There's a replacement for your FindHiddenCells() routine below, that would probably do the trick.
It should always return a non-empty cell mask. (It's untested).

The "T&E" numbers would be the most useful comparison, since the code below, would increase the times for puzzles where there are always bi-value cells to be found.

Hidden Text: Show
Code: Select all
void fsss2::findBiValueCells(bm128& all) const {
   bm128 sum0, sum1, sum2, sum3;
   sum0 = grid[0];
   sum1.clear();
   sum2 = sum1;
   sum3 = sum1;
   for(int d = 1; d < 9; d++) {
      bm128 carry = grid[d];
      bm128 tmp0 = sum0;
      bm128 tmp1 = sum1;
      bm128 tmp2 = sum2;
      sum0 ^= carry;
      carry &= tmp0;
      sum1 ^= carry;
      carry &= tmp1;
      sum2 ^= carry;
      carry &= tmp2;
      sum3 ^= carry;
   }
   bm128 _all = mask81;
   _all.clearBits(solved);
   if (!_all.isSubsetOf(sum3))
      _all.clearBits(sum3);
   if (!_all.isSubsetOf(sum2))
      _all.clearBits(sum2);
   if (!_all.isSubsetOf(sum1))
      _all.clearBits(sum1);
   if (!_all.isSubsetOf(sum0))
      _all.clearBits(sum0);
   all = _all;
}

If the situation improves, that code could be optimized somewhat, by not using the higher order 'sum' bits, until they're necessary.
blue
 
Posts: 1052
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby dobrichev » Sat Mar 07, 2015 4:13 pm

When limited to "sum2" your code gives almost same results as the bi-location check. The differences are in few percents in both directions for different puzzle categories.
For sum1 and sum3 results are much worse.
EDIT: When limited to sum1, or extended to sum3, the results are much worse.
dobrichev
2016 Supporter
 
Posts: 1863
Joined: 24 May 2010

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Sun Mar 08, 2015 10:06 pm

What does this mean now ?
eleven
 
Posts: 3173
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby dobrichev » Mon Mar 09, 2015 5:18 pm

Finally, guessing bi-values then bi-locals is approximately equal to guessing bi-values then 3-values, and both approaches are good. Guessing bi-locals then bi-values is bad.
dobrichev
2016 Supporter
 
Posts: 1863
Joined: 24 May 2010

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Mon Mar 09, 2015 11:17 pm

thx, i was'nt sure.
eleven
 
Posts: 3173
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby tknottet » Thu Mar 12, 2015 2:25 am

I've disclosed the new sudoku site which includes my hardest sudoku list.
The new url of the list is:
http://www.tknottet.sakura.ne.jp/sudoku/Difficulty.cgi?List=All
To gfroyle, if you're reading this, please allow me to use your minimum sudokus in this manner.

Here, I would like to write Improvement plan of my rating method.
The discussion in this thread is very useful for me. I thank all members who posted message(s) in this thread. After reading discussion, I decided to improve my rating method as follows.
(The purpose of improvement is to shorten calculation time for rating.)
[A]When there are no bi-value cells to choose from, a pair for trial is chosen from bi-local candidates.
[B]As the most extreme filtering, only one pair of the highest score is selected for trial in both cases of bi-value and bi-local.
The score is defined for a candidate at first. The score of a candidate(C0) is the number of candidates which are "directly eliminated" when C0 is asserted. Ci is "directly eliminated " when either "Ci is in the same cell as C0" or "Ci is the same digit as C0, and both are in a same unit(row/column/box)".
The score for a pair is defined as the summation of the scores of both candidates. (In case of bi-local, same candidate(s) may be directly eliminated in both assertion. (Each of) the candidate(s) is doubly counted.)
I expect that scores of naked pairs are low.
From a manual solver's point of view, the complete practice of this procedure is too complicated. But I believe that even for a manual solver it is better than random selection, to select a pair which seems to eliminate more candidates.
tknottet
 
Posts: 24
Joined: 15 February 2015

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Sat Mar 14, 2015 2:57 pm

tknottet wrote:From a manual solver's point of view, the complete practice of this procedure is too complicated. But I believe that even for a manual solver it is better than random selection, to select a pair which seems to eliminate more candidates.

At least i would not choose a pair, if i cannot see some progress for both sides.

I had a look at suexrat9-10364. For me it seems to be easier to solve it with deviding it into the puzzles given by the possible permutations for line 4 (20) or 2 (50 puzzles - all but one without solution). (In the second case e.g. there is a bilocal for 5 in r48c3, if r2c3=129, which seems to be a good choice to proceed.)

I would be interested in overall ratings given by the ratings of these partial puzzles.
eleven
 
Posts: 3173
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Sun Mar 15, 2015 12:20 am

eleven wrote:I had a look at suexrat9-10364. For me it seems to be easier to solve it with deviding it into the puzzles given by the possible permutations for line 4 (20) or 2 (50 puzzles - all but one without solution). (In the second case e.g. there is a bilocal for 5 in r48c3, if r2c3=129, which seems to be a good choice to proceed.)

I would be interested in overall ratings given by the ratings of these partial puzzles.

Hi eleven,

Below are the partial puzzle ratings, and then overall "line" ratings, similar to the "cell" ratings from before.
The ratings are with T = singles+locked candidates+subsets, with no special filtering.

Code: Select all
   line 4 |   rating    sigma  m    M
----------+--------------------------
142568973 |  24.6631  16.8042  4  248
142865973 |  33.4254  37.7813  4 1814
149568273 |  10.3043   7.7587  2  160
149865273 |  13.1033   7.9028  4  176
241568973 |  24.5888  27.9426  2  524
241865973 |  12.0006  15.4766  2  666
245168973 |  36.6162  22.0275  4  410
245861973 |  32.6613  21.6666  4  538
249165873 |   6.7915   3.7853  2   46
249561873 |   4.1906   3.4178  1   46  *good
542168973 |  21.2867  16.5755  2  258
542861973 |  23.2987  25.5508  2  890
549168273 |  13.3861  13.7768  2  356
549861273 |  32.7921  25.7210  2  396
842165973 |  23.5256  14.2104  2  254
842561973 |  32.9327  15.5852  4  292
849165273 |  10.9635   6.0101  4  104
849561273 |  14.6184   8.0241  2  116
941568273 |   9.1234   7.1769  2  104
941865273 |  10.7258   8.6178  2  152
942165873 |   4.5163   3.4149  2   38
942561873 |   4.5163   3.4149  2   38
945168273 |  18.6104  16.1378  2  346
945861273 |  69.7074  38.6665  4  692
----------+--------------------------
overall   | 258.7696  76.3552  2 8688

Code: Select all
   line 2 |   rating    sigma  m    M
----------+--------------------------
412587936 |   0.0000   0.0000  0    0
412589736 |  26.7344  19.2088  2  344
412785936 |   0.0000   0.0000  0    0
415287936 |   0.0000   0.0000  0    0
415289736 |  20.7900  20.4506  2  660
415789236 |   0.0000   0.0000  0    0
417285936 |   6.1566   4.8290  2   78
417589236 |   3.9344   3.8097  2  120
419285736 |  12.5355   9.9054  2  204
419587236 |   0.0000   0.0000  0    0
419785236 |   0.0000   0.0000  0    0
421587936 |   0.0000   0.0000  0    0
421589736 |  64.8955  31.1426  4  676
421785936 |   0.0000   0.0000  0    0
425187936 |   0.0000   0.0000  0    0
425189736 |  69.8813  30.8376  6  702
425781936 |   0.0000   0.0000  0    0
427185936 |   6.4383   4.1433  2  172
427581936 |   6.1893   3.9973  2  174
429185736 |  16.7586  14.8537  2  342
429581736 |  16.5963  14.7227  2  356
451287936 |   0.0000   0.0000  0    0
451289736 |  17.1935  18.0174  2  400
451789236 |   0.0000   0.0000  0    0
452187936 |   0.0000   0.0000  0    0
452189736 |  24.8765  17.5985  2  322
452781936 |   0.0000   0.0000  0    0
457189236 |   4.5862   4.9364  1  158  *good
457281936 |   8.4392   7.5300  2  136
459187236 |   0.0000   0.0000  0    0
459281736 |  25.8601  18.6667  2  246
459781236 |   0.0000   0.0000  0    0
471285936 |   5.7085   4.2495  2   86
471589236 |   8.0607   6.4280  2  104
472185936 |   3.2152   2.5589  2   34
472581936 |   3.2152   2.5589  2   34
475189236 |   9.2306   8.2694  2  146
475281936 |   6.8497   5.8937  2   90
479185236 |   3.3425   2.5380  2   32
479581236 |   3.3425   2.5380  2   32
491285736 |  22.2053  15.9235  2  522
491587236 |   0.0000   0.0000  0    0
491785236 |   0.0000   0.0000  0    0
492185736 |  11.9162  11.4511  2  244
492581736 |  11.8903  11.4246  2  240
495187236 |   0.0000   0.0000  0    0
495281736 |  42.5412  26.1660  4  602
495781236 |   0.0000   0.0000  0    0
497185236 |   6.8462   6.4550  2  394
497581236 |   7.4156   6.6183  2  388
----------+--------------------------
overall   | 266.6158  72.6415  2 8088

For comparison, with that T, the normal puzzle rating was:

Code: Select all
rating : 678.4785
sigma  : 545.3484
min    : 3
Max    : 19531

BTW: For line 4, I had 24 possible fills, not 20. Did I miss something ?
blue
 
Posts: 1052
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby tknottet » Sun Mar 15, 2015 4:20 pm

eleven, thank you very much for showing me another way to shorten calculation time for rating.
eleven wrote:I had a look at suexrat9-10364. For me it seems to be easier to solve it with deviding it into the puzzles given by the possible permutations for line 4 (20) or 2 (50 puzzles - all but one without solution). (In the second case e.g. there is a bilocal for 5 in r48c3, if r2c3=129, which seems to be a good choice to proceed.)

I would be interested in overall ratings given by the ratings of these partial puzzles.

blue, thank you very much for fast calculation.
By my rating of line 4 partial puzzles, 22 of 24 partial puzzles are solved or found a contradiction by first T.
142865973's rating is 2.6667(it took few minutes), 245861973's rating is 4.7932(it took about 2 hours).
It seems that the partial puzzle ratings after dividing by a template method take less calculation time than current rating.
(As for suexrat9-10364, my selection will be not r4, but digit "6", because five 6 appear in clues and less than 24 permutations are valid for digit "6". eleven, do you have any reason why you select r4 or r2, not digit "6"?)

My plan on 12th to shorten calculation time for rating does not seem to be good. I shortened by this improvement at calculation time, but the rating result is not good.
Below is comparison between average rating of all bi-value cells and the rating of the highest score cell. (Bi-local rating is not yet implemented.)
I expected that rating value gets smaller than the average by selecting highest score cell, but considerable increase is seen with some puzzles.
Code: Select all
Average|HghScore|HS-Ave |puzzle
-------+--------+-------+--------
23.2402| BiLocal|       |GoldenNugget
22.6372| BiLocal|       |Kolk
22.340*| BiLocal|       |Patience
22.1194| 20.1000|-2.0194|Imam_bayildi
21.585*| BiLocal|       |Second_flush
21.067*| BiLocal|       |champagne_dry
20.1075| BiLocal|       |eleven212
19.9646| 21.8333| 1.8687|Discrepancy
15.6924| BiLocal|       |Red_Dwarf
15.5665| 10.5000|-5.0665|AI_WorldHardest:Everest(2012)
 9.9724| 10.0000| 0.0276|AI_WorldHardest2006Escargot
 7.3786|  7.5000| 0.1214|AI_WorldHardest2010
 4.6867|  7.0000| 2.3133|17Hints35410
 4.2386|  5.0000| 0.7614|17Hints35409
 3.4250|  4.0000| 0.5750|17Hints2919
 3.2352|  3.0000|-0.2352|17Hints41826
 3.1609|  1.5000|-1.6609|17Hints48126
 2.7083|  1.5000|-1.2083|JapaneseSuperComputer322
 2.7000|  1.5000|-1.2000|17Hints35043
 2.5700|  2.7500| 0.1800|17Hints41164
 2.2500|  1.5000|-0.7500|17Hints35045
 2.0222|  2.0000|-0.0222|17Hints24147
 2.0167|  2.0000|-0.0167|17Hints35044
 2.0000|  3.0000| 1.0000|17Hints35042
 1.9286|  1.5000|-0.4286|17Hints28653
 1.8500|  1.5000|-0.3500|17Hints24153
 1.8000|  1.5000|-0.3000|17Hints18573
 1.7727|  1.5000|-0.2727|17Hints42464
 1.7619|  1.5000|-0.2619|17Hints41972
 1.6875|  1.5000|-0.1875|17Hints32733
 1.6667|  1.5000|-0.1667|17Hints33052
 1.6667|  1.5000|-0.1667|17Hints4934
 1.6500|  3.0000| 1.3500|JapaneseSuperComputer313
 1.6351|  1.5000|-0.1351|17Hints41750
 1.6000|  1.5000|-0.1000|17Hints12538
tknottet
 
Posts: 24
Joined: 15 February 2015

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Sun Mar 15, 2015 4:27 pm

blue,

thank you for the quick evaluation again.

Since i wondered, that you got so much guesses on average in the line 2 list, i went through the puzzles with a simple solver.
Here bilocal guesses can be more effective. Similar to naked pairs the bilocals should not share a box plus a line, which naturally makes them "weaker".

I found (since i did it manually, there might be a mistake, but this would not change much), that with the following 4 rules, in all puzzles trying one pair of bilocals/bivalues is sufficient - with one exception.
1. If there is a bilocal for 1 in r4, take it
2. If there is a bilocal for 5 in c3, take it
3. Look at r4c467. If there is an ALS with a single 5 in a bivalue cell, and the 5 directly leads to another one, choose this cell (exception 427581936, where you should choose 18)
4. Take the bivalue cell r7c7

This way in the worst case 50+2*29=108 (or 110 with 2 more guesses) tries are needed.

Hidden Text: Show
412589736 5r48c3
415289736 29r3c3
417285936 23r7c7
417589236 39r7c7
419285736 5r48c3
421589736 5r48c3
425189736 1r4c36
427185936 58r4c4
427581936 18r4c4
429185736 1r4c36
429581736 1r4c34
451289736 29r3c3
452189736 1r4c36
457189236 39r7c7 (9 solves)
457281936 23r7c7
459281736 1r4c36
471285936 5r48c3
471589236 5r48c3
472581936 5r48c3
475189236 39r7c7
475281936 23r7c7
479185236 5r48c3
479581236 5r48c3
491285736 5r48c3
492185736 5r48c3
492581736 5r48c3
495281736 1r4c34
497185236 5r48c3
497581236 58r4c6


blue wrote:BTW: For line 4, I had 24 possible fills, not 20. Did I miss something ?

No, obviously my count was wrong.



PS i expect now, that i can solve every puzzle with pencil and paper on a weekend (note that uniqueness methods are allowed here for the invalid puzzles ;) ). A copier would make this a bit less boring. And i hate the thought, that i could make a mistake in the "good" puzzle.
eleven
 
Posts: 3173
Joined: 10 February 2008

Previous

Return to General