Puzzles Usage in a Presentation of a New Rating of Puzzles

Everything about Sudoku that doesn't fit in one of the other sections

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby JasonLion » Sun Mar 01, 2015 1:22 pm

dobrichev, my understanding is that fuss2 always works with bivalve cells for T&E. In the text you quoted, I was talking about using using bi-located candidates instead, houses that only contain two instances of pencil marks for a specific digit.

I am well aware of the advantages of pesudo-random choices over topmost choices. If you go back and look closely at the history of the various fast brute force solvers you will find that I was the first one to introduce a version of that particular optimization.
User avatar
JasonLion
2017 Supporter
 
Posts: 642
Joined: 25 October 2007
Location: Silver Spring, MD, USA

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Sun Mar 01, 2015 3:59 pm

Since i was curious, how much better i could be than the average rating, using the same T&E strategy, but with "intelligent" selection of the T&E numbers, i made a test with Golden Nugget, using a solver which i allowed singles, locked candidates and subsets.

The result was quite disappointing (ok, i am not the best solver, and i was impatient soon from the tedious work):
I needed 62 guesses, compared to the 82, blue had calculated with 3-way guesses at the beginning.

What i tried to improve was to only take pairs of candidates, which in both cases would give me numbers, in order to assure, that each guess would make some progress.
At the beginning this was not possible, so i decided to take the pair with the most eliminations on both sides. Also later there was a situation, where in all cases at least one side produced a number.
I also tried to avoid to make a guess for digits, which only produce one more in an empty x-wing or 2 in a 6-cell pattern, but nothing else.
Pairs don't seem to be that bad, if they only share one house (not both a box and a line).

So i don't expect much better ratings for very hard puzzles now with bilocal candidates, probably it is better to combine them with the bivalues. Excluding 2-house-pairs, empty x-wings and 6-cell-patterns might make it slightly better.

In some puzzles of course the difference to blue's average rating will be much bigger (as i saw with the 17 clue sample).
eleven
 
Posts: 3097
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Sun Mar 01, 2015 9:14 pm

There is an alternative method to attack very hard puzzle with newspaper techniques, also mentioned by mollwollfumble.

You choose a line with say at least 5 unsolved cells, which does not allow too much permutations. Then go through them one by one.
I always wanted to know, if it's better or worse than the 2-way-guesses.

So i started to do the next tedious session, and tried Nugget with this method, again using locked candidates and subsets (the newspaper techniques).
I choose column 5 and got 35 possibilities, 18 remained after applying the techniques (the others all leaded directly to a contradiction).
For only 2 of the 18 i made more than one guess then, for one 4, the other 2 (skyscraper would save at least 2 guesses).

This sums up to 57 guesses, very near to the 62 of the other method. It was easier to find the guessing candidates in these sub-puzzles, but it seemed to be longer to get to the contradictions.

So i guess, that it depends on the puzzle, which method would be faster manually.
eleven
 
Posts: 3097
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Sun Mar 01, 2015 9:49 pm

eleven wrote:I also tried to avoid to make a guess for digits, which only produce one more in an empty x-wing or 2 in a 6-cell pattern, but nothing else.
Pairs don't seem to be that bad, if they only share one house (not both a box and a line).

I wondered about that too ... if it was the pairs in two houses that were the "bad eggs".
blue
 
Posts: 979
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Sun Mar 01, 2015 9:58 pm

Here are the rating results with standard deviations ("sigma") and min/max counts (m,M) calculated.
The table got rather wide, so I left out the data for "singles + locked candiates + subsets + basic fish".
[ This is for the method where a cell with the minimum candidate count is selected ]

Code: Select all
          singles           |     +locked candiadtes      |          +subsets           |
----------------------------+-----------------------------+-----------------------------| puzzle
  rating     sigma  m     M |   rating     sigma  m     M |   rating     sigma  m     M |
----------------------------+-----------------------------+-----------------------------+------------------------------
858.9493  661.6812  4 27368 | 621.9575  489.8724  3 21756 | 678.4785  545.3484  3 19531 | Suexrat9-10364
338.0826  253.8447  4  9493 | 261.9336  187.5185  4  7586 | 174.8692  116.2923  4  5485 | Fata_Morgana
106.9735   68.5786  4  3121 |  97.9841   64.4078  4  2949 |  68.5982   51.2504  3  2257 | Easter Monster
----------------------------+-----------------------------+-----------------------------+------------------------------
150.7094   88.3056  3  4953 | 123.7298   72.1465  3  4304 |  82.3268   51.6805  3  3372 | GoldenNugget
147.6405   87.8914  4  4194 | 123.2902   73.8630  3  3622 |  85.7411   54.3091  3  2880 | Kolk
145.6346   95.5219  4  3858 | 125.0115   83.2940  3  3506 |  85.0152   56.3454  3  2332 | Patience
139.1479   67.2834  6  2732 | 108.1807   50.4810  6  2540 |  77.7374   37.7302  4  1866 | Imam_bayildi
145.5524  112.2012  5  3201 | 117.0809   91.3956  5  2605 |  79.7549   61.5514  4  2100 | Second_flush
124.8864   74.9127  3  3678 | 105.3040   63.9248  3  3236 |  80.4641   50.0434  3  2629 | champagne_dry
119.9944   69.3432  4  3852 | 100.2021   57.5400  4  3438 |  74.2189   43.8381  3  2736 | eleven212
169.4545   89.3858  5  2666 | 140.5230   72.9459  5  2350 |  90.8898   46.1987  5  1550 | Discrepancy
128.6754   80.9999  4  3129 | 109.7834   68.3839  4  2979 |  81.5317   52.7633  4  2279 | Red_Dwarf
143.1666   80.6992  5  2600 |  82.7591   46.5790  5  1520 |  45.6534   24.9822  5   750 | AI_WorldHardest:Everest(2012)
 51.6346   27.2382  3   604 |  44.3559   23.6901  3   558 |  37.7688   20.7097  3   492 | AI_WorldHardest2006Escargot
 30.8243   15.4611  4   310 |  28.7598   14.0280  4   268 |  24.7899   11.2542  3   238 | AI_WorldHardest2010
 65.1211   38.5058  6  1260 |  29.0808   23.6292  2  1780 |  14.7689   12.4974  2   428 | 17Hints35410
 76.9150   57.2583  5  2374 |  48.0631   46.4432  2  3264 |  15.5645   14.4800  2   716 | 17Hints35409
 67.5199   47.5973  1  1644 |  36.8486   26.3998  1  1035 |  43.8726   30.5088  2  2600 | 17Hints2919
 18.1570   11.4057  3   112 |  15.1152    8.8262  3    76 |  15.0943    8.8332  3    76 | 17Hints41826
 25.7975   15.2712  5  1088 |  12.9399    7.6489  2   268 |  15.1081   10.7991  2   508 | 17Hints48126
 64.6332   35.8974  4  1692 |  52.6721   32.4805  4  1244 |  34.4980   21.7344  4   740 | JapaneseSuperComputer322
 85.6295   55.9348  3  2769 |  42.8059   33.9797  2  1114 |   5.5976    3.0191  2    80 | 17Hints35043
 19.1333   10.0587  3   138 |   9.0256    6.1435  2    56 |   9.0112    6.1614  2    50 | 17Hints41164
 69.6821   54.4688  4  1905 |  43.1810   34.5978  2  1202 |   4.4054    1.3747  2    10 | 17Hints35045
 32.7034   20.9270  3   650 |  17.1254   13.7672  2   636 |   6.3354    5.7807  2   112 | 17Hints24147
 35.3365   23.9511  3   852 |  17.8776   14.6783  2   620 |   6.6121    6.6018  2   154 | 17Hints35044
 32.8204   24.0518  4   524 |  19.2850   14.8968  2   618 |   3.7818    1.6706  2    36 | 17Hints35042
 11.2696    7.1177  2    74 |   7.9510    5.7327  1    50 |   8.2605    5.9768  1    50 | 17Hints28653
 26.7631   12.9996  4   300 |  17.5557   11.7127  2   466 |   9.2713    7.2917  2   150 | 17Hints24153
  7.2228    4.8668  1    62 |   5.5842    3.5982  1    18 |   5.5918    3.6060  1    18 | 17Hints18573
  8.9374    4.4667  2    34 |   5.7402    3.7673  2    26 |   5.3602    3.4945  1    32 | 17Hints42464
 12.4693    7.9362  2   166 |   7.0411    6.7048  1   122 |   7.3569    7.4774  1   136 | 17Hints41972
 15.5855    9.0204  3   316 |  14.2533    7.0112  3   210 |   5.0847    3.5962  1    58 | 17Hints32733
 17.3935   10.9517  2   158 |   7.4061    8.5293  1   128 |   9.7625   10.4903  1   142 | 17Hints33052
 20.5692   14.2836  2   206 |   7.9810    9.6946  1   158 |  10.8056   12.0252  1   172 | 17Hints4934
 13.9067    8.1658  2   310 |  12.9607    7.2982  2   234 |   9.3619    4.8879  2   200 | JapaneseSuperComputer313
 35.1865   25.5756  3   722 |  21.1222   15.5642  2   442 |   3.1004    2.4703  1    48 | 17Hints41750
  5.4854    2.8811  2    16 |   5.3650    2.9366  1    18 |   5.3650    2.9366  1    18 | 17Hints12538
  8.8403    4.5058  2    26 |   4.4944    2.3236  1    12 |   4.4944    2.3236  1    12 | 17Hints12995
  7.1596    3.6622  2    92 |   5.4240    2.6639  2    42 |   5.3995    2.9021  1    44 | 17Hints21752
  6.9403    4.2317  2    38 |   6.7366    3.8998  2    34 |   6.7533    3.8503  2    34 | 17Hints26041
  8.1493    3.8598  2    40 |   6.5542    3.8178  1    32 |   5.5294    3.5086  1    32 | 17Hints39923
 17.7356   12.0188  3   202 |   4.0557    1.6322  2     8 |   4.0557    1.6322  2     8 | 17Hints40086
 88.9682   95.2481  2  9832 |  28.3550   31.7482  1  3626 |   1.5000    0.5000  1     2 | 17Hints44580
  6.2596    2.9969  2    22 |   5.1312    2.6582  2    22 |   4.9402    2.6775  2    22 | 17Hints47137
  6.3676    3.1005  2    20 |   4.6196    3.5348  1    20 |   4.9621    3.7229  1    20 | 17Hints48498
  3.9583    2.4953  1    18 |   3.9658    2.5029  1    18 |   3.9725    2.5098  1    18 | 17Hints8780
  3.1147    1.9455  1    10 |   3.0647    1.9504  1    10 |   3.0647    1.9504  1    10 | 17Hints9404
blue
 
Posts: 979
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby tknottet » Mon Mar 02, 2015 11:17 am

denis_berthier
I found that a cause(at least one of causes) that the rating value of Imam Bayildi(minimal version) is smaller than that of its isomorph(non-minimal version) is because my rule15 is not confluent.
I found a situation in solving path of the isomorph where the rule15 should be applied but is not applied.
It causes to raise the isomorph's rating value.
The cause that the rule15 is not applied in case of the isomorph is because my rule8 which is used to create "Off Off Vectors" is not confluent.
tknottet wrote:( at Thu Feb 26, 2015 11:50 pm )
(1-2)After Ca is eliminated, r7(Locked Candidates), r8(Naked Pair/Triple/Quad) and r9(Hidden Pair/Triple/Quad) is applied separately. If Cb is eliminated by any of(not 'oh') these application, "Off Off Vectors" is defined from Ca to Cb.

Which digits are eliminated by rule8 is influenced by number permutation.
The difference is not fatal in usual solving, but it is fatal in application of the rule15.
Hidden Text: Show
The rule8 works as follows:
[at the following row in solving the minimal version]
14,14,134,2389,13479,23789,789
It finds '14,14,134' as Naked Triple, then eliminates 3 in 2389, 134 in 13429, 3 in 23789.
(Before 14 is checked as Naked Pair, 13 is checked as Naked Pair then 134 is checked as Naked Triple.)
[at the corresponding row in solving the isomorph]
34,34,374,2768,37458,27568,568(Digits are placed at the position of corresponding digits above.)
It finds '34,34' as Naked Pair, then eliminates 34 in 374, 34 in 37458.
(34 is checked as Naked Pair, before 374 is checked as Naked Triple.)
Elimination of '3 in 2389' in solving the minimal version( '7 in 2768'in solving the isomorph) is critical in the rule15 application.

For many rules in my T including rule8, when the rule is applied once, the processing is over there. It causes non-confluence. It takes time for the solution to this problem. For a while, I must add some apologize on my hardest sudoku list page.
Thank you very much again for explaining non-confluence phenomenon many times.
tknottet
 
Posts: 24
Joined: 15 February 2015

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Mon Mar 02, 2015 12:57 pm

Hi tknottet
Some points remain to be improved but your rating is interesting and raises interesting questions.
Ratings that do not give the same value to isomorphic puzzles are not so rare. It's not dramatic per se (if it doesn't happen too often). SER is well known for having the problem occasionally. Whips and the W rating have the same problem (very rarely in practice), only braids and the B rating are free of it; but whips are so much simpler that the W rating remains my preferred one.
Non confluence can be more of a problem for the programmer if he spends a long time trying to find a bug in the code (he can't find anything in the code).
denis_berthier
2010 Supporter
 
Posts: 3972
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Mon Mar 02, 2015 1:02 pm

Hi blue,
Very interesting and surprising (so large max) results.
I wonder what's the max value when all the candidates are tried.

On my side, I'm running calculations with my T&E procedure to compare the two strategies (random vs bivalue/bilocal preferred)
denis_berthier
2010 Supporter
 
Posts: 3972
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Mon Mar 02, 2015 8:41 pm

When i looked at blue's table, i felt a bit better, because i realized, that the 62 guesses in my Nugget experiment are the worst case. The best case for my selected guessing candidates is 5, and the average 32.5, if i calculated that correctly.

I tried to find out, how 10 guesses could be needed for the last puzzle, but i cannot reach that, if, when a candidate lead to a contradiction, i immediately try the other candidate of the cell. Is that right ?
eleven
 
Posts: 3097
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Mon Mar 02, 2015 9:03 pm

I've finished trying all sorts of things with bi-local candates, and bi-value/bi-local mixes, and different kinds of filtering.
Everything was run with a "T" that included singles, locked candidates and subsets.
In absolutely every case that involved bi-local candidates, the ratings were worse than with bi-value cells only.
That included even the cases where a bi-local candidate is used only when there are no bi-value cells to choose from.
[ For a couple of the combinations, there were a (quite) small number puzzles in the list, where the rating improved, but the overall effect was never good. ]
The actual ratings, are probably not worth posting.

When I first tried "bi-locals only" (and no bi-value cells), there were so many options to choose from, that the code ran out of RAM before finishing the calculation -- RAM used to store ratings and "good/bad" flags for the intermediate puzzle states. I tried a Monte Carlo rating, at that point, for "Golden Nugget" only ... and it was a little over 2x what it was with "cells only". At that point, I added a filter for the bi-local candiadtes, that rejected cases where the candidates were in the same box and same row or column. That reduced the problem enough so that the code didn't run out of RAM, but the rating for "Golden Nugget" was still aroung 1.5 times what it was with "cells only". After that, though, everything that I tried with bi-locals, included that filter, at least.

The one thing that I did have luck with, was a filter for the "cells only" approach, that ignored cells that were part of a locked set confined to both a box and a line. That lowered most (?) of the ratings by a little bit (Gold nugget went from 82 to 75), and it also cured the problem where for a few puzzles, the rating went up when subsets were added.

==

denis_berthier wrote:Very interesting and surprising (so large max) results.

Yes, I was surprised by that too -- larger than the average by many multiples of the standard deviation !

==

eleven wrote:When i looked at blue's table, i felt a bit better, because i realized, that the 62 guesses in my Nugget experiment are the worst case. The best case for my selected guessing candidates is 5, and the average 32.5, if i calculated that correctly.

This is more or less exactly the opposite of what I was seeing.
Either your filtering method is very good (likeliest ?), or we aren't calculating the same thing.

eleven wrote:I tried to find out, how 10 guesses could be needed for the last puzzle, but i cannot reach that, if, when a candidate lead to a contradiction, i immediately try the other candidate of the cell. Is that right ?

Yes, and you need to count the other candidate as a "guess" ... more or less because you need to apply "T" at that point, and run it to a point where either another guess is required, or the puzzle is solved or showing a contradiction.

P.S.: Here's a sequence of 10, for that last puzzle:
Code: Select all
7r9c5 -> bad
  8r6c1 -> bad
    2r9c9 -> bad
      9r8c4 bad
      5r8c4 -> bad
        7r1c8 bad
        9r1c8 bad
    9r9c9 bad
  9r6c1 bad
9r9c5 good
Last edited by blue on Mon Mar 02, 2015 11:27 pm, edited 1 time in total.
blue
 
Posts: 979
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Mon Mar 02, 2015 10:57 pm

Many thanks, blue, for calculating that.
It's not the first time that a sudoku conjecture of me is totally wrong :)

I checked my Nugget notes, and found that i had left a possibility open, which added 7 guesses. So the sum is 69, and the average 37.
Note that i both had used bilocal and bivalue guesses.

Concerning the last puzzle i had totally overlooked the bad choices.


Another question i am very interested in, is how the T&E ratings of puzzles with the same ER in the range 7/8 would differ. Would it be possible e.g. to rate say 100 random puzzles each with ER 7.3 and 8.3 ?
eleven
 
Posts: 3097
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Tue Mar 03, 2015 2:31 am

eleven wrote:Another question i am very interested in, is how the T&E ratings of puzzles with the same ER in the range 7/8 would differ. Would it be possible e.g. to rate say 100 random puzzles each with ER 7.3 and 8.3 ?

These were the first 100 that I got with ER in that range.
I didn't get a good spread of values ... most are 7.3 or 8.3.
Also most all (if not all of them) have singles in the beginning.

I tried producing some with an ED in that range too, but with no luck (... it would have taken a long long time).
If you have another list that you'ld like me to rate, I'ld be happy to do it.

The 3 "tknottet" ratings are for singles, singles & locked candiates, and singles, locked candiates & subsets.

Hidden Text: Show
Code: Select all
8.3  15.7683   6.1059   5.6902  000028000000003006800000301007010004004000050501004038090480000100607900050000000
8.3  11.5107  10.0839   7.5550  010000600800100005067005000005008060002030074000200300006070000000003400034000210
8.3  11.3926   5.7981   4.7825  200070800009003000007200054100608000000000040050002100006000000095700002020005306
8.3   9.3762   8.4357   7.0732  090010000600004007004700000007000008000030071501000020000920030000600800408001609
8.3   9.3744   7.4865   6.2872  000750910050006000000008203020809000000000000609500001070000032008030070091000400
8.3   9.3734   6.2568   5.5221  080200300204000090006005000000050080870006000500400020007000049000020807000304002
8.3   9.2824   5.9685   4.8302  050000003020009004008000700007320005010050208000100600200840000005000900040600000
8.3   9.1336   9.0551   8.8938  030900018000030900070002000406000000050000100000520000910083024003040007700290000
8.3   9.0724   7.9696   8.8418  050900100078040020100000007002400005890020000005008030006190000300200000000003008
8.3   8.5719   6.6036   6.3218  060100780900000003000760910003280406000000005470000800030012000000008000002090060
8.3   8.1307   7.4515   6.0233  010000000200000040000005306095010002000003007802900000000600700000090018083000400
8.3   7.9939   5.3123   5.3256  080903000000005000005000710709000800000000504030090000058040001600030425000600000
8.3   7.9393   6.2678   5.9258  004005300207080005900000020003400050600000010000703900000600000021000004005020030
8.3   7.8435   7.7886   7.6818  000000600000060031100000059098030000030502040000000000200705400075640020080023000
8.3   7.4207   7.2241   7.2520  000400000005081030070006900492100800107052006000000000510090060740000008026000000
8.3   7.1337   7.6822   6.9227  001300000000070000960400000650000102000000060300000809000031004100902750090605080
8.3   6.9718   5.9319   5.1494  069000080000100036004080005020000100087009000900700000010500009500000000008607400
8.3   6.9405   5.7610   3.9591  000069350300004000090070000830050000005006730000807040001000900040000260200000001
8.3   6.5841   3.9464   3.9386  014690000003002070000000080000000890090001020700080003640309000900020000000000004
8.3   6.4006   5.8891   6.0540  000070809090308000400000000706000204000007080000450007150820000240000000000000190
8.3   6.3094   4.7216   4.6902  903070040750004000000000060010067028008000001340001000800002000025080900000050010
8.3   5.6324   3.8261   3.9311  000000000009700620300900740100003900000600034050000100400000060000005090090002480
8.3   5.4944   5.5064   5.2601  000120000000078042000030000000014080001000030600050070020000000800060507590007060
8.3   5.4534   4.7943   4.7989  003000802020080060000010000689000300000900000070000050005000720000460008060500903
8.3   5.3292   5.5653   4.0547  000000300001403095096000701000000000000120003028000100010070060070805900005000000
8.3   4.9251   4.1997   2.2050  080000020030904060540001000900600070000050900000008000007030600050000000300805001
8.3   4.8651   5.1448   4.7327  000020190008000000000900003001030060080600200002500400000000620200307500047080000
8.3   4.2393   4.1020   3.9413  000006013073000250100000000000600004006051900408090030080007000040000002200003100
8.3   3.6975   4.0077   4.0057  000030090004029307070004080100080000000005004608700002300970008000000000200050700
8.3   3.3387   3.5500   3.5120  005002000000350900840000000700030102002780000500040300000000000206000040003506208
8.3   2.2567   2.2438   2.0484  000020000060000312043000700000460500005000000000007020001200040500070080800095001
8.2   4.7999   4.0046   3.9328  603000000890070206000400050507008000100007004004000030006000070002060010000205000
8.2   4.2815   4.5724   4.1252  600900300300000045540007000070100000020009070000500800000205030000090200090080010
7.9   8.1199   6.6688   5.6801  010008409003067080000000030000500003030019000079800000001600000000000001090340720
7.9   6.3739   5.3942   3.4798  407900000000600034300008000000000005020000860006010007081060020000003070003024600
7.8  10.5846   8.8237   7.8856  000000000900002600503807002000730004050009070002080030270001003008000010000090000
7.8   9.4650   8.5370   8.2201  030009760240000000007030000800000510000000007004705806001004050020800100005007004
7.8   9.0785   8.9586   6.3154  000006320700000004053010006001070800000000000080205107004000500030100000500007601
7.8   7.3935   5.3464   6.9107  000000730017000008004010060080001305400300020000008400005604000023000080000200000
7.7   7.4048   7.3532   6.9394  000009002001000400000050083006002000000080090000403005520078009380000070010000020
7.7   4.0375   4.3044   4.1639  000005800100700060030080004007002005080000000205061000000007150700090000000608000
7.7   3.6799   3.3175   2.7860  040200005972300060010069000430000000100000453600000020000000000000006718003082000
7.6   9.9223   9.2386   9.2996  080200059910400008005000030000080005030000090057900004200034000000800920100000000
7.6   8.8676   8.3648   7.4712  030005600000900000009040030201070000050300068000001700004006012100000900300700050
7.6   7.7347   7.3606   6.2321  000609000070300000009020300000000720086003904400007000010050000000006230800004015
7.6   7.6093   6.6468   5.7805  908000000004000930000090704501400000000081600000050320000002000020000007049670050
7.6   6.2249   5.3074   5.5055  000007306009000200560002000000041000000070103000600059080005002007200900100090007
7.6   5.3262   5.2591   4.7397  060800001030000050008057300000200100540000070000003020020000000400000700903010600
7.6   5.2180   4.6053   4.6200  250400100000000060089060000800107200000000710700023400520000009003008002900200040
7.5   6.1646   5.3582   5.4948  860920000010803000200000007000000684003004000000080905000701000000050006041030009
7.4   5.2995   5.1248   4.7316  000000003708043200002000107001500000090000002645000080000605000050037004006000070
7.3  11.4908   4.0064   3.7095  004010050080000000506027040078300000000000920600000000000600000900040018035200000
7.3  11.4899   3.4402   3.1040  000430060060000830000600150080000510005009000003060200007001080010090700000200000
7.3  10.8264  10.6650  10.3996  006000090000610000708000005300090001090100400020004700000260010000080520800000300
7.3  10.4360  10.1240   8.3036  000009067000002300000830450070100030000500000009024600690040500035000080200000000
7.3   9.9495   8.7277   8.3457  000000800058300200007060000000000003300850600000040050010000040500004060203000108
7.3   9.7544   6.5421   6.2402  040000090000000008900208100500000030034000600000015200206050009051060000000702000
7.3   9.7378   8.2409   8.4335  600950010090007300000030000078010003000000004050308000000060900700005068100400000
7.3   7.4254   6.0889   6.2060  000700000715003406060024010050000000103007000097300000500800967080000050006000001
7.3   7.3074   4.5838   4.5867  000000709800054200040900000015400000007001000400002060061005003000103000080690100
7.3   7.1137   6.1599   3.3740  070200000002001005604900700000006070000000301208000600006010030900040006430098000
7.3   7.0463   4.7582   3.9947  002008000600040700000000000104600900090003800000000506000000010070300095040089600
7.3   6.9434   4.9184   4.6362  804000090090680001100000300400052000008000700710000000200400000003070080006003020
7.3   6.9427   5.7519   4.3940  000040100000007940007200008094002000130000070000000801200009000080500230050700000
7.3   6.9247   6.5970   6.2724  000040090005802300000359004900000000010006002006000800070083506030000000200905000
7.3   6.8778   5.5439   5.2597  093010706005007090000600001501080000006200400040000020604000000000009048070000000
7.3   6.8710   5.2601   4.0994  008000040090810000300000790002001607006000200005000000600450020000090060080006004
7.3   6.4909   3.5057   3.4929  079624000040000090005001000086070400400003020000080070608000200000000100000730009
7.3   6.4638   6.0034   5.2137  970005001000910300200007009000060004000000050840000600600040008180000200000300070
7.3   6.4013   5.2446   4.9722  000009000040020600000650010600000980008070106037000000000093850020008000000067004
7.3   6.0989   5.4256   4.7261  004000009085609000000071000430060000000000006090000002042000097700002605000500040
7.3   6.0637   3.6981   3.7003  000903000030026090007000000000000000000009730200000806076300008001800000040007105
7.3   5.9236   4.6662   5.4526  007090050900500060000070100403950007000007200500008000000310004800000020014000003
7.3   5.9085   5.7815   5.3221  000010605007000080240003000002000013080000004500600700020008006001570000800060000
7.3   5.8731   4.8312   2.7469  000003000001000000800000560004050081500400200000032700702000000430600005010800040
7.3   5.7966   5.7019   4.7693  980000020007040000500301700000000400030500008000204007000800500020059000000600340
7.3   5.6446   5.1450   4.4317  095000003600000000403810000980006000204000006010320700000002900000480007000900180
7.3   5.5984   4.7559   5.3408  000800000000000480150004000006300000920400050700001900000005210061020005000080030
7.3   5.5743   5.2805   5.8338  200760001010000900000200570000140000420008000031500000603000010090800305050004600
7.3   5.3589   5.4587   3.4605  000090650010000240040800070000086090100000020000002007260050000004007300301400000
7.3   5.2314   4.2627   3.8197  307000010000006380060800000500000073800005100010000004000061200040090000100070540
7.3   5.1951   4.0289   3.6737  700805000020000008060020710100040000600000800030600200000039004000004080910050002
7.3   5.0230   4.0223   3.6128  207400800003800000001070049000007000000950006005000400010280500069000200000000907
7.3   4.9846   4.1866   4.0861  000500001009000800205008000160000090008001070050000300600004100007090040000037600
7.3   4.9295   3.5696   3.5659  400300000030021008050000000005870030070002000083060000000000609509607200100000005
7.3   4.7884   4.2397   3.8768  070043000500000006000200007004705201035000040190000000008090000600001003300500080
7.3   4.7511   2.8016   2.4222  090000406008000000000004030100000605980000001507000092000200950720931000000080020
7.3   4.6212   4.3494   4.3487  010040000704030020090806000000020590300407000800010000570000200000005700001004600
7.3   4.5564   4.3575   4.4043  000000003206000700050006910900400000084000105670801000890000000000108600007005800
7.3   4.3540   4.3652   3.8068  060007089029010000700006000400700103000040705210000040150008600008000001076000000
7.3   4.2629   3.8854   3.5747  000003060080070005006580100000000000300004010051020309090810002010009800000000700
7.3   3.9390   3.8649   3.7927  000000430050100000004890000002000709000030000009008040007004250030620000045089000
7.3   3.7949   2.7064   2.7794  004000005720000100900000008500086040030000800001500000800150079000007000000008602
7.3   3.7643   3.5196   3.4053  700090000090701020000345900003027400520000680000000000000006240000034500802000000
7.3   3.5679   3.5637   3.4070  000400700000000069000097100400000058902605070000000000000003005078200001050009340
7.3   3.5413   3.1941   2.8791  000007040005006000000120000002801000007004300050000601029000030080070460073600028
7.3   3.4538   3.2692   3.0253  000002004008000021000060700302900008490003100007005000800300000070040905006010080
7.3   3.4412   3.3249   3.3666  020091000000000700000068009409000000005000020078405000093500000000000001080007430
7.3   3.0158   3.1011   3.0679  260000809001500000000000461090006208400000700080002000000900000800004002010700500
7.3   2.3205   2.6028   2.6442  000080090000007001006040000400001000073500200900000043009003050300090700200000008

Best Regards,
Blue.
blue
 
Posts: 979
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Tue Mar 03, 2015 5:05 am


Computational disadvantage of preferring bivalue/bilocal cells in T&E

In this post, I'm not speaking of exactly the same thing as tknottet, eleven or Blue's DFS procedure.
Below, T&E always means the procedure I have defined as T&E (which I’ve always considered as the mere formalisation of what players usually mean by T&E and also of Ruud's original definition in Sudopedia - now disappeared):
- it has no guessing (if a tried candidate leads to a solution, the corresponding path is merely forgotten);
- it tries all the candidates at some depth before going deeper;
- it may try candidates at some depth several times so that the result at this depth is independent of the order they are tried (a necessary condition for the procedure being well defined).

I’ve shown years ago that all the known puzzles can be solved at depth 2 (at most) and Blue has shown that only 1 puzzle in about 30,000,000 requires more than depth 1.

What I compare below is two different strategies for trying all the candidates at some depth: (pseudo) random and bivalue/bilocal first.
Notice that the T&E strategy used has no impact on the resolution state reached at each depth; it can only change the computation times.
As I said above, T&E is very different from ttknottet’s DFS procedure. However, I think the comparisons below about the preference for bivalue/bilocal cells, valid for my T&E definition, should me more or less valid for DFS.

The results below must be taken with some grain of salt:
- they are implementation-dependent (they were obtained with the latest version of SudoRules, based on my general pattern-based CSP-Rules solver)
- they rely on small samples
- my T&E procedure is generic (it works for any CSP) but it is very far from being optimised for Sudoku
- all the timings are relative to the first cell of each table, arbitrarily set to 1
- in the first two samples, many puzzles can be solved without T&E
- the cb collection (controlled-bias) is the closest we can get to a random unbiased one (I didn’t consider worth doing the corrections for the known bias); all the puzzles are in T&E(1)
- the magictour-top234 collection was once considered as the hardest, but that was long ago; it remains among the hardest puzzles solvable by human brains; no puzzle in it can be solved without T&E (i.e. with only the rules in BRT or W1) but almost all are in T&E(1) and all are in gT&E(1)
- the eleven’s-hardest collection is version V2_26370.tx of eleven’s collection of puzzles not in T&E(1, S4) i.e. not solvable by S4-braids; I consider only the first 100, i.e. the hardest; no puzzle is in T&E(1)

For each collection, I used only T&E(T, 1), i.e. no T&E level deeper than 1. It implies that a few puzzles may be only partly solved, but it doesn’t matter for the comparison, as the 2 strategies lead to the same final resolution states (T fixed).
The only exception is eleven’s collection, for which I also tried T&E(T, 2). In this case, all the puzzles are solved.

Notations:
BRT = universal Basic Resolution Theory (= Singles + ECP)
W1 = BRT + whips[1] (= claiming and pointing)
bivalue-first: try first any candidate in a bivalue/bilocal cell
The format for each cell is: mean time - max time (each value is relative to first cell)

Code: Select all
Rules in T                     BRT                    BRT                            
T&E strategy                   random                 bivalue-first                
Royle17-1-100                  1 - 1                  0.96 - 0.91                   
cb-1-100                       1.06 - 1.68            1.02 - 1.64                                      
magictour-top234               6.26 - 13              6.61 - 15              
eleven’s-hardest-0-100 (1)     9.07 - 7.18            9.20 - 7.27                 
eleven’s-hardest-0-100 (2)     73.2 - 144.9           76.7 - 146.3                 

Comment: there is no significant difference between the two strategies for T&E


Now we add whips[1] to T. The final states are still the same for the 2 strategies, but possibly different from above (closer to solution)

Code: Select all
Rules in T                     W1                    W1
T&E strategy                   random                bivalue-first
Royle17-1-100                  1 - 1                 0.95 - 0.6
cb-1-100                       1.11 - 1.33           1.07 - 0.94                   
magictour-top234               10.8 - 19.2           6.26 - 8.9
eleven’s-hardest-0-100 (1)     55 - 34.3             8.4 - 4.2
eleven’s-hardest-0-100 (2)     185.8 - 319.4         591 - 708

Comments:
I was surprised by the results for gT&E = T&E(W1). Preference for the bivalue/bilocal candidates seems to give faster calculations at depth 1 for the hard puzzles (no full solution found at this depth), but slower at depth 2. (I checked the results 3 times.)
For puzzles that can be solved by T&E (almost all the puzzles), using gT&E instead leads to a slower procedure.
For puzzles needing T&E(2), the bivalue strategy is much slower for gT&E(2). It is faster for gT&E(1) (producing only partial results).

Globally, I conclude that the bivalue/bilocal strategy is rarely good, confirming the impression I had when I watched hundreds of resolution paths for hard puzzles: long streaks of whips with no Single in between (in SudoRules, an elimination for a bivalue would be immediately followed by a Single).
I didn't try using only bivalue candidates instead of bivalue/bilocal.
denis_berthier
2010 Supporter
 
Posts: 3972
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby dobrichev » Tue Mar 03, 2015 9:23 am

blue wrote:...If you have another list that you'ld like me to rate, I'ld be happy to do it.

Hi Blue,
Are there rating anomalies in these puzzles?
Code: Select all
... ... ..1
... ..2 .3.
..4 .5. 6..

... .6. 57.
..5 ..8 4.6
.9. .47 .8.

..3 1.. ...
51. ... ...
64. .3. ... 9.6/1.2/1.2
This is from here.
Code: Select all
000400700050100000080030100000360090091000840000000000700090060004020000005800030   22   4 7.1/2.3/2.3
000400700050000000080030100000360090091000840000008000700590460030620000065800030   26   4 7.1/1.2/1.2
12.4..7...5.1......8..3.1.....36..9..91...84....9.8...71.59346..3462.....658...3. # 98896 FNBTHG C33/M4.2047.259
They are from here.
dobrichev
2016 Supporter
 
Posts: 1850
Joined: 24 May 2010

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Tue Mar 03, 2015 10:34 am

dobrichev wrote:Are there rating anomalies in these puzzles?

Hi Mladen,

In the first one, yes, but it's removable with the "no cells in locked sets in a mini-line" filter.
The last one shows something new -- a higher rating with locked candidates than with singles only !

Code: Select all
singles | +locked |+subsets | subsets
        |  cands. |         | +filter
--------+---------+---------+--------
95.7187 | 80.7856 | 93.1054 | 78.3496   min = 5,5,5,5
20.3089 | 18.5035 | 13.5272 | 12.3411   min = 4,4,4,4
 9.7053 |  8.5848 |  8.1214 |  7.8492   min = 4,4,4,4
 6.7648 |  6.9055 |  6.8470 |  6.6156   min = 4,4,4,4


--------------

P.S: I hadn't though to apply the filter to cases where the solver wasn't using subsets, since the solver wouldn't be turning hidden pairs (& triples) into naked pairs (& triples) just before each guess.

I've done that now, and the results are below.
The numbers in brackets are with the filter applied.
The last one seems OK now, but the first one shows an anomoly again !
Interesting.

Code: Select all
     singles     |  +locked cands.  |    +subsets
-----------------+------------------+-----------------
95.7187[80.0832] | 80.7856[66.5267] | 93.1054[78.3496]
20.3089[18.5035] | 18.5035[17.1625] | 13.5272[12.3411]
 9.7053[ 8.5848] |  8.5848[ 8.2976] |  8.1214[ 7.8492]
 6.7648[ 6.9055] |  6.9055[ 6.6464] |  6.8470[ 6.6156]
blue
 
Posts: 979
Joined: 11 March 2013

PreviousNext

Return to General