Puzzles Usage in a Presentation of a New Rating of Puzzles


Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Fri Feb 27, 2015 4:00 pm

blue wrote:I'm thinking now that, instead, when I added subsets into the mix, it caused a lot of naked pairs to appear in the intermediate ("T-quiescent") states. I think it's usually a bad idea to choose a "guess" cell in a naked pair when other bi-value cells are available.

This seems to be a good explanation.

As I said, I would prefer to test bilocation candidates instead of bivalue candidates. Now I should add: bilocation candidates which are not pairs (I guess that one always exists).
The more candidates the 2 cells have, the better.
E.g. in 17Hints2919 I tried the strong link for 7 in r1 and it quickly finished.
eleven
 
Posts: 3151
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Fri Feb 27, 2015 8:57 pm

eleven wrote:As I said, I would prefer to test bilocation candidates instead of bivalue candidates. Now I should add: bilocation candidates which are not pairs (I guess that one always exists).
The more candidates the 2 cells have, the better.
E.g. in 17Hints2919 I tried the strong link for 7 in r1 and it quickly finished.

I've been curious about how that would work too.
Bi-local candidates that are part of a played-out X-Wing are probably not good either.

When I finish up with the other things I'm doing, I'll set things up to use either bi-local candidates only, or bi-local candidates and bi-value cells, with tri-local/tri-value things only when necessary. [ I won't try filtering out naked pairs or X-Wings. At least not yet, anyway. ]
blue
 
Posts: 1045
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby eleven » Fri Feb 27, 2015 9:49 pm

blue wrote:Bi-local candidates that are part of a played-out X-Wing are probably not good either.

Funny, I just wanted to add this.
Awaiting your results patiently.
eleven
 
Posts: 3151
Joined: 10 February 2008

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Sat Feb 28, 2015 2:09 am

tknottet wrote:Would you kindly try completely random T&E?
(...)

tknottet wrote:blue
blue wrote:I'll see how it goes, but I'm sure I'll run into problems trying the highest rated puzzles.

Thank you very much. I am more interested in the highest rated puzzles.
I was only anxious about the processing time when I wrote "even for relatively not-hard puzzle".

tknottet wrote:I had forgotten an important thing.
To shorten calculation time, my rating code works as follows.
In case
Pxy is created by y-assumption after x-assumption
(from P1, P1(x) is created; P1=P1(x) for recursion; from P1, P1(y) is created), and
Pyx is created by x-assumption after y-assumption,
my rating assumes that R(Pxy)=R(Pyx). If R(Pxy) is already calculated, R(Pyx) is not calculated; R(Pxy) is used in place of R(Pyx).
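
For reference, a minimal sketch of the kind of symmetric caching described in that quote, as I read it (the names here are hypothetical, not the actual rating code): the cache key is the unordered pair of assumptions, so R(Pyx) is simply looked up where R(Pxy) was stored.

Code: Select all
# Hypothetical sketch of caching R(Pxy) = R(Pyx) (not tknottet's actual code).
# The cache key ignores the order of the two assumptions, so the rating of
# Pyx is never recomputed once R(Pxy) is known.
rating_cache = {}

def rating_after_two_assumptions(P, x, y, rate):
    """P: a hashable puzzle state; x, y: hashable assumptions;
    rate(P, x, y): the expensive rating computation (assumed given)."""
    key = (P, frozenset((x, y)))
    if key not in rating_cache:
        rating_cache[key] = rate(P, x, y)
    return rating_cache[key]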

blue wrote:I can try producing actual resolution paths too, and see how that works -- like the "1,000,000" paths I was using in my "bug hunt" (only fewer !).

I used something similar to that last idea of yours, and for the high-rated puzzles I ran out of RAM long before running out of patience.
Without that, the times would have been too long for me, even before switching to "completely random T&E".

I did get exact ratings for all but one of the puzzles that were showing the largest increase in ratings as more solving techniques were added.
For the rest, I used the "Monte Carlo" method to estimate them -- with 100,000 (random) resolution paths per puzzle.
The results are below, with the +/- error estimates ("95% confidence intervals") in [ ] brackets.
The entries with "[ ---- ]" for the error estimates are exact.
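
For reference, a minimal sketch of how such an estimate and its 95% half-width can be computed from random resolution-path ratings (a sketch only, not the code actually used; sample_path_rating() is a hypothetical stand-in for rating one random resolution path):

Code: Select all
# Sketch: Monte Carlo mean and ~95% confidence half-width from n random
# resolution-path ratings.  sample_path_rating() is a hypothetical stand-in
# for rating one random resolution path of the puzzle.
import math

def monte_carlo_rating(sample_path_rating, n=100000):
    samples = [sample_path_rating() for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)   # sample variance
    half_width = 1.96 * math.sqrt(var / n)                  # ~95% CI for the mean
    return mean, half_width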

Your conjecture was right ... the ratings do decrease now, as more techniques are added.
They're also much larger than before ... which was to be expected.

[ A warning: The table has 48*4 = 192 entries, with "95% confidence" intervals listed.
That really does mean that 5%, or around 10, of the values are expected to be off by more than the estimated amount.
(By Murphy's Law, then, it will be more like 15 or 20 :D) ]

Code: Select all
                s               lcs             lslcs           flslcs  | puzzle
------------------------------------------------------------------------+------------------------------
2837.6411[11.0697] 1732.7576[6.9854] 1257.5863[5.1746] 221.4512[4.9982] | Suexrat9-10364
1045.8720[ 4.1418]  657.3957[2.6455]  422.0471[1.7565] 411.8501[1.7086] | Fata_Morgana
 350.9959[ 1.4117]  279.4144[1.1337]  205.3841[0.8545] 193.9415[0.8114] | Easter Monster
------------------------------------------------------------------------+------------------------------
 506.4876[ 2.0044]  371.4171[1.4854]  254.8834[1.0514] 249.2565[1.0212] | GoldenNugget
 551.8360[ 2.2459]  401.1752[1.6540]  272.3358[1.1533] 265.0773[1.1193] | Kolk
 395.5535[ 1.5409]  252.1849[0.9893]  152.4844[0.6212] 143.8617[0.5784] | Patience
 471.2459[ 1.9059]  330.2747[1.3639]  241.2798[1.0099] 234.7850[0.9793] | Imam_bayildi
 457.0100[ 1.8400]  342.7118[1.3971]  254.5702[1.0657] 244.0285[1.0119] | Second_flush
 395.0741[ 1.6137]  302.9497[1.2502]  221.0666[0.9211] 213.5680[0.8898] | champagne_dry
 390.1755[ 1.5470]  299.4366[1.1979]  216.3965[0.8769] 209.6792[0.8398] | eleven212
 495.4484[ 1.8840]  328.3231[1.2643]  175.4886[0.7181] 170.9934[0.6932] | Discrepancy
 336.5130[ 1.3306]  254.9260[1.0206]  180.7444[0.7380] 175.7364[0.7127] | Red_Dwarf
 411.2148[ 1.6271]  224.2419[0.9205]  124.2704[0.5281] 118.7662[0.5055] | AI_WorldHardest:Everest(2012)
 107.4602[ 0.4332]   84.6703[0.3465]   66.3755[0.2745]  62.7662[0.2558] | AI_WorldHardest2006Escargot
  56.5633[ 0.2263]   49.2721[0.1960]   40.0939[0.1555]  38.7018[0.1490] | AI_WorldHardest2010
 221.8977[ 1.0418]   73.5807[0.3977]   23.5380[0.1513]  23.4970[0.1505] | 17Hints35410
 310.0595[ 1.4770]  102.8108[0.5646]   27.1209[0.1795]  26.7314[0.1771] | 17Hints35409
 204.7902[ 1.4726]   97.5384[0.7493]   67.0491[0.5355]  66.1967[0.5279] | 17Hints2919
  37.3773[ 0.1880]   26.9272[0.1307]   25.9820[0.1244]  25.1625[0.1174] | 17Hints41826
 230.1841[ 1.2807]   61.0685[ ---- ]   49.6120[ ---- ]  48.9874[ ---- ] | 17Hints48126
 182.6256[ 1.0085]  116.7759[0.6505]   49.7271[0.3177]  49.0949[0.3156] | JapaneseSuperComputer322
 176.6585[ 0.7989]   54.2425[0.2830]   14.7182[0.0897]  14.6382[0.0886] | 17Hints35043
  19.9218[ 0.0857]    9.7945[0.0428]    9.3973[0.0402]   9.3930[0.0401] | 17Hints41164
 176.4659[ 0.8091]   59.0370[0.3137]   16.1876[0.1007]  16.0704[0.0993] | 17Hints35045
  89.1453[ 0.4328]   37.7059[0.2039]   13.4603[0.0869]  13.0512[0.0828] | 17Hints24147
  90.6854[ 0.4408]   38.4847[0.2040]   14.4302[0.0905]  14.0332[0.0868] | 17Hints35044
  99.3991[ 0.4895]   39.8821[0.2167]   15.9305[0.1067]  15.4507[0.1005] | 17Hints35042
  13.4591[ ----- ]    9.1834[ ---- ]    9.0880[ ---- ]   9.0573[ ---- ] | 17Hints28653
  69.0613[ 0.3293]   24.8517[0.1314]    9.5955[0.0533]   9.5191[0.0528] | 17Hints24153
  21.2028[ 0.1217]    6.0441[0.0258]    6.0254[0.0256]   5.8568[0.0246] | 17Hints18573
   9.9436[ 0.0408]    6.2056[0.0291]    5.4081[0.0254]   4.7402[0.0213] | 17Hints42464
  26.3757[ ----- ]   18.2355[ ---- ]   17.6096[ ---- ]  17.6034[ ---- ] | 17Hints41972
  60.2300[ 0.3178]   37.5729[0.1932]    6.7376[0.0308]   6.7693[0.0309] | 17Hints32733
  34.7399[ ----- ]   19.8533[ ---- ]   18.4648[ ---- ]  18.4279[ ---- ] | 17Hints33052
  37.0379[ ----- ]   21.7452[ ---- ]   20.7196[ ---- ]  20.6797[ ---- ] | 17Hints4934
  29.4988[ 0.1478]   25.0467[0.1250]   18.4737[0.0903]  18.4435[0.0898] | JapaneseSuperComputer313
  82.4777[ 0.4134]   29.5126[0.1628]    4.6890[0.0243]   4.6606[0.0240] | 17Hints41750
   5.5031[ 0.0202]    5.3001[0.0192]    5.2866[0.0192]   5.1425[0.0188] | 17Hints12538
   9.5918[ 0.0397]    5.7501[0.0249]    5.7398[0.0249]   5.6575[0.0237] | 17Hints12995
  17.4634[ 0.0884]    9.9103[0.0508]    9.5671[0.0488]   9.5296[0.0485] | 17Hints21752
   7.1424[ 0.0292]    6.9345[0.0281]    6.8951[0.0279]   6.5610[0.0259] | 17Hints26041
   9.3131[ 0.0421]    6.3814[0.0282]    6.1177[0.0273]   6.1000[0.0270] | 17Hints39923
  14.8652[ 0.0734]    3.8479[0.0106]    3.8215[0.0104]   3.8215[0.0104] | 17Hints40086
 583.6611[ 3.8938]  196.6350[1.5525]   62.3326[0.5619]  59.7163[0.5383] | 17Hints44580
   7.5965[ 0.0299]    5.7090[0.0240]    5.6149[0.0240]   5.6052[0.0239] | 17Hints47137
   7.8469[ 0.0271]    5.8220[0.0251]    5.7412[0.0247]   5.7303[0.0244] | 17Hints48498
   4.9252[ 0.0194]    4.8064[0.0188]    4.6547[0.0178]   4.6437[0.0178] | 17Hints8780
   5.5584[ 0.0267]    5.2700[0.0250]    5.2962[0.0250]   5.2413[0.0246] | 17Hints9404
blue
 
Posts: 1045
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby tknottet » Sat Feb 28, 2015 2:59 am

denis_berthier
I'm sorry, I made a mistake again.
tknottet wrote:Then the next problem.
denis_berthier wrote:do you have a rule that can be applied in some case to produce an elimination of some candidate x, and such that no rule in your set could still eliminate x if another candidate y had been eliminated (or asserted) before?

I'm afraid that each of r10-r15 has this bad property.
Because if y breaks a link-chain or vector-chain, there is no guarantee of being able to eliminate x.
In fact, at the above-mentioned level, T finds a contradiction for Pc, but doesn't find one for Pa.
That causes a difference between the two ratings.

I cannot think that rule15 has this bad property.
(Because "r15" reads as row notation, I will use rule1 to rule15 from now on.)
(Do you have a name for "this bad property"?)
Because
(A) If a candidate in the on-on-vector chains is eliminated, the focused candidate (the root of the on-on-vector chains) is eliminated by tracing the on-on-vector chain in reverse.
(B) If a candidate in the on-on-vector chains is asserted, the end candidate is asserted, then the other end candidate is eliminated, then the focused candidate is eliminated.
I'll continue the investigation.

blue and eleven, your discussion is interesting.
eleven wrote:The more candidates the 2 cells have, the better.
E.g. in 17Hints2919 I tried the strong link for 7 in r1 and it quickly finished.

I uploaded an illustration of the first P1 of 17Hints2919 with some strong links.
http://www.tknottet.sakura.ne.jp/Test/P1of17Hints2919.jpg
I found a very large strong-link network including some candidates in r1.
Do you mean choosing a "guess" cell from the candidates (meaning an adjacent pair for each network) in the maximum strong-link network(s)?
It would help me to decrease the computing time for rating.
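
A rough sketch of one way to find such a maximum strong-link network automatically (an illustration only, not the rating code; it considers only bilocation links inside the given units, on a grid represented as a dict from cell to candidate set):

Code: Select all
# Illustration only: nodes are (cell, digit) candidates, edges are bilocation
# ("strong") links inside a unit; union-find groups them into networks and the
# largest network is returned.
from collections import defaultdict

def largest_strong_link_network(grid, units):
    """grid: dict cell -> set of candidate digits; units: list of 9-cell lists."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for unit in units:
        for d in range(1, 10):
            places = [c for c in unit if d in grid[c]]
            if len(places) == 2:            # digit d has a strong link in this unit
                union((places[0], d), (places[1], d))

    networks = defaultdict(list)
    for node in list(parent):
        networks[find(node)].append(node)
    return max(networks.values(), key=len) if networks else []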

blue, thank you very much for your report.
It pleased me very much.
blue wrote:
tknottet wrote:Would you kindly try completely random T&E?
(...)

Your conjecture was right ... the ratings do decrease now, as more techniques are added.
They're also much larger than before ... which was to be expected.
tknottet
 
Posts: 24
Joined: 15 February 2015

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Sat Feb 28, 2015 4:52 am

blue wrote:I think it's usually a bad idea to choose a "guess" cell in a naked pair when other bi-value cells are available.

eleven wrote:As I said, I would prefer to test bilocation candidates instead of bivalue candidates. Now I should add: bilocation candidates which are not pairs (I guess that one always exists).
The more candidates the 2 cells have, the better.

When you observe the resolution paths of hard puzzles, you can see that Singles are interspersed among long streaks of whip/braid eliminations. It means that, in such puzzles, T&E eliminations rarely occur in bivalue/bilocal cells (where they would be followed by a Single), and the strategy of assigning higher priority to testing candidates in such cells is a bad one.
However, I have no data that could lead me to say that it is still worse to choose them from (Naked, Hidden or SuperHidden) Pairs. I can imagine that an elimination in a Pair doesn't lead very far in general, because a Pair is a very local pattern. But do you have any stronger evidence?

All this leaves open the question of the best strategy for T&E. Until now, I've always used a purely random choice among all the remaining candidates. It would be interesting if your analyses led to a better choice.
eleven, I don't understand your last sentence above: are you suggesting that the opposite strategy (choosing from cells with the largest number of candidates) is better (when dealing with bilocal candidates)? Also, do you have any reason to make a difference between bivalue and bilocal candidates?
denis_berthier
2010 Supporter
 
Posts: 4213
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Sat Feb 28, 2015 5:23 am

blue wrote:the ratings do decrease now, as more techniques are added.

(...)

The ratings do decrease when rules are added, BUT not by orders of magnitude, except in rare cases. They probably don't decrease enough to counterbalance (in the mean) the increase in computation times. Do you have any data about it?

blue wrote:They're also much larger than before ... which was to be expected.

It's a fact, alright. But, as the number of candidates tried at each T&E level (for a fixed T) doesn't count in the rating, I don't see how the "much larger" could be expected. [The only difference is that you are now trying all the candidates instead of only those in min-size cells?]

What's also remarkable in your table is the very narrow confidence intervals for all the puzzles (I understand the numbers in square brackets as the half-lengths). It makes it all the more interesting to have the min and max values.
denis_berthier
2010 Supporter
 
Posts: 4213
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby tknottet » Sat Feb 28, 2015 7:23 am

tknottet wrote:blue and eleven, your discussion is interesting.
eleven wrote:The more candidates the 2 cells have, the better.
E.g. in 17Hints2919 I tried the strong link for 7 in r1 and it quickly finished.

I uploaded an illustration of the first P1 of 17Hints2919 with some strong links.
http://www.tknottet.sakura.ne.jp/Test/P1of17Hints2919.jpg
I found a very large strong-link network including some candidates in r1.
Do you mean choosing a "guess" cell from the candidates (meaning an adjacent pair for each network) in the maximum strong-link network(s)?
It would help me to decrease the computing time for rating.

I misunderstood.
This is not what eleven said, but rather my new proposal.
What do you think about it?
tknottet
 
Posts: 24
Joined: 15 February 2015

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby JasonLion » Sat Feb 28, 2015 1:43 pm

denis_berthier wrote:All this leaves open the question of the best strategy for T&E. Until now, I've always used a purely random choice among all the remaining candidates. It would be interesting if your analyses led to a better choice.
eleven, I don't understand your last sentence above: are you suggesting that the opposite strategy (choosing from cells with the largest number of candidates) is better (when dealing with bilocal candidates)? Also, do you have any reason to make a difference between bivalue and bilocal candidates?

The "best" choice for T&E depends on your criteria for "best". In brute force solvers the important criteria is ruling out as much of the search space as possible with each guess. Choosing a candidate in a bivalve cell gives you good odds of eliminating half of the search with only a little work. Choosing a bilocation candidate with many other candidates in the same cells likewise gives you good odds for eliminating several candidates at once with little work, which also narrows the search space more quickly than a truly random candidate choice.

The fastest brute-force solvers all currently use bivalue cells. Bivalue cells are easy to spot and narrow the search space quickly. Bilocation appears to have a potential speed advantage due to the possibility of more rapid narrowing of the search space. However, no one has published a solver with a quick enough way to locate an appropriate pair of bilocation candidates. For the moment, the extra search overhead appears to outweigh the potential search-space-narrowing advantage. However, that might just be a historical artifact. It isn't clear if bilocation has had enough investigation to reveal its true potential.
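
To make the comparison concrete, here is a rough sketch of the two selection heuristics (an illustration only, on a naive dict-of-candidate-sets grid; real brute-force solvers work on bitboards and look nothing like this):

Code: Select all
# Illustration of the two guess-selection heuristics on a naive representation:
# grid is a dict (row, col) -> set of candidate digits, units is a list of
# 9-cell lists (rows, columns, boxes).

def pick_bivalue_cell(grid):
    # A cell with exactly two candidates: each branch has a 1/2 chance of
    # being true, so a refuted guess removes half of the search below it.
    twos = [c for c in grid if len(grid[c]) == 2]
    return twos[0] if twos else None

def pick_bilocation_pair(grid, units):
    # A digit restricted to two cells of a unit, preferring the pair whose
    # cells carry the most other candidates, so that a refuted branch also
    # removes several candidates at once.
    best, best_score = None, -1
    for unit in units:
        for d in range(1, 10):
            places = [c for c in unit if d in grid[c]]
            if len(places) == 2:
                score = len(grid[places[0]]) + len(grid[places[1]])
                if score > best_score:
                    best, best_score = (d, places[0], places[1]), score
    return best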
JasonLion
2017 Supporter
 
Posts: 642
Joined: 25 October 2007
Location: Silver Spring, MD, USA

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Sat Feb 28, 2015 3:20 pm

JasonLion wrote:The fastest brute-force solvers all currently use bivalue cells.
[...]
It isn't clear if bilocation has had enough investigation to reveal its true potential.

Has there been any comparison with solvers that use all the candidates, with no preference?
denis_berthier
2010 Supporter
 
Posts: 4213
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby JasonLion » Sat Feb 28, 2015 3:41 pm

Yes, we tried that. Picking a candidate at (semi-)random is much worse in terms of total time/steps to solve. This makes intuitive sense. A randomly selected candidate has roughly a 1/3 chance of being true (actually I believe it is a little less than that). If you eliminate it as a possibility, you have only eliminated 1/3 of the search space from consideration. In a bivalue cell each candidate has a 1/2 chance of being true, so an eliminated candidate cuts off 1/2 of the search space. I never tried to prove it, but the empirical results were very clear.
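
As a back-of-the-envelope illustration of that arithmetic (a rough model only: it assumes each candidate of the chosen cell is equally likely to be the true one, and it ignores the follow-on eliminations that a refutation triggers):

Code: Select all
# Expected fraction of a cell's branches left after testing one random
# candidate of an n-candidate cell: the test refutes it with probability
# (n-1)/n, and a refutation removes 1/n of the branches below the cell.
for n in (2, 3, 4):
    p_refuted = (n - 1) / n
    expected_left = 1 - p_refuted * (1 / n)
    print(n, expected_left)      # n=2 -> 3/4, n=3 -> 7/9, n=4 -> 13/16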
JasonLion
2017 Supporter
 
Posts: 642
Joined: 25 October 2007
Location: Silver Spring, MD, USA

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby denis_berthier » Sat Feb 28, 2015 3:58 pm

JasonLion wrote:Yes, we tried that. Picking a candidate at (semi-)random is much worse in terms of total time/steps to solve. This makes intuitive sense. A randomly selected candidate has roughly a 1/3 chance of being true (actually I believe it is a little less than that). If you eliminate it as a possibility, you have only eliminated 1/3 of the search space from consideration. In a bivalue cell each candidate has a 1/2 chance of being true, so an eliminated candidate cuts off 1/2 of the search space. I never tried to prove it, but the empirical results were very clear.

OK for the empirical results (considering that they are highly implementation-dependent, e.g. wrt the bivalue/bilocal difference). Has there been any study of how the empirical results depend on the SER of the puzzles?

As for intuition, it is not so obvious:
- bivalue/bilocal cell: success of T&E (i.e. candidate elimination) = 1/2, but with subsequent assertions and eliminations by Singles and ECP
- n-candidate cell: success of T&E (i.e. candidate elimination) = (n-1)/n
denis_berthier
2010 Supporter
 
Posts: 4213
Joined: 19 June 2007
Location: Paris

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby JasonLion » Sat Feb 28, 2015 5:32 pm

I don't have any records of the breakdowns by puzzle type for that particular optimization. I remember thinking of it as a big improvement across the board, but that is both an old memory and fairly vague.
JasonLion
2017 Supporter
 
Posts: 642
Joined: 25 October 2007
Location: Silver Spring, MD, USA

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby blue » Sun Mar 01, 2015 2:08 am

denis_berthier wrote:The ratings do decrease when rules are added, BUT not by orders of magnitude, except in rare cases. They probably don't decrease enough to counterbalance (in the mean) the increase in computation times. Do you have any data about it?

That's the way it is for my code -- times doubled with locked candidates and subsets vs. singles only.

denis_berthier wrote:
blue wrote:They're also much larger than before ... which was to be expected.

It's a fact, alright. But, as the number of candidates tried at each T&E level (for a fixed T) doesn't count in the rating, I don't see how the "much larger" could be expected. [The only difference is that you are now trying all the candidates instead of only those in min-size cells?]

For the method that tknottet asked me to use, whenever a guess was required, the code was to choose a random cell and a random candidate order for the cell, and try each candidate until one of them (if ever) produced a solution. That would mean that if the initial puzzle state is inconsistent, then all candidates in the cell are tried, and if the initial state is solvable, then (on average) half of them are tried. There's also a compounding effect if, during that process, additional levels of guessing are required.
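
In outline, that guessing scheme looks like the sketch below (a sketch only: T is reduced here to naked-single propagation, whereas the actual T and code were different, and the givens are assumed to have already been placed with assign() beforehand).

Code: Select all
# Sketch of "completely random T&E": random cell, random candidate order,
# recursing until a solution is found.  A grid is a dict
# (row, col) -> set of candidate digits.
import random

def peers(r, c):
    return ({(r, cc) for cc in range(9)} | {(rr, c) for rr in range(9)} |
            {(3*(r//3) + i, 3*(c//3) + j) for i in range(3) for j in range(3)}) - {(r, c)}

def assign(grid, cell, v):
    """Return a copy of grid with v placed in cell and naked singles
    propagated, or None on contradiction."""
    grid = {k: set(s) for k, s in grid.items()}
    grid[cell] = {v}
    queue = [cell]
    while queue:
        cur = queue.pop()
        (val,) = grid[cur]
        for p in peers(*cur):
            if val in grid[p]:
                grid[p].discard(val)
                if not grid[p]:
                    return None                 # contradiction
                if len(grid[p]) == 1:
                    queue.append(p)
    return grid

def random_te_solve(grid):
    unsolved = [c for c in grid if len(grid[c]) > 1]
    if not unsolved:
        return grid                             # solved
    cell = random.choice(unsolved)              # random cell...
    values = list(grid[cell])
    random.shuffle(values)                      # ...random candidate order
    for v in values:                            # all tried if this state is inconsistent,
        nxt = assign(grid, cell, v)             # about half on average if it is solvable
        if nxt is not None:
            solved = random_te_solve(nxt)
            if solved is not None:
                return solved
    return None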

I'm still planning the "min/max" and "mean/std. dev." thing, and some tests with bi-local candidates. I'm having a little trouble with the math for the standard deviations, though, and I've got some other obligations to deal with (including sleep) for the next 16 hours or so :(

Cheers.
blue
 
Posts: 1045
Joined: 11 March 2013

Re: Puzzles Usage in a Presentation of a New Rating of Puzzl

Postby dobrichev » Sun Mar 01, 2015 10:35 am

JasonLion wrote:However, no one has published a solver with a quick enough way to locate an appropriate pair of bilocation candidates. For the moment, the extra search overhead appears to outweigh the potential search-space-narrowing advantage. However, that might just be a historical artifact. It isn't clear if bilocation has had enough investigation to reveal its true potential.

Hi Jason,
You might experiment with fsss2; look at fsss2::findBiPositionDigit(int& digit, int& cell) at the bottom of the fsss2.cpp file.
I learned from Brian Turner's solver that choosing the topmost bivalue cell for T&E works measurably (~5-15%) slower than choosing a pseudo-random bivalue cell.

My personal preference for a universal and most natural rating system is a deterministic estimation of the T&E work after singles - something close to suexrat and to the method proposed by tknottet, but with T limited to singles. That is based on my assumption that all solving methods beyond singles are more or less hidden forms of T&E.
dobrichev
2016 Supporter
 
Posts: 1863
Joined: 24 May 2010
