Team project: C or C++ Explainer-like rating program

Programs which generate, solve, and analyze Sudoku puzzles

Re: Team project: C or C++ Explainer-like rating program

Postby daj95376 » Tue Nov 30, 2010 6:03 pm

champagne wrote:I hope to stay with an improvement in performance somewhere between 1 to 10 and 1 to 100. The key point remains to see if ratings will match.

How about a specific performance rating (against SE) on puzzles with a 6.5 rating or lower?

Does your use of "parallel" mean the same thing as performing numerous equally-rated steps concurrently?
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Re: Team project: C or C++ Explainer-like rating program

Postby champagne » Tue Nov 30, 2010 7:03 pm

daj95376 wrote:
champagne wrote:I hope to stay with an improvement in performance somewhere between 1 to 10 and 1 to 100. The key point remains to see if ratings will match.

How about a specific performance rating (against SE) on puzzles with a 6.5 rating or lower?

Does your use of "parallel" mean the same thing as performing numerous equally-rated steps concurrently?


I did it and as far as i can remember, it was about one second to one minute, but I will rerun the benchmark to-morrow and I'll come back with fresh data carefully checked.

I intend to make a second benchmark in the area 6.6 7.0 as soon as I have a reasonable draft.

Now, what means "parallel".

Applying the tagging process, you expand the weak links in all directions At the end, you find the tags included in "nice loops" or the invalid tags.

If i say it differently in the situation of SE, this equivalent to looking in once (in parallel) for any candidate.

Expansion of weak links is very fast. (in fact, a - b is transformed in a->~b and b->~a ( using the taggging rules a -> B and b ->A) )

It means also that you get in once chains of any legnth.

Reversely, for SE, I have to check separatly the different kinds of chains X cycle, Y cycle, X chain ..... to find the lowest rating.

As I wrote above, I intend to skip that as soon as I get a rating below or equal to the current highest ER .

champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: Team project: C or C++ Explainer-like rating program

Postby PIsaacson » Wed Dec 01, 2010 2:36 am

Daj,

Here's a quick benchmark using Champagne's sort list of puzzles from the Patterns Game posted results. There were 4665 puzzles with ratings of 6.5 and lower from his collection. My prototype code took less than 3 seconds and clocked over 1600 puzzles/second on an Intel core i7 860 @ 2.8 GHz. Using LK's ver 5 modified SE code and the x64 Sun Java VM, serate took 97.5 seconds and clocked about 48 puzzles/second so the C++ code is over 30x faster for these level puzzles using 64 bit code. Using the 32 bit JVM takes 3 minutes and 16 seconds which is 2x slower than than x64 Java and results in only 24 puzzles/second.
Code: Select all
$ r 691
sudoku -b -f '%g %n %.2r %.2d %.2p %.2e' < pat65.in > c:/temp/log1

Alpha (x86_64-w64-mingw32-g++) Sudoku Explainer C++ engine ver 1.0
C:\msys\home\Paul\se_cpp\sudoku.exe -b -f %g %n %.2r %.2d %.2p %.2e
4665 out of 4665 solved leaving 0 unsolved using 31 queues size 34x244 depth 4/32 avg givens 23.60 avg scores 4.10
total cpu time =        2882.8889 milliseconds
  solving rate =        1618.1685 puzzles/second
                           0.6180 msecs/puzzle

          full_house updates/calls 92984/308774        30.11% eff      46.0991 msec tot       0.0001 msec/call       0.0005 msec/update
   hidden_single_box updates/calls 142437/215790       66.01% eff      54.1547 msec tot       0.0003 msec/call       0.0004 msec/update
       hidden_single updates/calls 20486/73353         27.93% eff      20.1755 msec tot       0.0003 msec/call       0.0010 msec/update
     direct_pointing updates/calls 683/52867            1.29% eff      70.9229 msec tot       0.0013 msec/call       0.1038 msec/update
     direct_claiming updates/calls 0/52184              0.00% eff      82.7685 msec tot       0.0016 msec/call       1.#INF msec/update
  direct_hidden_pair updates/calls 7768/52184          14.89% eff     301.9797 msec tot       0.0058 msec/call       0.0389 msec/update
        naked_single updates/calls 2655/44557           5.96% eff      20.2980 msec tot       0.0005 msec/call       0.0076 msec/update
direct_hidden_triple updates/calls 752/41902            1.79% eff     248.4021 msec tot       0.0059 msec/call       0.3303 msec/update
     locked_cands_t1 updates/calls 18693/41155         45.42% eff      29.9537 msec tot       0.0007 msec/call       0.0016 msec/update
     locked_cands_t2 updates/calls 7367/22462          32.80% eff      29.3681 msec tot       0.0013 msec/call       0.0040 msec/update
          naked_pair updates/calls 12910/15095         85.53% eff      43.9078 msec tot       0.0029 msec/call       0.0034 msec/update
              x_wing updates/calls 4388/11611          37.79% eff      17.8462 msec tot       0.0015 msec/call       0.0041 msec/update
         hidden_pair updates/calls 11950/10314        115.86% eff      19.2386 msec tot       0.0019 msec/call       0.0016 msec/update
        naked_triple updates/calls 3679/7254           50.72% eff      33.9676 msec tot       0.0047 msec/call       0.0092 msec/update
           swordfish updates/calls 3936/6558           60.02% eff      15.0387 msec tot       0.0023 msec/call       0.0038 msec/update
       hidden_triple updates/calls 2177/5707           38.15% eff      17.7608 msec tot       0.0031 msec/call       0.0082 msec/update
             xy_wing updates/calls 2327/5312           43.81% eff      37.8568 msec tot       0.0071 msec/call       0.0163 msec/update
            xyz_wing updates/calls 605/3741            16.17% eff      11.9451 msec tot       0.0032 msec/call       0.0197 msec/update
           ur_type_1 updates/calls 786/3159            24.88% eff       8.9829 msec tot       0.0028 msec/call       0.0114 msec/update
           ur_type_2 updates/calls 133/2766             4.81% eff       0.4826 msec tot       0.0002 msec/call       0.0036 msec/update
           ur_type_4 updates/calls 648/2633            24.61% eff       0.7829 msec tot       0.0003 msec/call       0.0012 msec/update
           ur_type_3 updates/calls 640/2309            27.72% eff       3.2982 msec tot       0.0014 msec/call       0.0052 msec/update
      ur_loop_type_1 updates/calls 292/11871            2.46% eff     167.9249 msec tot       0.0141 msec/call       0.5751 msec/update
      ur_loop_type_2 updates/calls 168/10911            1.54% eff     699.4343 msec tot       0.0641 msec/call       4.1633 msec/update
      ur_loop_type_4 updates/calls 72/10467             0.69% eff     179.9959 msec tot       0.0172 msec/call       2.4999 msec/update
      ur_loop_type_3 updates/calls 125/10147            1.23% eff     305.6810 msec tot       0.0301 msec/call       2.4454 msec/update
          naked_quad updates/calls 667/1679            39.73% eff      10.8015 msec tot       0.0064 msec/call       0.0162 msec/update
           jellyfish updates/calls 1107/1579           70.11% eff       4.3273 msec tot       0.0027 msec/call       0.0039 msec/update
         hidden_quad updates/calls 325/1374            23.65% eff       4.8041 msec tot       0.0035 msec/call       0.0148 msec/update
          bug_type_1 updates/calls 584/1331            43.88% eff       8.9293 msec tot       0.0067 msec/call       0.0153 msec/update
          bug_type_2 updates/calls 405/1039            38.98% eff       0.3316 msec tot       0.0003 msec/call       0.0008 msec/update
          bug_type_4 updates/calls 90/859              10.48% eff       0.1820 msec tot       0.0002 msec/call       0.0020 msec/update
          bug_type_3 updates/calls 318/814             39.07% eff       0.4666 msec tot       0.0006 msec/call       0.0015 msec/update
        aligned_pair updates/calls 220/557             39.50% eff     122.6229 msec tot       0.2201 msec/call       0.5574 msec/update
             x_cycle updates/calls 484/448            108.04% eff      98.6115 msec tot       0.2201 msec/call       0.2037 msec/update
             y_cycle updates/calls 9/43                20.93% eff       3.8020 msec tot       0.0884 msec/call       0.4224 msec/update
            xy_cycle updates/calls 5/9                 55.56% eff       3.8286 msec tot       0.4254 msec/call       0.7657 msec/update


Paul@pgisaacs2 ~/se_cpp
$ r java
java64.sh pat65.in > c:/temp/log2

real    1m37.551s
user    0m0.000s
sys     0m0.000s

Paul@pgisaacs2 ~/se_cpp
$ compare c:/temp/log1 c:/temp/log2 -- my scores vs SE (mine on left SE on right for those I scored higher)
000100020003000400050020003400600010001070600000008000030080090200400500006000000 4504  6.60  6.50
001008003040000050600050100000803009009000200800607000006040001050000080400200900 4510  6.60  6.50
001000200020003040300050006000200060002070100080006000600090004010800030007000800 4516  6.60  6.50
100002030004500100060030005070000050002080003900000400020004600600900001007060080 4548  6.60  6.50
000001020003200100040050006070000010005060003300000400060008200700400008009070050 4555  6.60  6.50
001023004050600002700000000000800090030070050040002000000000006500006010600910700 4611  6.60  6.50
Run time 1) 2.75 2) 0.08

Paul@pgisaacs2 ~/se_cpp
$ compare c:/temp/log2 c:/temp/log1 -- my scores vs SE (mine on right SE on left for those SE scored higher)
000706000020030010100000005006000700050802030300000008900000004010040020008000300 4187  7.10  6.50
001002003020010400300900070006000004010090030800000900030004007004050090600700200 4258  6.00  5.60
Run time 1) 0.08 2) 2.75

Paul@pgisaacs2 ~/se_cpp
$ java32.sh pat65.in > c:/temp/log4

real    3m15.939s
user    0m0.015s
sys     0m0.015s
$

So while the C++ code is significantly faster, the bad news is that there are 6 puzzles which my code scores 0.1 points higher and 2 which score significantly lower than SE. Looking at the detail logs shows that the problems stem from how I find chains (by size) rather than by score. SE runs a technique, accumulates the potential eliminations/assignments, sorts the collection, and then selects the lowest scoring one. My code sets the chain length, finds the first qualifying chain, calculates the score and then processes it. Since the length is controlled by a for loop that starts from some specified minimum and expands the allowed length to some specified maximum and since all the chaining code uses BFS, this should very closely emulate what SE does. And that's the rub... It does closely emulate SE, but not exactly. This deviation becomes worse with the more complex chains since I use really different alogorithms which generate AICs/NLs/nrczts instead of SE's predominant use of forcing chains of various natures.

The long and short is that in order to duplicate SE's scores, I'll have to duplicate SE's algorithms in a rigourous manner. Which means adopting a very limited and somewhat arcane library of solving techniques, running each technique to completion, scoring all the potentials, then only selecting the lowest scoring ones all the while making sure the code is replete with the various quirks and oddities that SE possesses. Well, maybe it's not that bad, but I'm having a difficult time working up the necessary enthusiasm.

Cheers,
Paul

P.S. Although heavily modified from the code e-mailed to Champagne and LKSudoku, the above code contains LK's bug fixes and suggested changes merged in as best I could considering the code base deltas.
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Re: Team project: C or C++ Explainer-like rating program

Postby daj95376 » Wed Dec 01, 2010 3:57 am

Thanks Paul for the detailed comparison based on your solver!!! Thanks also for the overview on the software differences betweeen your solver and SE.

Paul wrote:SE runs a technique, accumulates the potential eliminations/assignments, sorts the collection, and then selects the lowest scoring one.

I've asked L.K. to investigate the possibility of having a version of 1.2.5.0 serate that concurrently performs all steps it has (at any given time) with the lowest rating. It's something he has on the back burner, but I'm patient and the results may prove to be interesting.

Regards, Danny
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Re: Team project: C or C++ Explainer-like rating program

Postby PIsaacson » Wed Dec 01, 2010 5:47 am

daj95376 wrote:Thanks Paul for the detailed comparison based on your solver!!!

Danny,

I deliberately moved the emphasis to remind myself of exactly that point. My code is based on a solver instead of a scorer, which pretty much sums up my fundamentally mistaken assumption that a scorer is just a solving machine which tracks the "hardest" move.

Cheers,
Paul
PIsaacson
 
Posts: 249
Joined: 02 July 2008

Re: Team project: C or C++ Explainer-like rating program

Postby champagne » Wed Dec 01, 2010 8:50 am

I run the Benchmark on my computer

P8600 dual core (32 bits) 2.4 GHz
I use I guess the same file as Paul


Code: Select all
SE          277 seconds
SE clone     8.5 seconds



so the ratio is the same as in Paul's test, about 1/30

some remarks;

This is done with a new lay-out, normally fitting with SE specifications
Are still missing some Bugs patterns
I have to redo all tests to finally debug that part of the program.

This will not change the order of magnitude and, moreover, the ratio for low ratings is not so important.

I'll revise these figure in due time

champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: Team project: C or C++ Explainer-like rating program

Postby BryanL » Wed Dec 01, 2010 11:24 am

Surely it is too much trouble to emulate SE scoring exactly.

Why not agree to a rating methodology based on solving technique - probably similar to SE but with the few noted inconsistencies ironed out.

Once that is agreed upon it takes away so much of the time and effort Paul, lk and Champagne are putting in, allowing them and whoever to move ahead.

I see no issue with a new rating method that differs from SE. Once agreed, g.r.emlin could post (append - leaving original results in place) new ratings for all past games. And after a trial period where both sets of scores could be posted, switch to the new method.

So could we have some consensus on the idea of NOT emulating SE? A poll perhaps?

Bryan
BryanL
 
Posts: 247
Joined: 28 September 2010

Re: Team project: C or C++ Explainer-like rating program

Postby lksudoku » Wed Dec 01, 2010 11:33 am

PIsaacson wrote:There were 4665 puzzles with ratings of 6.5 and lower from his collection.

I only modified Paul's code to provide the same rating for puzzles with rating lower than 6.5, puzzles with rating 6.5 are not supposed to match and there is no point in comparing them

The reason they do not match is both due to the first found return result, and a different rating calculation

Therefore the only difference (below 6.5) is in the puzzle

001002003020010400300900070006000004010090030800000900030004007004050090600700200 4258 6.00 5.60

Which is the puzzle I mentioned in which there are 3 possible UR type 1, and choosing two of them yields 6.00 rating while choosing the third one yields a 5.60 rating

This rating difference cannot be avoided because it depends on the choice of a UR out of a list of more than one, when choosing one destroys the other (and all choices are correct for the specific step as they have the same rating)

PIsaacson wrote:So while the C++ code is significantly faster, the bad news is that there are 6 puzzles which my code scores 0.1 points higher and 2 which score significantly lower than SE.

Wrong, until rating 6.5 (not containing 6.5) the score is equal (up to one UR destroys another UR case)

For rating at 6.5 or above, code modifications are required, but as I mentioned before, as long as no executable or source code is published, I do not know if there is any point in performing these modifications

PIsaacson wrote: Looking at the detail logs shows that the problems stem from how I find chains (by size) rather than by score. SE runs a technique, accumulates the potential eliminations/assignments, sorts the collection, and then selects the lowest scoring one. My code sets the chain length, finds the first qualifying chain, calculates the score and then processes it. Since the length is controlled by a for loop that starts from some specified minimum and expands the allowed length to some specified maximum and since all the chaining code uses BFS, this should very closely emulate what SE does. And that's the rub... It does closely emulate SE, but not exactly.

This is one of the problems, another problem is that your code simply computes wrong ratings for the same chains/loops (compared to SE)

PIsaacson wrote:The long and short is that in order to duplicate SE's scores, I'll have to duplicate SE's algorithms in a rigourous manner.

not necessarily, I think that it may be that small modifications can make it match, at least for x-chains (did not check y/xy-chains), anyway, if you will calculate the score for the same chain/loop differently than SE, it does not matter how you find it, the rating will not match

As I mentioned before, if executable or sources will be published I may try to match the x-chains/loops
lksudoku
 
Posts: 90
Joined: 06 October 2010

Re: Team project: C or C++ Explainer-like rating program

Postby champagne » Wed Dec 01, 2010 1:12 pm

lksudoku wrote:As I mentioned before, if executable or sources will be published I may try to match the x-chains/loops


whatever will be the status, I promiss to publish the draft "as it will be" till the area 6.5 7.0 within one week.

Meantime, i'll clean the draft, make some debugging and prepare comments.

champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: Team project: C or C++ Explainer-like rating program

Postby daj95376 » Wed Dec 01, 2010 1:49 pm

champagne wrote:I run the Benchmark on my computer

P8600 dual core (32 bits) 2.4 GHz
I use I guess the same file as Paul


Code: Select all
SE          277 seconds
SE clone     8.5 seconds



so the ratio is the same as in Paul's test, about 1/30

Thanks Champagne for posting your comparison results.

Obviously, Paul's and Champagne's efforts to produce an SE-clone is worthwhile through ratings <6.5 .

BryanL: Although I agree with you, I don't think you'll get a consensus to scrap the SE-clone effort. The objective now seems to be locked into simply making a faster version that produces identical results. I think Paul is starting to realize that the effort to redesign his software to produce identical results is approaching unrealistic. I'm still hoping that LK will find the time to tackle a version of serate that'll perform identically rated steps concurrently.

Regards, Danny
daj95376
2014 Supporter
 
Posts: 2624
Joined: 15 May 2006

Re: Team project: C or C++ Explainer-like rating program

Postby champagne » Wed Dec 01, 2010 3:08 pm

I'll try to comment on Bryanl and Danny's remarks.

One very positive point with SE is that it gives a good game.

Another one is that it finally gives for a pattern a good idea of a spread of possible puzzles.

To-day, the sluggyhness of SE is a true drawback.

I have heavy doubts that we can emulate 100% of SE (a supposed bug free version) with a significant improvement in speed.

Reversely, I am convinced that it will be possible to approach SE specifications with a performance ratio growing when the difficulty grows. As we start with a 1 to 30 ratio, we can show somehow some optimism.

On the other side, SE brings me ideas to slightly improve my own solver (at least I have the feeling that..);

So at the end, I like the challenge to set up a performing clone.

Also, I believe that if the results are not to far from SE's one, This could be accepted.

Champagne

PS in case the result woul not be accepted, I'll have still the possibility to use the clone in pre-rating and filtering
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: Team project: C or C++ Explainer-like rating program

Postby lksudoku » Wed Dec 01, 2010 3:18 pm

champagne wrote:
lksudoku wrote:As I mentioned before, if executable or sources will be published I may try to match the x-chains/loops


whatever will be the status, I promiss to publish the draft "as it will be" till the area 6.5 7.0 within one week.

Meantime, i'll clean the draft, make some debugging and prepare comments.

champagne

I was hoping for Paul's version also to be published since
  • I already modified it for <6.5 rating
  • I have some knowledge of its structure
But if your version is the only one published I may switch to it

Does your version match SE for ratings <6.5? if not additional work will be required for modifying your version to match these ratings first

It is probably better to decide on one version as base and continue from it
lksudoku
 
Posts: 90
Joined: 06 October 2010

Re: Team project: C or C++ Explainer-like rating program

Postby champagne » Wed Dec 01, 2010 4:16 pm

lksudoku wrote:I was hoping for Paul's version also to be published since
  • I already modified it for <6.5 rating
  • I have some knowledge of its structure
But if your version is the only one published I may switch to it

Does your version match SE for ratings <6.5? if not additional work will be required for modifying your version to match these ratings first

It is probably better to decide on one version as base and continue from it


I understand that Paul has not really redesigned it's solver to follow SE specifications.
I did it in a free way, but I faced problems and I made several trials (so I have also to clean the new code).

If Paul has a draft you can use, I agree that it would be better to continue with that one.

Normally the draft I'll produce will not give higher rating than SE. But it could be that it shows what I consider as bugs. I have in mind, for example the "non detection" of some hidden pairs in the UR process.

In the chain area, I am not to far from SE computation of length, but I have still some checking work due to the fact that pieces of the chains are hidden within the tagging.

I am now trying to cover the BUGs patterns I did not code to cover all the puzzles below 6.5

champagne



I
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

Re: Team project: C or C++ Explainer-like rating program

Postby lksudoku » Wed Dec 01, 2010 4:50 pm

champagne wrote:I understand that Paul has not really redesigned it's solver to follow SE specifications.

I modified Paul's code to follow SE specification until rating <6.5, and as the above analysis shows, it matches SE version 1.2.5.0 exactly until rating <6.5 (besides cases where one UR destroys another UR and both have euqal ratings)

champagne wrote:If Paul has a draft you can use, I agree that it would be better to continue with that one.

Yes, but if the draft is internal, than why bother

champagne wrote:Normally the draft I'll produce will not give higher rating than SE. But it could be that it shows what I consider as bugs. I have in mind, for example the "non detection" of some hidden pairs in the UR process.

When comparing to SE version 1.2.5.0, all these bugs are fixed, my method involves fixing both SE and the clone, so that if SE has bugs, I create another SE version (such as version 1.2.5.0) if the clone has bugs, I fix the clone code, for now the clone matches version 1.2.5.0, if you can show SE bugs in that version I can try to fix them too, but since there is a clone match (<6.5) I assume it is not likely that there are more bugs for these ratings

champagne wrote:I am now trying to cover the BUGs patterns I did not code to cover all the puzzles below 6.5

Paul's code already covers all SE BUGs patterns

L.K.
lksudoku
 
Posts: 90
Joined: 06 October 2010

Re: Team project: C or C++ Explainer-like rating program

Postby champagne » Wed Dec 01, 2010 5:52 pm

lksudoku wrote:1) I modified Paul's code to follow SE specification until rating <6.5, and as the above analysis shows, it matches SE version 1.2.5.0 exactly until rating <6.5 (besides cases where one UR destroys another UR and both have euqal ratings)

2) Yes, but if the draft is internal, than why bother

3)When comparing to SE version 1.2.5.0, all these bugs are fixed, my method involves fixing both SE and the clone, so that if SE has bugs, I create another SE version (such as version 1.2.5.0) if the clone has bugs, I fix the clone code, for now the clone matches version 1.2.5.0, if you can show SE bugs in that version I can try to fix them too, but since there is a clone match (<6.5) I assume it is not likely that there are more bugs for these ratings

4) Paul's code already covers all SE BUGs patterns

L.K.



items 1) and 3) I got it. To know if I found other bugs (either in SE or in my draft), I would like to be sure to work with the last version of your SE fixed program. My last version is called FFIXED3

item 2) I don't catch your point. For sure, I assume the draft (which one it is) has been published.

item 4) I know also, I only think it will be faster for me to finish the draft.
I have another reason to do so. I want somehow to have, at the end, a better idea of what SE is doing.
Drafting a solution and studying what is not solved is an alternative to digging in the code.
(and at the end, I intend to bring back that code in my solver)

champagne
champagne
2017 Supporter
 
Posts: 7465
Joined: 02 August 2007
Location: France Brittany

PreviousNext

Return to Software