The New Sudoku Players' Forum

by **m_b_metcalf** » Wed Jun 22, 2022 12:22 pm

When comparing different solvers, or timing individual ones, there is often some discussion about the suitability of the data set used. This is most noticeable when the 17.txt file is used, as many think it's a hard test for solvers whereas, in fact, the puzzles are mostly very easy and none is very difficult.

In an attempt to pour more oil on the flames, I attach a new file of 834 puzzles. They are chosen at random from all the Patterns Games, such that all ER values are present 10 times, where possible. Thus, most ER ratings are represented by 10 puzzles each, but some rarer ones by fewer (the rarer ones being concentrated towards the end of the file).

Any solver should be able to solve all of these puzzles; the 3-micosecond solver in 3ms (more if uniqueness is not assumed).

Since the puzzles are from the Patterns Game, they are all symmetric and minimal, and have between 19 and 29 clues. They explore nearly every corner of sudoku space.

FWIW.

Mike

by **Hajime** » Wed Jun 22, 2022 3:58 pm

Total time: 00:00:02.8843780 seconds for all 834 puzzles (including uniqueness check), with my new dlx solver.
Is that a factor 1000 to slow?

m_b_metcalf wrote: ... to solve all of these puzzles; the 3-micosecond solver in 3ms...

by **m_b_metcalf** » Thu Jun 23, 2022 6:52 am

Hajime wrote:Total time: 00:00:02.8843780 seconds for all 834 puzzles (including uniqueness check), with my new dlx solver.
Is that a factor 1000 to slow?

Not if it suits your purpose. Most time spent using solvers is in checking whether a candidate puzzle has a single solution, so the important time is that to find a second solution, if present (which it isn't in this file). My singles+brute force solver gets to 1st. solutions in 1.7s for this data set, longer if uniquness not assumed.

Maybe someone who has the 3-microsecond solver could run it on this file?

Regards,

Mike

by **dobrichev** » Thu Jun 23, 2022 7:54 am

Up to the second solution.

Code: Select all: cat /proc/cpuinfo model name: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz gridchecker --solve < quasi_uniform_834.txt > /dev/null Total time 0.046 seconds. fsss2 < quasi_uniform_834.txt > /dev/null 834+0+0 puzzles in 0.035 seconds. PencilmarkSudoku --solve --vanilla < quasi_uniform_834.txt > /dev/null Total time 0.031 seconds.

by **Mathimagics** » Thu Jun 23, 2022 9:48 am

dobrichev wrote:fsss2 < quasi_uniform_834.txt > /dev/null
834+0+0 puzzles in 0.035 seconds.

A gentle reminder to us all, that Mladen's FSSS2 solver is PDQ! 8-)

by **blue** » Thu Jun 23, 2022 11:58 am

I/O time must be influencing Mladen's timings, too.
His i5-6600 has a turbo-boost speed of 3.9GHz.

If I do this:

limit my processor speed to 3.9GHz
load the puzzles into RAM
do 100 reps through the list, and
divide solving time by 100

I get 0.0170 seconds, or (0.0161 seconds) to process the list once … depending on whether I do (or don't) ask for the solution to be recorded.

For Hajime : I have a "slightly smart" DLX solver, that takes 0.086s (@ 3.9GHz) to do the list.
If I dumb it down, it still only takes 0.130s.

Cheers.

by **tdillon** » Thu Jun 23, 2022 7:15 pm

Here are results for this dataset across a range of solvers using tdoku's benchmark program:

Code: Select all: |quasi_uniform_834.fmt | puzzles/sec| usec/puzzle| %no_guess| guesses/puzzle| |--------------------------------------|------------:|------------:|-----------:|---------------:| |minisat_minimal S/s...+./m+| 366.5 | 2,728.5 | 0.0% | 309.62 | |minisat_natural S/s...+./m+| 381.1 | 2,624.1 | 0.0% | 285.63 | |minisat_complete S/sh..+./m+| 1,904.5 | 525.1 | 3.1% | 37.38 | |minisat_augmented S/shrc+./m+| 1,458.1 | 685.8 | 5.7% | 29.34 | |_tdev_dpll_triad_scc_ih S/shrc++/m+| 2,019.9 | 495.1 | 23.6% | 5.85 | |lhl_sudoku G/s...../m.| 6,685.6 | 149.6 | 0.0% | 268.60 | |_tdev_basic_heuristic G/s...../m.| 10,557.5 | 94.7 | 0.0% | 292.49 | |zerodoku G/sh..../m.| 14,659.2 | 68.2 | 3.3% | 27.33 | |fast_solv_9r2 E/sh..../m.| 10,840.2 | 92.2 | 3.3% | 26.04 | |kudoku E/sh..../m.| 11,579.6 | 86.4 | 3.3% | 22.70 | |norvig C/sh..../m.| 6,028.4 | 165.9 | 3.4% | 27.57 | |bb_sudoku C/shrc../m.| 27,794.9 | 36.0 | 5.6% | 36.77 | |fsss C/shrc../m.| 32,108.2 | 31.1 | 5.6% | 19.66 | |jsolve C/shrc../m.| 33,215.5 | 30.1 | 5.6% | 18.38 | |fsss2_locked D/shrc../m.| 36,768.0 | 27.2 | 5.6% | 19.34 | |fsss2 D/sh..../m.| 54,460.8 | 18.4 | 3.3% | 23.22 | |jczsolve B/shr.../m.| 57,504.4 | 17.4 | 4.3% | 26.38 | |sk_bforce2 B/shrc-./m+| 58,558.6 | 17.1 | 4.9% | 21.63 | |rust_sudoku B/shr.../m.| 69,744.9 | 14.3 | 4.3% | 25.58 | |tdoku T/shrc+./m+| 88,578.9 | 11.3 | 8.5% | 13.03 |

The cryptic code following each solver's name is to help group solvers by their similarities and differences. It indicates the primary representation (SAT, Groups, Exact Cover, Cells, Digits, Bands, or Tdoku's thing), as well as supported inference (singles, hidden singles, locked candidates, + for something extra) and supported heuristics (generally min-remaining-values, or + for something extra).

The benchmark program does a couple of things to try to make comparisons as useful as possible:

makes sure every solver is operating in the same mode (count up to 2 solutions)
builds an in-memory dataset before benchmarking so I/O is out of the picture
randomly permutes the puzzles so benchmarking is not biased by presentation (e.g., as would occur if using a minlex dataset to benchmark a solver whose heuristics were tuned on minlex datasets).
runs for a while before benchmarking to warm cpu caches and branch predictors
runs for an integral number of passes over the dataset, and at least for a few seconds. long enough to minimize timing variance, and without bias in which puzzles were solved for slow solvers.

Of course, if you really want to compare solvers it's best to give each one the benefit of whatever compiler produces the fastest code for it. I didn't do that here. These results are clang-11 on an i7-1065G7 (a platform that's pretty favorable for tdoku relatively speaking).
I also used to turn off ALSR and pin to a single core that's otherwise excluded from scheduling, which I didn't do here either, those are aimed more at reducing variance than bias.

by **m_b_metcalf** » Fri Jun 24, 2022 8:15 am

Lots of intersesting results. Thanks.

It did occur to me that we make these timings on valid puzzles, always looking for a second solution that never exists. I attached a new file of the same 834 puzzles, but each with one clue removed, at random. In these puzzles, with bewtween 18 and 28 clues, a second solution does exist. This is the situation we have when testing that a puzzle is minimal. The surprise for me was that my humble code finds that second solution twice as fast as it finds the first solution in the original file.

Curious.

Mike

by **Hajime** » Fri Jun 24, 2022 9:29 am

Without I/O my dlx solver is a bit faster, but not as near as all those turbo solvers. Impressive.

m_b_metcalf wrote:The surprise for me was that my humble code finds that second solution twice as fast as it finds the first solution in the original file.
Curious.

Not so curious: If each puzzle has, lets say, 100 solutions and you stop searching after the 2nd solution, the total time needed is much smaller.
I did a full search for all the solutions. See file for result.
The first puzzle noted: 15 1=1270/2=60/ which means 15 solutions 1270 naked/hidden clues processed and 60 times a guess between 2 candidates in 1 cell or 2 same candidates in a house was needed.
Of all the puzzles only one puzzle (#617) needs a guess in 3 cells/houses to find all the 1078 solutions ( 1078 1=12503/2=1128/3=8/).
In the quasi_uniform_834.txt all puzzles are solved with 2 candidates in a cell or 2 same candidates in a house (row/col/box) and no "/3" is needed.

The New Sudoku Players' Forum

Benchmark data set

Benchmark data set

Re: Benchmark data set

Re: Benchmark data set

Re: Benchmark data set

Re: Benchmark data set

Re: Benchmark data set

Re: Benchmark data set

Re: Benchmark data set

Re: Benchmark data set