The New Sudoku Players' Forum

by **dobrichev** » Tue Mar 04, 2014 2:31 am

Yes, at any stage the puzzle is checked for redundant clues and this takes almost all CPU time.
My general search is based on 2 tools.
minus n < infile > outfile
plus1 < infile > outfile.single.solution 2> outfile.multiple.solutions
I am keeping track of already processed subgrids except for non-minimal puzzles.

The command currently running looks like
nohup /dobrichev/tools/plus1/plus1 > >(comm -13 ../37 - > 37.new) 2> >(comm -13 <(unzip -p ../37.doneup.2.zip) - | comm -13 <(unzip -p ../37.doneup.1.zip) - | zip -q > 37.doneup.3.zip) < <(unzip -p 36.doneup.3.zip) &

where:
../37 is a file with all known unique minimal 37-clue puzzles
37.new will store the newly found unique minimal 37-clue puzzles
../37.doneup.1.zip, ../37.doneup.2.zip are files with multiple-solution minimal 37-clue puzzles that have been processed on previous passes
37.doneup.3.zip will store the multiple-solution minimal 37-clue puzzles obtained on this pass
36.doneup.3.zip is the seed - the multiple-solution minimal 36-clue puzzles obtained from 35s on the previous run.

Beside that, I am doing {-5,+7} on the newly found 39s. The tool takes the puzzles as the only input and produces a mixture of minimal unique puzzles of different sizes that are involved later in the {-n} process. The final steps at +5 and +6 are executed in seconds. Some minimal multiple solution 40s appeared and none of the 41s.

I am scanning all valid minimal puzzles for minimal twins.
I also scanned valid puzzles of size 37+ for twins with possible redundant clues, then minimized them and involved those with 36+ givens in the {-n} process. I am not sure whether this added a value.

Storing the processed non-unique minimal puzzles and removing them from duplicate processing in the next passes reduces the count by 2% to 20% at each stage, less for the lower stages. Filtering out the processed 35s and 36s takes hours and probably is inefficient. I am trying to do this I/O intensive process in parallel with the CPU intensive {-5, +7} process.

The program code is extracted from gridchecker and is free.

by **dobrichev** » Tue Mar 04, 2014 5:56 am

This is the first discovered 40-given minimal unique Sudoku puzzle

Code: Select all: ..........12.34567.345.6182..1.582.6..86....1.2...7.5...37.5.28.8..6.7..2.7.83615

by **blue** » Tue Mar 04, 2014 7:17 am

Fantastic

Congratulations !

by **Lars Petter Endresen** » Tue Mar 04, 2014 8:04 am

Wow! Fantastic news Mladen! Congratulations!

by **Havard** » Tue Mar 04, 2014 9:21 am

Really cool! Congratulations! Now for a 41...

by **coloin** » Tue Mar 04, 2014 7:48 pm

Absolutely unbelievable.
Your puzzle indeed has 40 clues and is minimal.
Well done.

by **dobrichev** » Tue Mar 04, 2014 8:07 pm

Thank you all! This makes me happy.

I ran {-6,+7} on the first 40-given. The result is

Code: Select all: 2 40 1 39 24 38 592 37 2471 36 1025 35

where the secondary 40 is at {-1,+1} from the seed, and the only 39 is at {-4,+3} from both 40s.
From the same pass 16 new 39s appeared so far.
After these 40s somehow I lost the focus

by **eleven** » Tue Mar 04, 2014 8:22 pm

What a surprise !!

Great day.

Congratulations !!

by **coloin** » Tue Mar 04, 2014 8:58 pm

Yes - you can rest in peace now !

In another universe when 9*9 sudoku is reinvented - they won't find a 40 !

C

by **Serg** » Wed Mar 05, 2014 9:50 am

Hi, Mladen!
Outstanding discovery! Congratulations!
In what way did you find 40-clue minimal valid puzzle?

Serg

by **dobrichev** » Thu Mar 06, 2014 6:53 am

Thank you.

@eleven: No, I have no answer how many 40s there are.

You can move your graphics slightly at right.
@Serg: Methods are explained in the recent posts here. In general, a classical {-m, +n} and checking for twins is done. All puzzles are truncated to multiple-solution 35s and are checked up until there are still no redundant clues. The source code is attached.
@coloin: This is the best compliment I ever heard in my life :lol:

Thanks. But ... now what?

Havard Jun 09, 2007 wrote:It seems everyone is bending old "rules" nowadays!:) Now find me a 38!:)

ravel Jul 17, 2007 wrote:Hey, another look and here it is:...

Havard Jul 18, 2007 wrote:... But first lets find a 39!:)

Havard Aug 27, 2007 wrote:from the SudokuArchitect team: ...

dobrichev Mar 04, 2014 wrote:This is the first discovered 40-given minimal unique Sudoku puzzle ...

Havard Mar 04, 2014 wrote:Really cool! Congratulations! Now for a 41...

??? ??? wrote:Here is the 41-clue ...

This is the secondary 40-clue minimal puzzle, a close neighbour of the first one.

Code: Select all: ..........12.34567.345.6182..1.582.6..86....1.2...785...37.5.2..8..6.7..2.7.83615 40

These are the 18 39s born in the last pass. The total is 563.

Code: Select all: ...........1..2.34.23.51.67..82.7..51...48.7.7.251..83.1.62..48.841.5..62.6.84.5. ...........1..2.34.23.51.67..82.7..51...48.7.7.251..83.841....62.6.84.5.51.62..48 ...........1..2.34.23.51.67..82.7..51...84.7.7.251..83.1.62..48.841.5..62.6.48.5. ...........1..2.34.23.51.67..82.7..51...84.7.7.251..83.841....62.6.48.5.51.62..48 ...........1..2.34.34.51.2...6.1.3.2.1..7..46483.26..7.48.65.73.65..748.3.7..8.65 ...........1.23.45263.457.1......8...37.8..141.8.34.67.16..8..237..62..88.235..76 ..........12.34.56.35.2641...1.735....7.82.3132...5.8...8.4..65.6...817.15..67.48 ..........12.34.56.35.267.4.......8...3.18645.586..1.7.214....8.8..61.725.7.824.1 ..........12.34567.345.6128..1.582.6..86....1.2...7.5...37.5..2.8..6.7..2.7.83615 ..........12.34567.3526.418..1.8..4..5.47.1.6.7...6..5..76..8...2..48..118.72.654 ........1..1.23.4..24.1..35..6.7.5.3..7.6541.2.53.1.67.6...7...4.8.32.767.2.863.4 ........1..1.23.4..24.1..35..6.7.5.3..7.6541.2.53.1.67.7...6...4.8.32.766.2.873.4 ........1..1.23.4..24.1..35..6.7.5.3..7.6541.2.53.1.76.6...7...4.8.32.677.2.863.4 ........1..1.23.4..24.1..35..6.7.5.3..7.6541.2.53.1.76.7...6...4.8.32.676.2.873.4 ........1..1.23.4.254..1.36.....7....752.6.8442..8..67.173.24.83..7...1.542.18.73 ........1..1.23.4.254..1.63.....7....752.6.8442..8..76.173.24.83..7...1.542.18.37 ........1..1.23.4.452..1.36.....7....752.6.8424..8..67.173.24.83..7...1.524.18.73 ........1..1.23.4.452..1.63.....7....752.6.8424..8..76.173.24.83..7...1.524.18.37

A mirror of these puzzles is kept in https://sites.google.com/site/dobrichev/

by **dobrichev** » Sat Mar 15, 2014 5:59 am

38 new 39s from 3 passes, total 601 known 39s

Code: Select all: ...........1..2.34.3415..26..5.....2.72.1546.148.263.5.13.68.47.87..16.34...7..8. ...........1..2.34.3415..26..5.....2.72.1546.148.263.5.17.68.43.83..16.74...7..8. ...........1..2.34.3546..12.14..7.8..7.6.41.33.68...47.47.8.3..1532.6.7868.7....1 ...........1..2.34.3546..12.17..4.8..4.6.71.33.68...47.74.8.3..1532.6.7868.7....1 ...........1.23.45.2..6431...42.675..72.8.4..65..47.8..476....8216.7853.5.8..2.7. ...........1.23.45.2..6431...72.645..42.8.7..65..47.8..746....8216.7853.5.8..2.7. ...........1.23.45.24.1..63..7.52.8..581...7.21..78.54.8326...716...7..87.2.81.36 ...........1.23.45.24.153.6..72.15...52.8.7..14..57..8.751...8.216.784.34.8..2..7 ...........1.23.452645.7.13........8..28.543..483625.1..7..6.8..2.7.8.54.8.2341.7 ...........1.23456452.16..3.......78.2..875...75.312.4.47..86..2.6.7.8.55.8.62.47 ...........1.23456452.16..3.......78.2..875...75.612.4.47..83..2.3.7.8.55.8.32.47 ...........1.23456452.61..3.......78.2..875...75.162.4.47..83..2.3.7.8.55.8.32.47 ..........12.34.56.34.561.2....736...7641..8.34..68..7.873.1..51...47..84.3.857.. ..........12.34.56.35.6214......1..7.713264..3.8.9761.....7......724386.2.3.18.74 ........1..1..234..3214.56...7.8.....234.67...8.27.4.3.1..2867..786.41..26.71.83. ........1..1..234..3214.56...7.8.....234.68...8.27.4.3.1..2768..786.41..26.81.73. ........1..1..234..3214.56...7.8.....236.47...8.27.4.3.1..2867..784.61..26.71.83. ........1..1..234..3214.56...7.8.....236.48...8.27.4.3.1..2768..784.61..26.81.73. ........1..1.23.45.25.1436........7..7234.5.65.6.724.3.1.4....7.57.8163.6.8.371.. ........1..1.23.45.25.4136........7..723.45.65.6.724.3.1.4....7.57.1863.6.8.371.. ........1..1234.5.235.61.47..8..3.1..1342.78.72..1.....6.3....8.87.42.6.3.2.86.74 ........1..2..345..1345...2....2.....263.5.47.37684.25..85.6.1..6183.5.43...41.68 ........1..2..345..1345...2....2.....26375.48.386.4.25..75.6.1..6173.5.43...41.67 ........1..2..345..1345...2....2.....26378.45.385.4.26..78.5.1..8173...43.5.41.87 ........1..2.13.4..41.253.6.67.324.82.4.78...8......72.76.8.5..3.8.5..644.536.8.7 ........1..2.13.4..41.253.6.67.8.5..3.8.5..644.536.8.7.76.324.82.4.78...8......72 ........1..2.13.4..415.2.3...4.61.75.163.7482.7.2...16.25.3..64.6...5...4.7.26.53 ........1..2.1345.14.56.3.2.....17..4.6.781.571..528.4.8......323..8654.6...352.8 ........1..2.34...35.61..24.7...8..323.76..8.68.34.2.752.17..487...8.1..81.4.67.2 ........1..2.34.5..35..6.42....7..8.5.78.3..48.3461.75..8.124..2..347.183..6.8.27 ........1.12.34...5.3.16.24..7....8..584.32.723..874...81.457.232..71..87.5..81.. ........1.12.34.5..53.164.2....7.8...78..3.453.5.48..7.37..1..8.8136..745.6.87.1. ........1.12.34.5.5.3.162.4..7..85...8.34..2732..67.48.51.83.7.23..71..57.8....1. ........1.12.34.5.5.3.612.4..7..85...8.34..2732..76.48.51.83.7.23..17..57.8....1. ........1.12.34.566.31.542...13.7.45..7.48.1.3..5.12.7..6...5...2.753.64.3.4.6.9. .....1..2....3.45....4.631...3..7....67.4.23.8.2.6374..76.85124.8..7.5..2.561487. .....1..2....3.45....4.631...3..7....76.4.23.8.2.6374..67.85124.8..7.5..2.561487. .....1..2..3...45..24.536.1.....4..7.7258.14.4.81.7526..7.1....2..3.57..3.67.8215

The 36s have been excluded from the search. So far 37+ by itself generate increasing number of new 37+ puzzles at each subsequent pass.

by **tarek** » Sun Mar 16, 2014 1:18 am

Just noticed this. Great news. Well done!!!

by **dobrichev** » Sat Apr 05, 2014 8:43 pm

Thank you, tarek.

by **dobrichev** » Sat Apr 05, 2014 9:30 pm

Comparison of Intel and GNU compilers and incomplete scalability analysis

Hardware
CPUs: 2 x Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz (8 cores per CPU + hyperthreading, 20MB cache)

Intel compiler
version: icc version 14.0.0 (gcc version 4.4.7 compatibility)
command line switches: -openmp -ipo

GNU compiler
version: gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
command line switches: -fopenmp -O3 -march=native

Tested tool
The tool does the following
- Loads a list of puzzles from a file (serial code)
- Converts them to canonical form and removes duplicates (serial code)
- Removes N givens from each puzzle in all possible ways, converts the resulting sub-puzzles to canonical form and removes duplicates (serial code)
- M times adds a clue in all possible ways to all puzzles, checks for minimality, converts resulting puzzles to canonical form, checks for unique solution, and removes duplicates (OMP parallel code)
- Stores the puzzles to a file (serial code)
It accepts N and M as parameters, loads from stdin, and stores to stdout.

The test
The code is compiled with Intel and GNU compilers into separate binaries.
The number of parallel threads is controlled by environment variable OMP_NUM_THREADS.
The timings of each run are measured once by "time" command, and once by clock() function.
All runs are done with the same parameters N=4, M=5, and on the same dataset consisting of the two known 40-given minimal sudoku puzzles.
Runs with 1,2,4,8,16,32,64 and 33 parallel threads are done for both binaries.
All runs are executed from a shell script with 5 second pause between consecutive runs.
The runs with up to 16 threads are expected to be linearly scalable since the hardware has 16 physical cores.
The runs with 32 threads are expected to give ~20% better results than those with 16 since the hyperthreading share the ALU.
The runs with 33 and 64 threads are expected to run a little slower or at the same rate as 32 threads since there are no IO operations and the hardware resources are limited to 32 threads.
The full test was performed 3 times. The timings are similar, but instead of averages, the results from a single test (the first one) are analysed. About 10 partial test have been executed in the 8-64 parallel threads area.
During the tests no other CPU consuming processes have been run.

Results
Raw results are in the table, the rows for real, user and sys times reported by OS time utility, and the clock() rows reported by the binary.
The overhead rows are calculated on the basis that the serial code takes 5 seconds and there are sufficient HW resources for each parallel thread.
(The serial code timings have not been explicitly measured but is estimated to be between 1.9 and 6.5 seconds.)
The real time Intel/GNU ration is measured by dividing the time consumed by Intel-compiled binary to the time consumed by GNU-compiled binary.

The values in the picture in textual form

Hidden Text: Show

: scalability_table.PNG (14.8 KiB) Viewed 5131 times

: scalability_pic.gif (3.93 KiB) Viewed 5131 times

Interpretation
In general Intel compiler shows stable 15% faster code.
This partially could be due to the fact that years ago heavy work have been done in the optimization of the internal loops in the solver and then Intel compiler has been used.
The only part in code where explicit difference exist (via a macro) is the unrolling of one 9-iterations loop, where it has been extensively tested that Intel compiler performs better with unrolling but GNU compiler performs better without unrolling.

At the edge of the threads (32), Intel compiler performs 22% faster, but both binaries don't work well.

The overhead is unexpectedly high, at lest for 16 threads and above.
Later attempts in minimization of the locked time have been done. They resulted in few percents better performance even for single thread, but virtually unchanged scalability.
The tested code flushes at once the puzzles suitable for output or next iteration obtained by adding a clue to a single puzzle. An experiment to flush results from 100 source puzzles at once gave worse result. Flushing any new puzzle immediately gave better result but no measurable change in the scalability.

Intel's binary performs very bad when overloaded. Even enforcing the thread count from 32 to 33 leads to disaster degrading performance almost twice. The user time increases but the boost comes from system time.
GNU performs better with more threads. User and system times slightly increase, but the real time decreases. This should be covered by the reduced idle time.
One explanation of this paradox is that compilers use different OMP libraries or at least different default values for the policies used by the library.
In only one test (out of more than 10), for 33 threads, Intel's binary performed just like the GNU one, slightly increasing user/sys but decreasing the real time.

The modern OMP runtime can monitor the CPU load and automatically reduce the load by its threads. This behavior was not in the scope of these tests.

Conclusions
Intel compiler performs 15% better on single thread and 22% better on the default number of threads = number of logical CPUs.
For the binary compiled with Intel's compiler, if the number of threads is enforced to the number of logical CPUs, any other CPU-consuming process could drastically degrade the performance.
Leaving the runtime to determine the number of parallel threads could save the process from the observed overloading disaster.
GNU compiler demonstrated better behavior in the cases where it has been simulated that the user doesn't know what he is actually doing.

The New Sudoku Players' Forum

High clue tamagotchis

Re: High clue tamagotchis

40 !!!

Re: 40 !!!

Re: 40 !!!

Re: High clue tamagotchis

Re: 40

Re: High clue tamagotchis

Re: High clue tamagotchis

40-puzzle

40-clue minimal puzzle

Re: High clue tamagotchis

563+38=601

Re: High clue tamagotchis

Re: High clue tamagotchis

Re: High clue tamagotchis