The New Sudoku Players' Forum

by **champagne** » Thu Aug 13, 2015 4:05 pm

rjamil wrote:Have anybody ever try to use new values of TblMaskDouble[512] array as suggested by zhouyundong_2012 on Fri May 23, 2014 at 4:59 am and 5:14 am?

Code: Select all
int TblMaskDouble[512] = { 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ... };

R. Jamil

several versions of the process have been published.

I use I think the final one where, in the Update() you find for each block in a #define something as

F[K]&=TblMaskSingle[S]&TblMaskDouble[C1];\
F[J]&=TblMaskSingle[S]&TblMaskDouble[C2];\

here, the value in TblMaskDouble can not pollute the bits 24_26, so I have a completely different table. (more or less the complement to bits 0_26)

by **Afmob** » Thu Aug 13, 2015 5:39 pm

On another note, is there a way to eliminate a candidate of a non-given cell before calling the solving routine? This would be really useful for checking (weak or strong) minimality of a given puzzle. It's easy to do in JSolve but I haven't quite grasped ZhouSolve enough to modify Jason's version to do this.

by **JasonLion** » Thu Aug 13, 2015 5:59 pm

This should get you started:

G is an array containing a stack of board states.
G[Index] contains the current state.
G[Index].F[27] is an array of bit maps, three for each digit
Each element of F contains a bit map that is 1 for each of 27 cells (a band) that can contain the associated digit
there are also some status bits stored in the left over high order bits
note that bits are set for both solved and unsolved digits (unlike JSolve)

Hope that helps

by **Afmob** » Thu Aug 13, 2015 8:26 pm

Thank you! It's a start but the code is still pretty confusing especially since the variable names are not as self-explanatory as JSolve ones.

It seems I have to modify InitSudoku() (and the functions that call it) and unset the concerning bit in F. But what about BlockMask and BlockMaskSum? Do they also have to be modified?

And just to make it clear, if you want to set R4C3 <> 6 you have to unset the 3rd bit of F[16]?

by **JasonLion** » Thu Aug 13, 2015 8:52 pm

You are definitely on the right track.

Don't worry about BlockMask or BlockMaskSum. They are involved with dealing with clearing out possibilities due to set digits, and you are only dealing with pencil marks.

Yes 3rd bit (mask 0x4) of F[16].

by **Afmob** » Fri Aug 14, 2015 7:44 am

Thanks Jason! I modified InitSudoku and got it to work. My first tests seem to indicate that this speeds up checking for (weak) minimality by more than 130% (11.2 seconds with JSolve, 4.8 seconds with ZhouSolver for 100,000 sudokus).

by **dobrichev** » Sat Aug 29, 2015 11:20 am

champagne wrote:I looked again the brute force code to see what could be done using a wider bit field.
...
A 64 bit field can be used instead of the 32 bit field keeping untouched the zhouyundong_2012 process basics.
...

Hi Champagne,

I finally tested your latest modification of Zsolver (Zhou_useV2.zip which you have sent me in private communication).
I have ignored the sktest.cpp file and incorporated the solver into my elementary test loop.
Below are the results of Zsolver 32 and 64-bit versions, and current version of fsss2 for comparison.
All are tested on the same machine with the same g++ 64-bit compiler and flags, and after profiling.

zh32
0+48826+0 puzzles in 0.211 seconds.
0+29301+0 puzzles in 0.881 seconds.
0+50000+0 puzzles in 0.328 seconds.
0+6000+0 puzzles in 0.673 seconds.
486451+763+12786 puzzles in 0.287 seconds.
0+1763344+0 puzzles in 9.358 seconds.
0+817681+0 puzzles in 99.079 seconds.

zh64
0+48826+0 puzzles in 0.217 seconds.
0+29301+0 puzzles in 0.931 seconds.
0+50000+0 puzzles in 0.346 seconds.
0+6000+0 puzzles in 0.705 seconds.
486451+763+12786 puzzles in 0.289 seconds.
0+1763344+0 puzzles in 10.096 seconds.
0+817681+0 puzzles in 104.193 seconds.

fsss2
0+48826+0 puzzles in 0.228 seconds.
0+29301+0 puzzles in 0.719 seconds.
0+50000+0 puzzles in 0.247 seconds.
0+6000+0 puzzles in 0.327 seconds.
486451+763+12786 puzzles in 0.179 seconds.
0+1763344+0 puzzles in 5.978 seconds.
0+817681+0 puzzles in 61.477 seconds.

These are timings from my old test over Zsolver's code published by Jason.
0,203
0,784
0,321
0,527
0,664
???
85,800

The 5th row is for the Jason's collection of half million random subgrids where invalid ones dominate. Your modification works drastically better with them (287 vs 664), so your optimization task is well done.
The 64-bit version is slower, and for the valid puzzles your modification is a bit slower than the original.

The testing collections have been discussed somewhere (maybe in the former programmers forum) and in general are:
- the 17-givens puzzles (not all of them)
- a collection of 20 times 1465 hard puzzles
- a collection of 5000 hard puzzles
- Tarek's pearly 6000 hard puzzles
- Jason's random subgrids
- a collection of 38-given puzzles (relatively hard)
- Champagne's collection of (some of the) hardest puzzles

Cheers,
MD

by **champagne** » Sat Aug 29, 2015 3:12 pm

dobrichev wrote:I finally tested your latest modification of Zsolver (Zhou_useV2.zip which you have sent me in private communication).
I have ignored the sktest.cpp file and incorporated the solver into my elementary test loop.
Below are the results of Zsolver 32 and 64-bit versions, and current version of fsss2 for comparison.
All are tested on the same machine with the same g++ 64-bit compiler and flags, and after profiling.

Cheers,
MD

Hi Mladen,

Well done.

Your results confirm my own tests, what I didn't have was the comparison with fss2 and Jason's version of the Zhou solver.

Clearly, fss2 remains better by say 30% (may be more) against my own version of the Zhou solver.

That's an important point for some tasks. Is there now a version that you can include in Microsoft visual C++?

BTW I knew that the code added to filter the "worst cases" was a penalty for valid puzzles. It can be easily omitted .. if you don't need it.

Cheers

champagne

by **dobrichev** » Sat Aug 29, 2015 3:58 pm

The earlier determination of invalidity is essential. I can't imagine who would need a fast solver to solve puzzles previously known to be valid.

champagne wrote:Is there now a version [of fsss2] that you can include in Microsoft visual C++?

Did you ever tried the version in https://github.com/dobrichev/fsss2? It should work with 64-bit MS compiler, at least after few minor modifications if I messed up the compatibility in the latest editions. It has a tag from 12 Nov 2014 saying "Compiles with MS VS 2010".

by **champagne** » Sat Aug 29, 2015 4:24 pm

dobrichev wrote:The earlier determination of invalidity is essential. I can't imagine who would need a fast solver to solve puzzles previously known to be valid.

generally speaking, that's right, however, in some cases, the question is "is the solution unique". This is always the case for example when you check if the puzzle is minimal , but also "nearly the case" in the X+Y+27 generation

dobrichev wrote:
champagne wrote:Is there now a version [of fsss2] that you can include in Microsoft visual C++?

Did you ever tried the version in https://github.com/dobrichev/fsss2? It should work with 64-bit MS compiler, at least after few minor modifications if I messed up the compatibility in the latest editions. It has a tag from 12 Nov 2014 saying "Compiles with MS VS 2010".

I'll check that later in September and will give you a feedback

by **dobrichev** » Sat Aug 29, 2015 5:01 pm

champagne wrote:...when you check if the puzzle is minimal...

Below are the fsss2 timings for "weak minimality" check on the above puzzle sets, according to Afmob's definition earlier in this thread.

48826+0+0 puzzles in 4.074 seconds.
28521+780+0 puzzles in 3.113 seconds.
19950+30050+0 puzzles in 2.572 seconds.
6000+0+0 puzzles in 0.794 seconds.
5646+494354+0 puzzles in 0.919 seconds.
1763344+0+0 puzzles in 129.477 seconds.
817664+17+0 puzzles in 138.633 seconds.

The first number is the count of the puzzles having no redundant clue (even if they have multiple solutions), the second column counts invalid puzzles and those with at least one redundant clue.

by **rjamil** » Mon Aug 31, 2015 10:21 pm

Hi champagne,

Hope you will share your latest Zhou_useV2.zip file with us here.

I need to know that how you change TblMaskDouble[512] array values since Zhou_use.zip file contain exactly same values as ZSolver1.0.zip file (except in Oct instead of Hex).

Or may be I missed something else.

R. Jamil

by **champagne** » Wed Sep 02, 2015 3:53 pm

rjamil wrote:Hi champagne,

Hope you will share your latest Zhou_useV2.zip file with us here.

I need to know that how you change TblMaskDouble[512] array values since Zhou_use.zip file contain exactly same values as ZSolver1.0.zip file (except in Oct instead of Hex).

Or may be I missed something else.

R. Jamil

Hi Jamil,

I am currently far from home, but I can give you a first answer.

First of all, I have no problem to share any code.
In the case of that process, To be sure to understand what was behind the tables, I wrote at the very beginning a generation of the tables with my own understanding of the content and I compared the results with the original table.

I accepted deviations when they had IMO no effect on the process.

I have to verify, but I think that I uploaded the code I used for the generation of tables. If not, I can easily do it.

As I don't know exactly where you are and what you need, it is difficult to send the most appropriate answer. We could share more information through pm.

BTW, I had in mind deeper changes in the lay-out of the process and I found in fss2 (see mladen Dobrichev recent comments link) several point matching such ideas and sometimes very efficient answers to some pending questions.

using fss2 after necessary adjustments to have it working under microsoft visual C++ is another solution to consider and so far seems the best one if you really need a fast process. I started that task.

by **champagne** » Tue Sep 22, 2015 8:28 am

Some preliminary results about a new version of the zhouyundong_2012 brute force

I made 2 benchmarks against fss2 with that code and got the following results

a file of 48.087 million puzzles of 24 clues rating 10.0 and more
39mn 35s using fss2
27mn 58s using that new code

a sample file of 33619 puzzles out of the pattern game run 100 times
18 seconds using fss2
14 seconds using that new code.

The new code is a mix of a new lay-out of the zhouyundong_2012 brute force
and of a process derived from fss2

More tests will be run in the near future by mladen Dobrichev and possibly this will induce some improvements in fss2 as well.

Some comments about the new code

Lay out of the ZHOU4 version of the zhouyundong_2012 original code

in the zhouyundong_2012 brute force, the elementary description of the puzzle is the status of candidates in a band

this is a 32 bit field (F) specific to the digit 'd'.

Code: Select all: positions 0-26 correspond to the 27 cells of the band. the bit is '1' is the candidate is still possible, so the initial status is #define BIT_SET_27 0777777777 an assign candidate remains in (F) positions 27-29 have one bit per row bit 27 => cells 0-8 bit 28 => cells 9-17 bit 29 => cells 18-27 the bit is '1' as long as the row has not been identified as "assigned" for that digit the bit is switched to 0 for initial given in the "Update" function for singles not yet assigned in the guess process

in the new lay out,

Code: Select all: - the 3 bands of a specific digit are located in a 128 bit field, except for the 3 bits (26;27;28) - a status of unsolved cells is kept in a separate 128 bits field also split in 3 bands (Cells_unsolved) - bits 26,27,28 of the original lay_out (unsolved rows in a band) are grouped in another 128 bits field "Rows_unsolved" where the 9 rows of a digits are arranged in 3 bands as well (3 digits per band

This changes slightly the Initial and Update process, but the main change compared to the first version is in the "post update" and "guess" process.

Here is the key sequence derived from some code I found in fss2

Code: Select all: #define NAKED(X) Map=map[X];R3|=R2&Map;R2|=R1&Map;R1|=Map; int ZHOU4::ApplyNakedOrEmptyCells(int i64){ unsigned _int64 * map = &FD[0].bitmap128.m128i_u64[i64]; unsigned _int64 unsolved = Cells_unsolved.bitmap128.m128i_u64[i64]; register unsigned _int64 R2 = map[0] & map[2], R1 = (map[0] | map[2]), Map,R3;// digits 12 Map = map[4]; R3 = R2&Map; R2 |= R1&Map; R1 |= Map; NAKED(6) NAKED(8) NAKED(10) NAKED(12) NAKED(14) NAKED(16) // digits 3-9 if (unsolved & (~ R1)) return 1; // locked R1 &= ~R2; R1 &= unsolved; // forget solved seen as singles if (R1) glb.single_applied = 1; else{ glb.pairs.bitmap128.m128i_u64[i64] = R2& (~R3); return 0; }

This is a parallel search of empty cells, singles in cell and naked pairs in the current map for the first 2 bands.
And using the intrinsic _BitScanForward64, pointing the adequate cells is very fast and easy.

At the end,nearly all guesses are made out of naked pairs against hidden pairs in the former version

by **champagne** » Mon Sep 28, 2015 3:59 pm

After the last post I made more tests and I prepared a special brute force for the specific X+Y+27 generation.
I confirm that as it is that process seems to be about 30% faster than fsss2, the best known brute force.
The X+Y+27 process is much faster than the classical one.

I loaded a zip file on my site to deliver the final result

sources
skbf.exe a 64 bit exec file to test the process
a documentation using HMTL pages
sample files for a first set of test

the zip file is skbf_V0

The New Sudoku Players' Forum

3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)

Re: 3.77us Solver(2.8G CPU, TestCase:17Sodoku)