## New 225x225 massive sudoku

For fans of Killer Sudoku, Samurai Sudoku and other variants

### New 225x225 massive sudoku

blue wrote:Give SAT a try, if you like.
I used it with Mike Metcalf's 225x225 sudoku(s?), and it worked well.

However, the puzzle in question, from 2009, was lost when the Sudoku Programmers' Forum disappeared. In case it's of any further interest, I've used my (now faster) program to remove a few more clues, the result is attached. It's solvable with basic methods, but a slog.

Regards,

Mike Metcalf
Attachments
225x225_new.txt

m_b_metcalf
2017 Supporter

Posts: 13519
Joined: 15 May 2006
Location: Berlin

### Re: New 225x225 massive sudoku

My SAT-solver now works for (in principle) arbitrary large grids. I tested it with this puzzle and it took 9 seconds to solve. Simplification with naked and hidden singles pairs and triples and row/column block interaction takes about 6 seconds, the SAT part with 549290 clauses remaining takes about 3 seconds using sat4j. I observed that using naked and hidden quadrupels or higher tuples in the simplification phase takes very long, so I did not use these here.
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

Addition: Allowing naked and hidden quadrupels in the simplification phase completely solve the puzzle without making the SAT phase necessary. This takes about 20 minutes with my solver which of course is much more than the 9 seconds before. Btw, using the approach described in https://www.cs.cmu.edu/~wklieber/papers/2007_efficient-cnf-encoding-for-selecting-1.pdf dramatically slowed down the SAT-part, so I only use the obvious approach without commander-variables.
hkociemba

Posts: 66
Joined: 09 July 2017

### re: New 225x225 massive sudoku

hkociemba wrote:
hkociemba wrote:
Simplification with
naked and hidden singles pairs and triples
and row/column block interaction
takes 6 seconds

Allowing naked and hidden quadrupels in the simplification phase
completely solve the puzzle.

This takes 20 minutes with my solver

quadrupels are indeed expensive --
but putting them as the last level
+ checking progress after each level
should make the expense tolerable,
hopefully far below 20 minutes,
try it ?

splitting your until into 2 wrote:
checking progress after each level: Show

LEVELS := 5

repeat

level := 0 ;
progress := n_cand_set ;

repeat

level := level + 1 ;
if( level > LEVELS ) return( heavy_tool() ) ;
solve( level ) ;
until( n_cand_set != progress ) ;
until( n_cand_set = DIM3 ) ;
return( SOLVED ) ;

Pat

Posts: 4056
Joined: 18 July 2005

### Re: New 225x225 massive sudoku

Wow, instead of 20 min It takes 11 s now to completely solve the puzzle without the SAT solver and only using standard methods including quadruples. Thanks Pat!
This is the original code:
Hidden Text: Show
Code: Select all
` repeat  restart:    progress := n_cand_set;    if CheckHiddenSingles.Checked then    begin      hidden_single_block;      hidden_single_row;      hidden_single_col;    end;    if n_cand_set > progress then      goto restart;    if CheckNakedSingles.Checked then      naked_single;    if n_cand_set > progress then      goto restart;    if CheckBlockLine.Checked then    begin      block_candidates_in_row;      block_candidates_in_col;      row_candidates_in_block;      col_candidates_in_block;    end;    if n_cand_set > progress then      goto restart;    for i := 2 to Form1.SEMaxTupel.Value do    begin      if CheckHiddenTuple.Checked then      begin        hidden_tuple_block(i);        hidden_tuple_row(i);        hidden_tuple_col(i);      end;      if CheckNakedTuple.Checked then      begin        naked_tuple_row(i);        naked_tuple_col(i);        naked_tuple_block(i);      end;      if n_cand_set > progress then        goto restart;    end;  until (n_cand_set = progress) or (n_cand_set = DIM3);`
Comments on the code if another order of the methods is better are welcome . My goal now is to make this puzzle quadruple optimal. Since removing an entry and checking if the puzzle still is solvable with quadruples now takes only about 10 s checking should be done in about 100 hours. From the first 100 cells I checked 97 could be removed so I think that many more cells can be removed.

Edit: 662 entries removed the checking time now has increased to about 45 s/entry. So it will be unfeasible to completely reduce the puzzle. Reducing the number of givens from 38802 below 38000 should however be possible within a couple of hours.
Edit 2: After 17 h of CPU-time amd checking about 6% of all entries I am down to 37500 givens. Meanwhile checking a new entry shows it to be unremovable in more than 50% which is much more than the 3% in the beginning. Since in the case of unremovibility the result shows up in a few seconds in most times I see again a chance to check all entries within a reasonable time. Checking time for removable entries is about 55 s meanwhile - time seems to increase at most linear with the number of removed entries.
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

With about 65 h of CPU-time i could reduce the puzzle from 38802 to 36192 givens such that the puzzle still is solvable using simple methods up to quadruples. The solution takes about 20000 steps! The puzzle still is not quadruple-minimal; I estimate that the quadruple-minimal puzzle has less than 36000 givens. Puzzle and solving steps are appended, I had to zip the file - else it would exceed the upload limit of 256 kb. Btw., after removing the 2610 givens my SAT solver takes now more than 1 hour to find the solution, the standard method solver takes about 2 minutes.
Attachments
225x225_new_36192_solving.zip
225x225_new_36192.txt
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

I was able to reduce the puzzle to a minimal version now within a bout 10 CPU-days. Any removal of an entry makes it unsolvable using exactly the simple methods singles, pairs, triples, quads and pointing/claiming. Almost 5000 entries have been removed, there are 33928 givens left. While the original version took "only" 14635 steps with my solver, the minimal version needs 22649 steps until the puzzle is solved. Solving time is now about 3 minutes with the basic method solver. Using the SAT-solver instead no solution was found within 5 hours now.
It do not think I invest much more time in 225x225 sudokus since there are probably only a few people around which are interested and have the possibility to solve these.
Attachments
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

Hi hkociemba,

hkociemba wrote:It do not think I invest much more time in 225x225 sudokus since there are probably only a few people around which are interested and have the possibility to solve these.

Count me there.
I have no possibility to solve large puzzles, neither I will try them by hand.
My experience in 9x9 puzzles shows that direct eliminations + backtracking beats any other technuques in effectiveness, including pairs.
I am curious in scalability of the three different approaches - direct eliminations, other relatively simple techniques like pairs and triplets, and SAT solver.
Also I have no idea for large puzzles how fast the pencilmarks table is emptied and whether it is more suitable to use other data structures like lists.

I can't contribute here but follow your progess with interest.

Cheers,
MD
dobrichev
2016 Supporter

Posts: 1841
Joined: 24 May 2010

### Re: New 225x225 massive sudoku

dobrichev wrote:My experience in 9x9 puzzles shows that direct eliminations + backtracking beats any other technuques in effectiveness, including pairs.
I am curious in scalability of the three different approaches - direct eliminations, other relatively simple techniques like pairs and triplets, and SAT solver.
Also I have no idea for large puzzles how fast the pencilmarks table is emptied and whether it is more suitable to use other data structures like lists.

I cannot prove it but I suspect that backtracking is inferior to SAT for large puzzles. To make the SAT solver effective I of course eliminate as many candidates (=pencilmarks) as possible using a given set of basic techniques before setting up the clauses. For a 225x225 puzzle there are in principle up to 225^3=11390625 candidates. The original 225x225_new.txt puzzle has 103384 candidates left at the beginning and for example 63091 candidates left after applying singles/pointing/claiming.
The quadminimal reduced version has 227658 candidates left at the beginning, 212742 candidates left after applying singles/pointing/claiming and 211495 candidates left after singles/pointing/claiming/pairs/triplets - so the basic methods up to triplets have no big effect. If you add quads then the number of candidates left gets 0 - the puzzle is solved. This does not mean that quads have a big effect in general it is just this special example which was constructed to be as hard as possible to be solvable with basic methods including quads. The SAT solver seems to fail when trying to solve this minimal puzzle - no result within 14 hours.
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

Thank you for these numbers.
Are you sending only the direct contradiction clauses to SAT? If so, you may try to bombard it additionally with clauses that simulate some of the basic techniques.
dobrichev
2016 Supporter

Posts: 1841
Joined: 24 May 2010

### Re: New 225x225 massive sudoku

dobrichev wrote:Thank you for these numbers.
Are you sending only the direct contradiction clauses to SAT? If so, you may try to bombard it additionally with clauses that simulate some of the basic techniques.

I do not really understand your question concerning the contradiction. The variables of the SAT clauses are the candidates/pencilmarks of not yet filled cells and the givens. The variables corresponding to the givens are set to true while with the variables belonging to the candidates we construct clauses for each cell, each row, each column and each block which describe that

1.in each unfilled cell at least one candidate is true and at most one candidate is true
2 in each row/column block for each number at least one candidate ist true and at most one candidate ist true.
If for example in cell C there are 3 candidates 2, 4, 7 left corresponding to the variables a, b, c we have the 4 clauses (a or b or c), (not a or not b), (not a or not c), (not b or not c).
If for example in row R the candidates for 7 are exactly in column 5 and column 6 corresponding to variables d and e we add the clauses (d or e), (not d or not e).

In DIMACS CNF format (which is the input format for a SAT solver) the variables are coded with numbers and for a given puzzle size I code a candidate/given in row r and column c and value n (r, c internally zero based) by r * S^2 + c * S + n where S is the number of cells in a row/column/block. So the "variable names" range from 1..S^3, but only fewer variables really are used, depending on the number of candidates left.
This all is quite straighforward and I do not think it is useful to simulate basic techniques within SAT. I apply the basic techniques before I do the SAT part and try to eliminate as many candidates as possible there.
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

To be crear, I never used SAT solvers.
The idea was to introduce, for example, additional variables for pairs. I don't know wether this would improve the solving process. Even if these additional clauses are pure tautology (are they?) there is a nonzero chance for SAT to choose a shorter path having them in hands.

Applying simple technuques after a more complex elimination is reasonable IMO. In your example, after SAT identifies an elimination equivalent to quadruple (I know that this is inappropriate and you can't intercept such event), applying once more the zoo of the more basic techniques is expected to cause series of additional cheaper eliminations. That said, if your final solver (SAT) is aware of some simple techniques then in theory it could use their power at later stages.
dobrichev
2016 Supporter

Posts: 1841
Joined: 24 May 2010

### Re: New 225x225 massive sudoku

hkociemba wrote:... and 211495 candidates left after singles/pointing/claiming/pairs/triplets -

Herbert, thanks for this new version. The good news is that if I, too, apply the self-same techniques to the puzzle, I also get precisely 211495 candidates left. I also tried additionally applying naked quads (I don't have hidden quads implemented) but that is not sufficient to solve it completely and my back-up code grinds away forever.

I'll leave it at that, but might look into problem144.txt again when I get time.

Regards,

Mike Metcalf

m_b_metcalf
2017 Supporter

Posts: 13519
Joined: 15 May 2006
Location: Berlin

### Re: New 225x225 massive sudoku

dobrichev wrote:Applying simple technuques after a more complex elimination is reasonable IMO. In your example, after SAT identifies an elimination equivalent to quadruple (I know that this is inappropriate and you can't intercept such event), applying once more the zoo of the more basic techniques is expected to cause series of additional cheaper eliminations. That said, if your final solver (SAT) is aware of some simple techniques then in theory it could use their power at later stages.

This sounds plausible but I have not really an idea how to implement the equivalent of more complex eliminations as clauses in conjunctive normal form. Or do I miss something here?
hkociemba

Posts: 66
Joined: 09 July 2017

### Re: New 225x225 massive sudoku

My solver solved this thing, it took a long time, 30 minutes or something.
It was created to support many variants and strategies, not created for speed.
It returns all possible exclusion for every strategies, finding sets larger than quadrupels is not that expensive.
How expensive are quadrupels in you solver, can you do a benchmark?

Are there any sudoku bigger available?
I could generate one and reduce it to singles only in a couple of minutes.
creint

Posts: 374
Joined: 20 January 2018

Next